Closed geraldstanje closed 3 months ago
We may change the way the converter sets the opset, but if you selected target_opset=14 and the output model has opset=13, it means the model is valid for opset=14 with no change. You can use the onnx API to change its value to 14. See https://onnx.ai/onnx/intro/python.html#opset-and-metadata.
hi @xadupre thanks for your reply. do you see how the setfit package is using it? https://github.com/huggingface/setfit/blob/main/src/setfit/exporters/onnx.py#L251-L264
max_opset = max([x.version for x in onnx_head.opset_import])
- it will only use opset 13 for max_opset. does that make sense?
I'm not sure what you are asking. The fix would make more sense in the setfit package than in sklearn-onnx. Maybe I missed something? The instruction max_opset = max([x.version for x in onnx_head.opset_import]) could potentially lead to a bug. It returns the highest version whatever the domain is. It should be something like [x.version for x in onnx_head.opset_import if x.domain == ''][0].
@xadupre what i mean is: if i want opset=40 (in https://github.com/huggingface/setfit/blob/main/src/setfit/exporters/onnx.py#L186) for example and model_head (which uses sklearn-onnx) returns opset 13 -> the max_opset is 13 only - which means i cannot set the model_body to opset 40 because it will also use opset 13... does that make sense?
You can change the opset after the conversion happened by running the following snippet of code which changes the opset version for the main domain. Changing the behaviour in sklearn-onnx is not difficult but I would prefer not to make it the default right now.
opsets = list(op for op in proto.opset_import if op.domain == "")
opsets.append(make_opsetid("", 14))
del proto.opset_import
proto.opset_import.extend(opsets)
@xadupre max_opset = max([x.version for x in onnx_head.opset_import]) also returns opset=14?
opsets = list(op for op in proto.opset_import if op.domain == "")
opsets.append(make_opsetid("", 14))
del proto.opset_import
proto.opset_import.extend(opsets)
max_opset = max([x.version for x in onnx_head.opset_import]) works most of the time but the logic is wrong. It returns a wrong result if the opset version of another domain is higher than the one for the main domain. The fix I suggest can be done anywhere:
def force_main_opset(proto: ModelProto, new_version: int):
    opsets = list(op for op in proto.opset_import if op.domain == "")
    opsets.append(make_opsetid("", new_version))
    del proto.opset_import
    proto.opset_import.extend(opsets)
You can call this function anywhere once the conversion is done. It does not check that the version is consistent. It assumes the user knows it is. I don't plan to make this change in sklearn-onnx as it changes the default behaviour and it is better to be cautious in this case. We can make it the default in a couple of releases if we add a warning telling users that this behaviour will soon be the default.
@xadupre thanks for all the infos.
can i call it as follows?
target_opset = 40
onnx_head = export_sklearn_head_to_onnx(model.model_head, opset=target_opset)
force_main_opset(onnx_head, target_opset)
and then max_opset should be 40?
max_opset = max([x.version for x in onnx_head.opset_import])
if max_opset != opset:
    warnings.warn(
        f"sklearn onnx max opset is {max_opset} requested opset {opset} using opset {max_opset} for compatibility."
    )
export_onnx_setfit_model(
    OnnxSetFitModel(transformer, lambda x: model_pooler(x)["sentence_embedding"]),
    dummy_inputs,
    output_path,
    max_opset,
)
It should work.
or still change to: max_opset = [x.version for x in onnx_head.opset_import if x.domain == ''][0]?
The second one is correct as long as the main opset is used in the model but if the second expression fails, then the first one would return something wrong anyway. You should definitely replace it.
do you mean as long as it has an opset called domain?
but when i look at max_opset = max([x.version for x in onnx_head.opset_import]) - it goes over all opsets and finds the maximum? why do you only want to look at x.domain == ''?
[convert_sklearn] parse_sklearn_model
[convert_sklearn] convert_topology
[convert_operators] begin
[convert_operators] iteration 1 - n_vars=0 n_ops=2
[call_converter] call converter for 'SklearnCastTransformer'.
[call_converter] call converter for 'SklearnLinearClassifier'.
[convert_operators] end iter: 1 - n_vars=12
[convert_operators] iteration 2 - n_vars=12 n_ops=2
[convert_operators] end iter: 2 - n_vars=12
[convert_operators] end.
[_update_domain_version] +opset 0: name='', version=13
[_update_domain_version] +opset 1: name='ai.onnx.ml', version=1
[convert_sklearn] end
ONNX opset version used: 13
had to do some edits to the code - had to replace del proto.opset_import with proto.ClearField('opset_import') - looks good?
$ pip list:
------------------------ -----------
aiohttp 3.9.5
aiosignal 1.3.1
async-timeout 4.0.3
attrs 23.2.0
certifi 2024.2.2
charset-normalizer 3.3.2
datasets 2.19.1
dill 0.3.8
evaluate 0.4.2
filelock 3.14.0
frozenlist 1.4.1
fsspec 2024.3.1
huggingface-hub 0.23.2
idna 3.7
Jinja2 3.1.4
joblib 1.4.2
MarkupSafe 2.1.5
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.2.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.40
nvidia-nvtx-cu12 12.1.105
onnx 1.16.1
onnxconverter-common 1.14.0
packaging 24.0
pandas 2.2.2
pillow 10.3.0
pip 22.0.4
protobuf 3.20.2
pyarrow 16.1.0
pyarrow-hotfix 0.6
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2024.5.15
requests 2.32.2
safetensors 0.4.3
scikit-learn 1.3.2
scipy 1.13.1
sentence-transformers 2.7.0
setfit 1.0.3
setuptools 58.1.0
six 1.16.0
skl2onnx 1.16.0
sympy 1.12
threadpoolctl 3.5.0
tokenizers 0.19.1
torch 2.3.0
tqdm 4.66.4
transformers 4.41.1
triton 2.3.0
typing_extensions 4.12.0
tzdata 2024.1
urllib3 2.2.1
xxhash 3.4.1
yarl 1.9.4
code:
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.linear_model import LogisticRegression
import onnx
from setfit import SetFitModel
# scikit-learn 1.3.2
import numpy as np
from onnx.helper import make_opsetid

def force_main_opset(proto: onnx.onnx_ml_pb2.ModelProto, new_version: int):
    opsets = list(op for op in proto.opset_import if op.domain == "")
    opsets.append(make_opsetid("", new_version))
    proto.ClearField('opset_import')
    proto.opset_import.extend(opsets)

def export_sklearn_head_to_onnx(model_head: LogisticRegression, opset: int) -> onnx.onnx_ml_pb2.ModelProto:
    """Convert the Scikit-Learn head from a SetFitModel to ONNX format.

    Args:
        model_head (`LogisticRegression`): The trained SetFit model_head.
        opset (`int`): The ONNX opset to use for optimizing this model. The opset is not
            guaranteed and will default to the maximum version possible for the sklearn
            model.
    Returns:
        [`onnx.onnx_ml_pb2.ModelProto`] The ONNX model generated from the sklearn head.
    Raises:
        ImportError: If `skl2onnx` is not installed an error will be raised asking
            to install this package.
    """
    # Check if skl2onnx is installed
    try:
        import onnxconverter_common
        from skl2onnx import convert_sklearn
        from skl2onnx.common.data_types import guess_data_type
        from skl2onnx.sklapi import CastTransformer
        from sklearn.pipeline import Pipeline
    except ImportError:
        msg = """
        `skl2onnx` must be installed in order to convert a model with an sklearn head.
        Please install with `pip install skl2onnx`.
        """
        raise ImportError(msg)

    # Determine the initial type and the shape of the output.
    input_shape = (None, model_head.n_features_in_)
    if hasattr(model_head, "coef_"):
        dtype = guess_data_type(model_head.coef_, shape=input_shape)[0][1]
    elif not hasattr(model_head, "coef_") and hasattr(model_head, "estimators_"):
        if any([not hasattr(e, "coef_") for e in model_head.estimators_]):
            raise ValueError(
                "The model_head is a meta-estimator but not all of the estimators have a coef_ attribute."
            )
        dtype = guess_data_type(model_head.estimators_[0].coef_, shape=input_shape)[0][1]
    else:
        raise ValueError(
            "The model_head either does not have a coef_ attribute or some estimators in model_head.estimators_ do not have a coef_ attribute. Conversion to ONNX only supports these cases."
        )
    dtype.shape = input_shape

    # If the datatype of the model is double we need to cast the outputs
    # from the setfit model to doubles for compatibility inside of ONNX.
    if isinstance(dtype, onnxconverter_common.data_types.DoubleTensorType):
        sklearn_model = Pipeline([("castdouble", CastTransformer(dtype=np.double)), ("head", model_head)])
    else:
        sklearn_model = model_head

    # Convert sklearn head into ONNX format
    onnx_model = convert_sklearn(
        sklearn_model,
        initial_types=[("model_head", dtype)],
        target_opset=opset,
        options={id(sklearn_model): {"zipmap": False}},
        verbose=True,
    )
    return onnx_model

model = SetFitModel.from_pretrained("../export_onnx/model_to_deploy")
target_opset = 14
onnx_head = export_sklearn_head_to_onnx(model.model_head, opset=target_opset)
print("ONNX opset version used:", onnx_head.opset_import[0].version)
force_main_opset(onnx_head, target_opset)
max_opset = max([x.version for x in onnx_head.opset_import])
print("max_opset:", max_opset)
with open("logistic_regression.onnx", "wb") as f:
    f.write(onnx_head.SerializeToString())
- call python and see output:
$ python main.py
[convert_sklearn] parse_sklearn_model
[convert_sklearn] convert_topology
[convert_operators] begin
[convert_operators] iteration 1 - n_vars=0 n_ops=2
[call_converter] call converter for 'SklearnCastTransformer'.
[call_converter] call converter for 'SklearnLinearClassifier'.
[convert_operators] end iter: 1 - n_vars=16
[convert_operators] iteration 2 - n_vars=16 n_ops=2
[convert_operators] end iter: 2 - n_vars=16
[convert_operators] end.
[_update_domain_version] +opset 0: name='', version=13
[_update_domain_version] +opset 1: name='ai.onnx.ml', version=1
[convert_sklearn] end
ONNX opset version used: 13
max_opset: 14
hi @xadupre
i convert the pytorch model to onnx (model body uses torch.export, model head uses convert_sklearn). i validated the onnx model accuracy and noticed that i lose 4% of the accuracy - is that expected?
here are the onnx export logs: pytorch_onnx.txt
when i load the model to onnx i see this warning - what does that mean?
2024-06-02 04:06:32.892378766 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 1 Memcpy nodes are added to the graph main_graph_92a4c27765294ee2ac3f1a4dc236c9f8 for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2024-06-02 04:06:32.893918364 [W:onnxruntime:, session_state.cc:1166 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-06-02 04:06:32.893939696 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments
onnx model loading code:
# Load the ONNX model
onnx_model_path = output_path #'sklearn_model.onnx'
session = onnxruntime.InferenceSession(onnx_model_path, providers=['CUDAExecutionProvider'])
# Check if CUDA execution provider is available
providers = session.get_providers()
print("providers:", providers)
# Create a sample text to test on
sample_text = "How to create foo?"
max_length = model.model_body.max_seq_length
# Create the same embeddings using the ONNX model
inputs = model.model_body.tokenizer(
    sample_text,
    max_length=max_length,
    padding="max_length",
    truncation=True,
    return_attention_mask=True,
    return_token_type_ids=True,
    return_tensors="np",
)
onnx_preds = session.run(None, dict(inputs))[0]
The warning says that onnxruntime introduced some CUDA memcpy nodes (from host to device or the other way) to be able to run the model. It sometimes happens when an operator has an implementation on CPU but not on CUDA. You can ask onnxruntime to save the optimized model to see where they were added by setting this parameter: https://onnxruntime.ai/docs/api/python/api_summary#onnxruntime.SessionOptions.optimized_model_filepath. About the accuracy, what do you mean by 4%? Is it a small difference on every prediction, or no difference in most cases and a few big differences?
@xadupre
i had to do a small edit of the code above (op.domain == "" changed to op.domain != "") - looks good now?
def force_main_opset(proto: onnx.onnx_ml_pb2.ModelProto, new_version: int):
    opsets = list(op for op in proto.opset_import if op.domain != "")
    opsets.append(make_opsetid("", new_version))
    #del proto.opset_import
    proto.ClearField('opset_import')
    proto.opset_import.extend(opsets)
otherwise i see the following in the onnx model after calling force_main_opset:
...
opset_import {
  domain: ""
  version: 13
}
opset_import {
  domain: ""
  version: 14
}
About the accuracy, what do you mean by 4%? Is it a small difference on every prediction, or no difference in most cases and a few big differences?
i have a validation dataset to validate the pytorch and onnx model. the python model is 4% more accurate than the onnx model. the model has 2 classes for the output label (which is a string). onnx doesn't do any quantization by default, right? can you think of any reason i get this difference in accuracy?
4% is big. I saw that when the conversion replaced double by float32 for some models where a matrix was inverted (GaussianProcess for example). That's why sklearn-onnx was extended to support double for many models. Your model is a classifier, so I assume the accuracy is measured by the number of good predictions. You should check whether the probabilities are close or not when the converted model fails to produce the expected predicted classes. I don't expect a logistic regression to introduce such discrepancies unless the coefficients have very different scales. I would first try to determine which part produces the discrepancies (scikit-learn or pytorch).
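One possible way to run that check is sketched below with numpy only. The function name compare_predictions, the tolerance, and the probability arrays are illustrative assumptions; in practice the two arrays would come from the original model and from the onnxruntime session on the same validation inputs.

```python
import numpy as np


def compare_predictions(ref_proba, onnx_proba, atol=1e-4):
    """Summarize how far two sets of class probabilities are from each other."""
    ref_proba = np.asarray(ref_proba, dtype=np.float64)
    onnx_proba = np.asarray(onnx_proba, dtype=np.float64)
    # Largest absolute probability gap for each sample.
    abs_diff = np.abs(ref_proba - onnx_proba).max(axis=1)
    # Samples where the predicted class differs between the two models.
    disagree = np.flatnonzero(ref_proba.argmax(axis=1) != onnx_proba.argmax(axis=1))
    return {
        "max_abs_diff": float(abs_diff.max()),
        "n_above_atol": int((abs_diff > atol).sum()),
        "disagree_idx": disagree.tolist(),
    }


# Made-up probabilities for three samples and two classes.
ref = np.array([[0.90, 0.10], [0.51, 0.49], [0.20, 0.80]])
onx = np.array([[0.90, 0.10], [0.49, 0.51], [0.20, 0.80]])
report = compare_predictions(ref, onx)
print(report)
```

If the disagreeing samples all sit close to the decision boundary with tiny probability gaps, the loss is likely numerical (for example float32 versus double); large gaps would rather point to a difference in the inputs, such as tokenization or pooling in the model body.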
Encountered this issue too. Very, very annoying. If the parameter says target_opset=13, why do I get an opset equal to 9, the minimum one supporting all the operators? The API is misleading.
Some libraries don't have mechanisms to handle lower opsets, unfortunately
Yeah, that's their problem. But maybe it's worth adding an optional boolean flag such as "force_opset_version=False" and documenting it well in the examples, so that new users won't have to struggle with an unfamiliar library and codebase for hours?
I think the change is very straightforward and non-breaking
The change I propose is to keep the old behaviour for convert_sklearn but to force the opset to be the one requested by the user in the function to_onnx.
Fixed.
@xadupre thanks!
as far as i understand i need to call skl2onnx.to_onnx after onnx_head = export_sklearn_head_to_onnx(model.model_head, opset=target_opset)?
i currently have the following - can you tell me what i need to set for initial_types or other params skl2onnx.to_onnx has?
onnx_head = skl2onnx.to_onnx(model=onnx_head, target_opset=opset, verbose=2)
  File "/usr/local/lib/python3.10/dist-packages/skl2onnx/convert.py", line 312, in to_onnx
    initial_types = guess_initial_types(X, initial_types)
  File "/usr/local/lib/python3.10/dist-packages/skl2onnx/algebra/type_helper.py", line 95, in guess_initial_types
    raise NotImplementedError("Initial types must be specified.")
NotImplementedError: Initial types must be specified.
im not sure what initial_types should be based on:
[convert_sklearn] parse_sklearn_model
[convert_sklearn] convert_topology
[convert_operators] begin
[convert_operators] iteration 1 - n_vars=0 n_ops=2
[call_converter] call converter for 'SklearnCastTransformer'.
[call_converter] call converter for 'SklearnLinearClassifier'.
[convert_operators] end iter: 1 - n_vars=16
[convert_operators] iteration 2 - n_vars=16 n_ops=2
[convert_operators] end iter: 2 - n_vars=16
[convert_operators] end.
[_update_domain_version] +opset 0: name='', version=13
[_update_domain_version] +opset 1: name='ai.onnx.ml', version=1
[convert_sklearn] end
hi,
i use the following lib + function export_onnx - which converts a model head with sklearn and model body to onnx:
the model body requires opset 14 because it uses operator 'aten::scaled_dot_product_attention' - i have an issue that convert_sklearn seems not to be able to use the requested opset 14. how can i fix that with convert_sklearn? here is the code to reproduce it:
output:
pip list:
cc @xadupre