Closed hanzigs closed 3 years ago
Looks like we don't support the AsString op. Let me check if we can handle this in the converter.
Is there a workaround, such as a custom op, until the converter is updated? Thanks.
I have some code that maps AsString to ONNX Cast, which kind of works but doesn't honor all the attributes AsString has. But maybe it's good enough for autokeras. If it works for autokeras I'll send a PR.
It worked for me for the structured_classifier example, so we merged a PR: https://github.com/onnx/tensorflow-onnx/pull/1648
You can try with
pip install git+https://github.com/onnx/tensorflow-onnx
@guschmue Thank you very much for the quick response.
Now I am getting this error: No Op registered for LookupTableFindV2 with domain_version of 9. I created a fresh env and installed with pip install git+https://github.com/onnx/tensorflow-onnx (python==3.9.5, tensorflow==2.5.0, tf2onnx==1.10.0, onnxruntime==1.8.0).
sess = onnxruntime.InferenceSession(content)
File "C:\Users\plg\Anaconda3\envs\automl07augpy395elk7120\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 283, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Users\plg\Anaconda3\envs\automl07augpy395elk7120\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 312, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2 : No Op registered for LookupTableFindV2 with domain_version of 9
As before, it works normally in a plain Python file but throws the error in a Flask app. I found a similar report in https://github.com/onnx/tensorflow-onnx/pull/1228
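LookupTableFindV2 is a TensorFlow op name, not an ONNX operator, so finding it in a converted graph means a node was left untranslated. A minimal sketch for checking a converted model before handing it to onnxruntime follows. The helper names are hypothetical, loading assumes the `onnx` package is installed, and only top-level nodes are inspected (subgraphs are not walked).

```python
def find_nodes_by_op_type(model, op_types):
    """Return (node name, op type) pairs for top-level nodes whose op type is listed."""
    wanted = set(op_types)
    return [
        (node.name, node.op_type)
        for node in model.graph.node
        if node.op_type in wanted
    ]


def load_and_check(path, op_types=("LookupTableFindV2",)):
    """Load an ONNX file and report leftover TensorFlow ops (needs the onnx package)."""
    import onnx  # imported lazily so find_nodes_by_op_type has no hard dependency

    return find_nodes_by_op_type(onnx.load(path), op_types)
```

Calling `load_and_check("autokeras.onnx")` on each converted file gives a quick yes/no answer on whether the lookup node survived conversion, without creating an InferenceSession.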
What are the shapes and dtypes of X_train, y_train, X_valid, y_valid? Can you upload a zipped saved model of the keras model? Sorry I've never used autokeras before.
X_train, y_train, X_valid and y_valid have shapes (1056, 16), (1056,), (191, 16) and (191,) respectively, and all are numpy.ndarray (python==3.9.5, tensorflow==2.5.0, tf2onnx==1.10.0, onnxruntime==1.8.0). Creating the model is simple. I can attach the pickles of X_train, y_train, X_valid and y_valid; where should I upload them?
pip install autokeras==1.0.15
import onnxruntime
import tf2onnx
from autokeras import StructuredDataClassifier
akmodel = StructuredDataClassifier(max_trials=10)
akmodel.fit(x=X_train, y=y_train, validation_data=(X_valid, y_valid), epochs=100)
autoKeras_model = akmodel.export_model()
# Convert the exported Keras model to ONNX
onnx_model, _ = tf2onnx.convert.from_keras(autoKeras_model)
content = onnx_model.SerializeToString()
sess = onnxruntime.InferenceSession(content)
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
Can you upload the pickles to OneDrive/GoogleDrive/Dropbox and post a link? Are those all np.int32 or np.float32?
yes they are
https://drive.google.com/drive/folders/1HfB00dOuk-awSmIrSg92hmJFYzTpQNCr?usp=sharing
Uploaded to Google Drive; you can open them with:
import pickle
with open('filename','rb') as f: arrayname1 = pickle.load(f)
Great, I just requested access to the drive link.
I just ran the conversion and it works for me. The resulting model runs in ORT and produces results. However, my model does not contain AsString, so maybe I'm using a different version of autokeras. My converted onnx model looks like this:
Actually, as I said before, the model is created successfully and InferenceSession is created successfully in a plain Python file.
InferenceSession throws the error only in the Flask app.
Ah, so sorry. Didn't catch that. What version of onnxruntime does the flask application use?
Same kind of issue as in #1228
All same versions
> Same kind of issue as in #1228
Can you please elaborate on this? Are you getting a "Default value of table lookup must be const." error? Are you running the conversion code within flask too, or just onnxruntime? You can save both the keras saved model and the onnx model with:
ExportedautoKeras_model.save("autokerasmodel")
onnx_model, _ = tf2onnx.convert.from_keras(ExportedautoKeras_model, output_path="autokeras.onnx")
I find it very surprising that you get different results in flask. Is your flask running from a different virtualenv? Are you sure your autokeras version is the same?
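One way to settle the environment question is to print the interpreter path and package versions from inside both the plain script and the Flask process, then compare the two outputs. This is a minimal sketch using only the standard library; the function name and the default package list are illustrative assumptions.

```python
import sys
from importlib import metadata


def environment_report(packages=("autokeras", "tensorflow", "tf2onnx", "onnx", "onnxruntime")):
    """Return the interpreter details and installed version of each package (None if missing)."""
    report = {"python": sys.version.split()[0], "executable": sys.executable}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Logging `environment_report()` once at startup in each program makes a mismatched virtualenv or package version visible immediately.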
Yes, I am using the same model for the conversion to ONNX.
Inside Flask, I create the model, convert it, and try to get the session results all in one go.
I think it is very likely that the keras models you get in flask and the plain python script are different. Can you please add this line:
ExportedautoKeras_model.save("autokerasmodel")
and zip the results of the python and flask scripts?
Inside the Flask app I have two functions: one creates the model and passes it to the other, an onnxconverter function. I'm not sure whether that is an issue; I will now try putting both in the same function.
That should not be an issue. Again to confirm, are you using the same virtualenv for flask as the python script?
yes, python and flask are in same env
Is the training data you are using (X_train, y_train, X_valid, y_valid) the same values for both?
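A quick way to compare the training data between the two runs is to dump the arrays to files in each program (for example as pickles) and compare file digests. The sketch below assumes the data has already been written to disk; `file_digest` is a hypothetical helper. Matching digests show the files are byte-identical, while a mismatch only says the files differ and does not by itself prove the array values differ.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```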
Also, in the plain Python file the ONNX conversion and InferenceSession work, but when I run a prediction with the ONNX model it throws an error, like:
content = ONNXModel.SerializeToString()
sess = onnxruntime.InferenceSession(content)
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onnx = sess.run([label_name], {input_name: test_record})[0]
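Since the error text is not shown, one thing worth ruling out is a mismatch between test_record and what the session expects. A minimal sketch, assuming `sess` is an onnxruntime InferenceSession; `describe_inputs` is a hypothetical helper name.

```python
def describe_inputs(session):
    """List (name, element type, shape) for every input the session declares."""
    return [(arg.name, arg.type, arg.shape) for arg in session.get_inputs()]
```

Comparing the output of `describe_inputs(sess)` with `test_record.dtype` and `test_record.shape` shows whether the record needs a cast or a reshape before `sess.run`.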
Are you able to capture the keras saved model from flask?
Can you please confirm the prediction test.
yes i can
> Can you please confirm the prediction test.
I am able to successfully run predictions using the onnx model I have generated. The model is uploaded to the shared drive folder as autokeras_tw.onnx
> yes i can
Awesome. Please capture the keras saved models and converted onnx models for flask and the python script and upload them to the Google Drive folder as autokeras_flask.zip, autokeras_flask.onnx, autokeras_python.zip, autokeras_python.onnx. If I have those, I may be able to reproduce the issue. So far, I can't reproduce it at all.
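The saved-model directory can be zipped from Python itself, which avoids depending on tools outside the app. A minimal sketch with the standard library; `zip_saved_model` is a hypothetical helper, and "autokerasmodel" is the directory name used in the save call earlier in the thread.

```python
import shutil


def zip_saved_model(saved_model_dir, archive_stem):
    """Zip a saved-model directory and return the path of the created .zip file."""
    return shutil.make_archive(archive_stem, "zip", root_dir=saved_model_dir)
```

For example, `zip_saved_model("autokerasmodel", "autokeras_flask")` writes autokeras_flask.zip next to the working directory.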
I have uploaded "ONNXmodel.onnx" and "creditloan_prediction_20210806T210907" to the drive. Can you please try to create a session from either of the two? Both were created in Flask.
Both models give me the error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\tomwi\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 283, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Users\tomwi\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 310, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from C:\Users\tomwi\Downloads\ONNXModel.onnx failed:This is an invalid model. Error in Node:model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2 : No Op registered for LookupTableFindV2 with domain_version of 9
But I will need a saved model to diagnose the cause of the conversion failure. If you are not able to upload a saved model due to privacy/security concerns, I can try to walk through the debugging on your end, or we can wait for @guschmue who might have better luck reproducing the issue with autokeras.
Yes, that's the saved model from Flask, and that's the error. Regarding the files, I will have to create a separate one, because this one has many links to other files; I will send it once it is created. Yes, I am OK with the walkthrough; let me know how.
I have printed the ONNX model from the Flask run, saved it to a file "onnxfile.txt", and uploaded it to the drive, if that helps. I also uploaded the ONNX file for the same run, "flask_onnx_model.onnx"; this also throws the error. Thank you very much for the support, much appreciated.
So the issue is here: The LookupTableFindV2 op shouldn't be in the final model. It should be removed by: https://github.com/onnx/tensorflow-onnx/blob/04d24880751e4f753623d9097819e793e75962a9/tf2onnx/custom_opsets/onnx_ml.py#L67
Try to determine whether that handler is running and if it is entering that conditional.
Also is anything printed to the console during conversion, and are any exceptions raised? Change the log level before conversion with:
import logging
logging.basicConfig(level=logging.INFO)
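Conversion problems can surface only as log records at ERROR level while the call still returns a model, so it may help to collect those records and fail early instead of passing a broken model on to onnxruntime. This is a sketch under the assumption that tf2onnx's loggers are named under "tf2onnx" and propagate to it; `collect_log_errors` is a hypothetical helper.

```python
import logging
from contextlib import contextmanager


class _RecordCollector(logging.Handler):
    """Logging handler that stores every record it receives in a list."""

    def __init__(self, level):
        super().__init__(level)
        self.records = []

    def emit(self, record):
        self.records.append(record)


@contextmanager
def collect_log_errors(logger_name="tf2onnx", level=logging.ERROR):
    """Collect records at `level` or above from a logger and its children."""
    logger = logging.getLogger(logger_name)
    handler = _RecordCollector(level)
    logger.addHandler(handler)
    try:
        yield handler.records
    finally:
        logger.removeHandler(handler)


# Intended use (not executed here, since it needs tf2onnx and a model):
#     with collect_log_errors() as errors:
#         onnx_model, _ = tf2onnx.convert.from_keras(keras_model)
#     if errors:
#         raise RuntimeError(errors[0].getMessage())
```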
Here is the full error output from Flask, before the model is printed:
ERROR:tf2onnx.tf_loader:Could not find table resource to replace placeholder model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2/table_handle
INFO:tf2onnx.tfonnx:Using tensorflow=2.5.0, onnx=1.10.0, tf2onnx=1.10.0/04d248
INFO:tf2onnx.tfonnx:Using opset <onnx, 9>
WARNING:tf2onnx.shape_inference:Cannot infer shape for model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2: model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2:0
WARNING:tf2onnx.shape_inference:Cannot infer shape for model/multi_category_encoding/Cast_1: model/multi_category_encoding/Cast_1:0
INFO:tf2onnx.tf_utils:Computed 0 values for constant folding
WARNING:tf2onnx.onnx_opset.tensor:ONNX does not support precision, scientific and fill attributes for AsString
ERROR:tf2onnx.tfonnx:Failed to convert node 'model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2' (fct=<bound method LookupTableFind.version_8 of <class 'tf2onnx.custom_opsets.onnx_ml.LookupTableFind'>>)
'OP=LookupTableFindV2\nName=model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2\nInputs:\n\tmodel/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2/table_handle:0=Placeholder, [], 7\n\tmodel/multi_category_encoding/AsString:0=Cast, [-1, 1], 8\n\tmodel/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2/default_value:0=Const, [], 7\nOutpus:\n\tmodel/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2:0=None, 7'
Traceback (most recent call last):
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\tf2onnx\tfonnx.py", line 292, in tensorflow_onnx_mapping
func(g, node, **kwargs, initialized_tables=initialized_tables, dequantize=dequantize)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\tf2onnx\custom_opsets\onnx_ml.py", line 34, in version_8
utils.make_sure(shared_name is not None, "Could not determine table shared name for node %s", node.name)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\tf2onnx\utils.py", line 260, in make_sure
raise ValueError("make_sure failure: " + error_msg % args)
ValueError: make_sure failure: Could not determine table shared name for node model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2
INFO:tf2onnx.optimizer:Optimizing ONNX model
INFO:tf2onnx.optimizer:After optimization: Const -19 (29->10), Identity -2 (2->0)
This error occurs when creating the session:
ERROR:AutoMLWebApi:500 Internal Server Error: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2 : No Op registered for LookupTableFindV2 with domain_version of 9
Traceback (most recent call last):
File "C:\pl\AutoML\AutoMLIntuitionJuly2021py395elk7120\AutoMLWebApi.py", line 88, in autoML
result = AutoMLTrainer.train(tenant_id, data)
File "C:\pl\AutoML\AutoMLIntuitionJuly2021py395elk7120\AutoMLTrainer.py", line 903, in train
uploadToElastic(es,
File "C:\pl\AutoML\AutoMLIntuitionJuly2021py395elk7120\AutoMLUtils.py", line 3927, in uploadToElastic
sess = onnxruntime.InferenceSession(content)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 283, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 312, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2 : No Op registered for LookupTableFindV2 with domain_version of 9
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\flask\app.py", line 1513, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\flask\app.py", line 1499, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
File "C:\pl\AutoML\AutoMLIntuitionJuly2021py395elk7120\AutoMLWebApi.py", line 96, in autoML
abort(500, err)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\werkzeug\exceptions.py", line 940, in abort
_aborter(status, *args, **kwargs)
File "C:\Users\pl\Anaconda3\envs\AutoMLIntuitionAug2021py395elk7120\lib\site-packages\werkzeug\exceptions.py", line 923, in __call__
raise self.mapping[code](*args, **kwargs)
werkzeug.exceptions.InternalServerError: 500 Internal Server Error: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2 : No Op registered for LookupTableFindV2 with domain_version of 9
INFO:werkzeug:127.0.0.1 - - [07/Aug/2021 10:58:40] "POST /tenant_id/train HTTP/1.1" 500 -
> Could not determine table shared name for node
Oh haha, you should have led with that. Is that what you meant by "Same kind of issue as in #1228"? #1228 had a few error messages involved.
tf2onnx isn't finding the values for the lookup table.
Try this: make a saved model, then convert it from the command line with python -m tf2onnx.convert --saved-model mysavedmodel --output model.onnx
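If running a shell command by hand is impractical, the same command can be driven from Python inside a temporary directory, so nothing is left on disk afterwards. This is an untested sketch: it assumes the process can write to a temp directory, that `keras_model.save` writes a TF saved model for a path without an .h5 suffix (the TF 2.x default), and both helper names are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_convert_command(saved_model_dir, output_path):
    """Build the tf2onnx command line for the interpreter that is currently running."""
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", str(saved_model_dir),
        "--output", str(output_path),
    ]


def convert_via_cli(keras_model):
    """Save the model to a temporary directory, convert it, and return the ONNX bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        saved_model_dir = Path(workdir) / "mysavedmodel"
        output_path = Path(workdir) / "model.onnx"
        keras_model.save(str(saved_model_dir))
        # check=True raises CalledProcessError if the converter exits non-zero
        subprocess.run(build_convert_command(saved_model_dir, output_path), check=True)
        return output_path.read_bytes()
```

The returned bytes can be passed straight to `onnxruntime.InferenceSession`, as the thread already does with `SerializeToString()` output.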
Actually, this will be used for deployment and this procedure has to be followed, so making it work from the command line will not be useful.
Even saving the model to disk is not useful, because this will be deployed in a Docker container.
Is there a workaround for that, please?
Ok, then we need to find the data in the keras model's lookup table without saving it. The challenge here is that the lookup table is held inside the TF runtime and it can be challenging to extract it. Normally a second copy is stored on the keras model itself, and this method searches for it:
You'll need to find where the lookup table info is stored on the keras model.
OK, how do I do that?
Can you attach a debugger?
Does that mean running Flask in debug mode?
Are you using VSCode or another IDE?
yeah VSCode
Great. Launch the app in debug mode: https://code.visualstudio.com/docs/python/tutorial-flask#_run-the-app-in-the-debugger
Drop a breakpoint after model creation. Use the debug console to explore the model's attributes... sorry it's a bit tricky.
Yeah, here is the screenshot. What should I check there?
Do I need to step into from_keras() and go to this function def _get_hash_table_info_from_trackable(trackable, table_names, key_dtypes, value_dtypes, ?
One moment, I'm checking something.
In the model, you want to find the table's resource handle and dtype. The resource handle is key, since it lets you request the lookup table's contents from tensorflow. For mine, it is stored on the lookup table layer:
Going a few layers deep, I find:
You really shouldn't have to find it manually since we do a pretty comprehensive search, but autokeras must put it somewhere we don't expect.
If you find it, we can update our search to find it automatically for you based on the attributes it ends up on.
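Rather than clicking through the debugger by hand, the attribute search can be scripted from the debug console. The sketch below is a generic object walker, not tf2onnx's own search: it follows instance attributes and list or tuple items, and reports paths whose attribute name contains a keyword. The keyword "table" and the depth limit are assumptions, and values exposed only through properties will not be visited.

```python
def find_attribute_paths(root, keyword="table", max_depth=4):
    """Return dotted paths of attributes whose name contains `keyword` (case-insensitive)."""
    matches, seen = [], set()

    def visit(obj, path, depth):
        if depth > max_depth or id(obj) in seen:
            return
        seen.add(id(obj))  # guard against cycles and repeated objects
        if isinstance(obj, (list, tuple)):
            for index, item in enumerate(obj):
                visit(item, f"{path}[{index}]", depth + 1)
            return
        try:
            attributes = vars(obj)
        except TypeError:  # objects without __dict__, e.g. ints and strings
            return
        for name, value in list(attributes.items()):
            child = f"{path}.{name}"
            if keyword.lower() in name.lower():
                matches.append(child)
            visit(value, child, depth + 1)

    visit(root, "model", 0)
    return matches
```

Running `find_attribute_paths(keras_model)` at the breakpoint, and again with a larger `max_depth` if nothing turns up, gives candidate paths to report back.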
The code below works perfectly when run in a Python file (python==3.9.5, tensorflow==2.5.0, keras2onnx==1.7.0, onnxruntime==1.8.0, keras==2.4.3, tf2onnx==1.9.1)
With the same code inside a Flask app, InferenceSession throws an error.
If that's a converter bug, how should I find the correct opset? (I have tried opsets 9 to 13; all throw the error.) And why is that error not raised in the standalone run?
Any help please, thanks.