Closed: hanzigs closed this issue 3 years ago.
OK, I will try to find that and get back to you on whether these are those.
Those are output nodes. They are not tables. I might be able to improve our table loading. Give me 45 minutes.
Here are some error lines:
another one
Unfortunately what I tried didn't work. I still need either a shared_name or resource_handle for the tensorflow backend to give me the values stored in the table. The keras model must have them somewhere because it needs to pull those values to save the model. The existing search code actually traces the model save code to find the resources, but they must be overriding the defaults or something. Hard for me to say without seeing the model.
Should the key value for the search be "resource_handle"?
Or shared_name. Actually shared_name is slightly better. Ideally we want both. They are normally next to each other, and they might start with an underscore prefix ("_shared_name").
I've got another idea that I actually think will work but it's late so I'll try it tomorrow/Monday.
I found a few things: there are _shared_name and _handle_name attributes on each of the _undeduplicated_weights, with different names for each.
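For reference, a minimal sketch of the kind of reflective search involved: checking an object for attributes named _shared_name / _handle_name. The DummyTable class here is purely hypothetical, standing in for whatever keras object actually carries these fields; the real code would walk the model's tracked objects.

```python
# Hypothetical sketch: look for _shared_name / _handle_name attributes on an
# object. DummyTable stands in for the real keras-tracked table object.

def find_names(obj, keys=("_shared_name", "_handle_name")):
    """Return {attr_name: value} for any matching attributes found on obj."""
    found = {}
    for key in keys:
        value = getattr(obj, key, None)
        if value is not None:
            found[key] = value
    return found

class DummyTable:
    def __init__(self):
        self._shared_name = "table_1_shared"
        self._handle_name = "table_1_handle"

print(find_names(DummyTable()))
# {'_shared_name': 'table_1_shared', '_handle_name': 'table_1_handle'}
```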
If this is correct, I will try it on Monday:
pip uninstall tf2onnx
pip install git+https://github.com/onnx/tensorflow-onnx@tom/keras_hash_tables
The improved method grabs the resource handle from graph captures and makes up a shared_name if it fails to find one in the model.
That was perfect: it worked as expected. Thank you very much; the support is greatly appreciated.
Excellent. And @hanzigs have you confirmed that the model loads in onnx runtime and produces correct results when run on the validation data?
Yes, python model and onnx model reproduces the same expected results on validation data
Awesome! Thanks for helping us debug this.
Please let me know when this is available via a regular pip install tf2onnx.
May I know whether these warnings affect anything?
INFO:tf2onnx.tfonnx:Using tensorflow=2.5.0, onnx=1.10.0, tf2onnx=1.10.0/32d758
INFO:tf2onnx.tfonnx:Using opset <onnx, 9>
WARNING:tf2onnx.shape_inference:Cannot infer shape for model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2: model/multi_category_encoding/string_lookup_1/None_lookup_table_find/LookupTableFindV2:0
WARNING:tf2onnx.shape_inference:Cannot infer shape for model/multi_category_encoding/Cast_1: model/multi_category_encoding/Cast_1:0
INFO:tf2onnx.tf_utils:Computed 0 values for constant folding
WARNING:tf2onnx.onnx_opset.tensor:ONNX does not support precision, scientific and fill attributes for AsString
INFO:tf2onnx.optimizer:Optimizing ONNX model
INFO:tf2onnx.optimizer:After optimization: Const -20 (29->9), Identity -2 (2->0)
Sorry, there is a difference in the prediction results between the python model and the onnx model when run from Flask. But in a python file, both produce the same results. The above are the only warnings; no errors happen.
Awwww, I thought we had fixed this. The models from python and flask are different. Keep in mind that autokeras can choose very different model architectures depending on the data it is given. I think it is likely you are training the python and flask models on different data.
The "does not support precision, scientific and fill attributes for AsString" might or might not matter depending on how the lookup table is formatted. Can you upload the converted onnx model from flask again?
The warning means we can't handle all attributes of AsString(), i.e. instead of float 123. onnx would have float 123.000000. Not sure if it hurts in this case; it might if a CategoryMapper is behind it, because the lookup table would have the tf representation. int32 and int64 should be ok; float might run into issues. Not sure when autokeras starts using AsString(); in the examples I tried, it always used String_To_Number().
Thanks for that, uploaded the converted onnx model in the drive.
Regarding the different results, I actually meant that the python model in flask and the same model converted to onnx in flask give different prediction results.
But if I build the model in a python file and convert it to onnx, the prediction results are the same. I'm very confused about why that is.
Ah, do you get the same results between the flask keras model and the flask onnx model?
I think you are almost certainly getting different models in flask and python. I'm not sure why, but I suspect you are giving autokeras different input data or different args.
yeah, here
onnx_model, _ = tf2onnx.convert.from_keras(model)
This is inside Flask. The model is a python object, and its prediction result is 0.60987216; the onnx_model result is 0.5559953 for the same test data.
I'm not sure whether those warnings make any difference.
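As a quick sanity check on the two numbers reported above, the gap is far larger than float rounding noise, which suggests a genuine graph or lookup difference rather than precision loss (values copied from the message; the tolerance is an illustrative choice):

```python
import math

keras_pred = 0.60987216  # python model prediction reported above
onnx_pred = 0.5559953    # onnx model prediction reported above

# A float32 round-trip would differ around the 7th significant digit;
# this gap is ~0.054, so it points at a real difference in the graph.
print(math.isclose(keras_pred, onnx_pred, rel_tol=1e-5))  # False
print(abs(keras_pred - onnx_pred))
```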
The category mapper in the model looks like:
That's likely not going to work. That said, I find the whole thing a little strange since the result is immediately cast back to float. Seems highly unlikely that this lookup table is useful. @hanzigs are you using real testing data on this? Do you find that the TF model produces useful results on non-training data?
Also are you certain you are running the python script with the same data, args, and virtual environment as the flask app?
Regarding the same data, args, and virtual environment: yes, I'm sure about that, because this flask app has 7 models (keras seq, lgbm, xgb, randomforest, extratrees, decisiontree, and autokeras). The other 6 models work perfectly, and the data and args are passed the same way, so I'm sure those are correct in the flask app.
Regarding the cast back to float, I'm not sure what that is, but the testing data is correct.
May I understand what the problem with the above would be, please?
There is no complex code happening in autokeras:
akmodel = StructuredDataClassifier(max_trials=AK_Hyperparameters['max_trials'])
akmodel.fit(x=X_train, y=y_train, validation_data=(X_valid, y_valid), epochs=AK_Hyperparameters['epochs'])
autoKeras_model = akmodel.export_model()
Category fields are normalized using a WoE transformation and numeric fields using MinMaxScaler, separately. Is this an issue?
The testing data follows the same normalization steps for prediction.
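One common pitfall with this kind of pipeline is re-fitting the scaler on the test data instead of reusing the parameters fitted on the training data, which silently shifts every prediction. A minimal pure-Python sketch of the correct pattern (illustrative only, not the app's actual code):

```python
# Minimal min-max scaling sketch: fit on training data once, then reuse the
# SAME min/max for test data. Re-fitting on test data would change results.

def fit_minmax(values):
    return min(values), max(values)

def transform_minmax(values, lo, hi):
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [(v - lo) / span for v in values]

train = [10.0, 20.0, 30.0]
lo, hi = fit_minmax(train)                 # fitted once, on training data
print(transform_minmax(train, lo, hi))     # [0.0, 0.5, 1.0]
print(transform_minmax([25.0], lo, hi))    # [0.75] -- reuses the train min/max
```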
The Flask app is a bit complex to reduce to a miniature version because it is linked to an elasticsearch database at each and every step; that's why I can't send the flask app code.
The issue is that the input to the CategoryMapper (lookup table) comes from an AsString op in TF, which converts a number to a string. There is no corresponding op in ONNX, so we convert to a Cast, but that won't necessarily use the same precision. aka 0.0 becomes "0.0" not "0.00000". The lookup for 0.0 will return 1 not 5 and the results may be different. If we were doing int to string, it would be consistent, but float to string is more problematic.
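The effect described above can be reproduced with plain Python string formatting: if the table keys were serialized with one precision and the converted graph formats floats with another, the lookup silently falls through to the default value. The table contents here are made up for illustration.

```python
# Illustration of the AsString precision mismatch (hypothetical table values).
# The lookup table was built with 6-decimal string keys; a plain float->string
# cast produces a different key, so the lookup returns the default instead.

table = {"0.000000": 5, "1.000000": 7}  # keys as a fixed-precision AsString emits
default = -1

tf_key = f"{0.0:.6f}"   # "0.000000" -- matches the table key
cast_key = str(0.0)     # "0.0"      -- what a plain cast-to-string yields
print(table.get(tf_key, default))    # 5
print(table.get(cast_key, default))  # -1: key miss, default returned
```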
Is the data from the database used to train the autokeras model?
Yes it is
How does the python script get the data then? What data does it use?
It uses the python elasticsearch client to pull the data.
Can you pickle the data from each and compare that they are identical?
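A minimal sketch of that comparison: dump the training inputs from each app and diff the pickles. The file names, variable names, and stand-in data below are placeholders, not the app's actual code.

```python
import pickle

# In each app, right before akmodel.fit(...), you could dump the inputs:
#   with open("python_data.pkl", "wb") as f:
#       pickle.dump((X_train, y_train), f)
# Then load both dumps and compare. Shown here with stand-in data.

def identical(a, b):
    # Quick byte-level check; for pandas DataFrames prefer df.equals,
    # since equal frames can occasionally pickle to different bytes.
    return pickle.dumps(a) == pickle.dumps(b)

python_data = ([1.0, 2.0], [0, 1])  # stand-in for the script's (X_train, y_train)
flask_data = ([1.0, 2.0], [0, 1])   # stand-in for the flask app's data

print(identical(python_data, flask_data))  # True only if the data matches
```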
Yes, I can.
The prediction testing happens from Postman.
But if I build the model step by step in a python file, calling only functions of the flask app, and then test it, it works fine.
Anyway, I will check that. Thanks for the support, much appreciated. You can close this ticket.
Hi @TomWildenhain-Microsoft, I have uploaded 4 models to the drive; 3 of them are named tf2onnx.... Is it possible to check whether the CategoryMapper cast back to float is in those models? All of these models give perfect results, but they were built from a python file using functions from the Flask app.
The 4th model, which has the CategoryMapper, was built from the flask app and does not give correct results. Thanks
I visualized the models in netron and couldn't find a CategoryMapper in the 3 models; not sure why it's not there.
The model created with flask has it.
I'm not sure what difference the Flask app makes, or why the CategoryMapper shows up in the flask model but not in the python file model.
Hi @TomWildenhain-Microsoft, is the 'tom/keras_hash_tables' branch no longer available for installation? Thanks
Hi, I added a Colab notebook with data and the autokeras model building showing the prediction difference (shared in the drive): https://colab.research.google.com/drive/1DqlJgGZuKf5nev9G6Do7DYEMEU4aAQhy
https://drive.google.com/drive/folders/1HfB00dOuk-awSmIrSg92hmJFYzTpQNCr?usp=sharing
Let me know whether it's possible to convert. Thanks
The code below works perfectly when run in a python file (python==3.9.5, tensorflow==2.5.0, keras2onnx==1.7.0, onnxruntime==1.8.0, keras==2.4.3, tf2onnx==1.9.1).
With the same code inside the Flask App, InferenceSession throws an error.
If that's a converter bug, how should I find the correct opset? (I have tried opsets from 9 to 13; all throw errors.) And why is that error not raised in a standalone run?
Any help please? Thanks