Deeplite / neutrino

Public documentation of Deeplite Neutrino™ engine.
Creative Commons Zero v1.0 Universal

Cannot use latency-optimised model? #18

Closed BhandarkarPawan closed 2 years ago

BhandarkarPawan commented 2 years ago

I am trying to perform latency optimisation on a Unet Segmenter model.

I was successful in optimising and predicting with this model in compression mode, which took about 6 hours. That process went well, and I got a model reduced from 56 MB to 9 MB.

So I wanted to try the latency mode and added "optimization": "latency" to the config dict.
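For reference, the change amounts to a single key in the config dict. This is only an illustrative fragment; the surrounding keys and the exact schema come from the Neutrino docs, not from this issue:

```python
# Illustrative config fragment; only the "optimization" key is the
# change described above. All other keys stay as in the compression run.
config = {
    "optimization": "latency",  # previously the compression mode was used
    # ...rest of the existing config unchanged
}
```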

The process took 21.5 hours and displays the following logs at the end:

2022-03-09 16:47:50 - INFO: Exporting to ONNX format
Exporting model from ModelFormat.PYTORCH to ModelFormat.ONNX
PyTorch2ONNX saved model in ModelFormat.ONNX format at path '/home/pawan/unet-segmenter/src/opt_model.onnx'
2022-03-09 16:47:51 - INFO: Exporting to pkl format
2022-03-09 16:47:51 - ERROR: Unfortunately this model cannot be pickled
2022-03-09 16:47:51 - INFO: Exporting to ONNX format
Exporting model from ModelFormat.PYTORCH to ModelFormat.ONNX
PyTorch2ONNX saved model in ModelFormat.ONNX format at path '/home/pawan/unet-segmenter/src/second_best_model.onnx'
2022-03-09 16:47:52 - INFO: Exporting to pkl format
2022-03-09 16:47:52 - ERROR: Unfortunately this model cannot be pickled
2022-03-09 16:47:52 - INFO: The engine successfully optimized your reference model, enjoy!
2022-03-09 16:47:52 - INFO: Job with ID 762689AF finished
2022-03-09 16:47:52 - INFO: Total execution time: 21:30:42 (d, hh:mm:ss)
2022-03-09 16:47:52 - INFO: Log has been exported to: /home/pawan/.neutrino/logs/762689AF-2022-03-08.elog

The docs do not mention how to use the ONNX file, and the Neutrino Pickle file is not generated, which means I am unable to use this model.

The log file is encrypted as usual so I cannot check.

I have copied the complete logs here. Please help me understand how to fix this. Thank you.

BhandarkarPawan commented 2 years ago

Following up on this issue, any updates would be appreciated.

yasseridris commented 2 years ago

@BhandarkarPawan Unfortunately, we currently do not support exporting an optimized model from latency mode to pickle format (we have an open issue for that). Your only option for now is to load and run the model using ONNX Runtime.

yasseridris commented 2 years ago

@BhandarkarPawan Also, note that the Neutrino.run method returns a PyTorch model object that you can use directly, in the unlikely case that you want to use it in the same Python process right after optimization.