Closed by datalee 3 years ago
In my test, converting a PyTorch Retriever model to ONNX and deploying it in a CPU environment raised TPS from 5 to 90+, and cut latency from 2s to 700ms (max). If haystack could automatically load the ONNX model directly, that would be a significant performance improvement.
Hey datalee, that would indeed be a cool feature, especially because DPR can be rather fast and a small query encoder could actually be useful on a CPU-only server.
The feature is not implemented yet; for this we would need the convert_to_onnx method defined in the AdaptiveModel implemented in the BiAdaptiveModel. Would you like to work on this and contribute?
I use the transformers function:
from transformers.convert_graph_to_onnx import convert
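For context, a minimal sketch of calling that converter for the DPR query encoder might look like the following. The model name and output path are illustrative placeholders, and the `convert_graph_to_onnx` module must exist in your installed transformers version (it was deprecated in later releases):

```python
# Hedged sketch: exporting a DPR query encoder via transformers'
# graph converter. Model name and output path are illustrative.
from pathlib import Path

from transformers.convert_graph_to_onnx import convert

if __name__ == "__main__":
    convert(
        framework="pt",  # export from the PyTorch graph
        model="facebook/dpr-question_encoder-single-nq-base",
        output=Path("onnx/query_encoder.onnx"),
        opset=11,
    )
```

The exported graph can then be served with ONNX Runtime on CPU, which is where the reported speedup came from.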
Sure, we also use that function in our AdaptiveModel with some adjustments.
Does the function work for DPR models? If so, would you like to contribute the functionality to FARM?
Yes, it works, but I only tested the query encoder model.
Nice one, so would you like to implement the convert_to_onnx functionality into our BiAdaptiveModel? As you said you might need to separate query and passage encoder conversion for it...
Hey @datalee, would you like to contribute? Otherwise I think we can close the issue, since you were able to convert the models to ONNX, right?
Question: how to convert a DPR Retriever model to ONNX? Thanks!