deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0
17.73k stars 1.92k forks

Is it possible to combine Intel OpenVino and haystack for inference? #365

Closed laifuchicago closed 4 years ago

laifuchicago commented 4 years ago

To Author: Currently our company is trying to shorten inference time when running on CPU. One of our team members had the idea of combining Intel OpenVINO and Haystack. The question is: would it then still be possible to integrate with Elasticsearch?

Intel OpenVino Toolkit https://docs.openvinotoolkit.org/latest/index.html

Jonathan Sung

tholor commented 4 years ago

Hi @laifuchicago ,

We currently don't support OpenVino. You would need to implement a new "OpenVinoReader" class. Have you tried using ONNX models? This is also considerably faster on CPU than pure PyTorch. We already support it in FARM and it would be easier to add in Haystack than OpenVino.

If you want to go forward with OpenVino: I haven't used it yet, but it seems that you can convert models from ONNX to their format (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_ONNX.html).

laifuchicago commented 4 years ago

To Author: I haven't tried ONNX yet. Is it also faster than PyTorch when using a GPU? I mean ONNX (GPU) vs. PyTorch (GPU).

Jonathan Sung

tholor commented 4 years ago

Yes, there is a small improvement on GPUs, too. In our last benchmark, ONNX was about 20-30% faster (when you apply these optimizations). We plan to publish a dedicated benchmarking website with speed + accuracy numbers for Haystack soon. It might be worth adding numbers for ONNX there later, too.

laifuchicago commented 4 years ago

To Author: (1) If I want to convert a PyTorch model (xlm-roberta) to ONNX, is there any reference? How should I set the parameters such as the dummy input (input_ids, token_type_ids and attention_mask)? The figure is my XLM-RoBERTa trained by FARM. [screenshot: ask deepset1]

The following code is my sample (PyTorch to ONNX):

```python
import torch

model_onnx_path = "model.onnx"

# The inputs "input_ids", "token_type_ids" and "attention_mask"
# are torch tensors of shape [batch, seq_len].
dummy_input = (input_ids, token_type_ids, attention_mask)
input_names = ["input_ids", "token_type_ids", "attention_mask"]
output_names = ["output"]

# Convert the model to ONNX
torch.onnx.export(model, dummy_input, model_onnx_path,
                  input_names=input_names,
                  output_names=output_names,
                  verbose=False)
```

(2) If we use ONNX, will Haystack still be able to retrieve documents through Elasticsearch?

Thank you Jonathan Sung

tholor commented 4 years ago

Hey @laifuchicago,

I saw that you also opened an issue in FARM. Let's continue the discussion there and close this one here to avoid duplicate threads.