chainyo / transformers-pipeline-onnx

How to export Hugging Face's 🤗 NLP Transformers models to ONNX and use the exported model with the appropriate Transformers pipeline.

Overload pipeline of model hosted on Triton server #4

Open leopra opened 1 year ago

leopra commented 1 year ago

I'm quite confused about how to implement this. Once I've converted the NER model to ONNX, I want to deploy it to a Triton server. The issue is that I'd like the full inference to happen on the Triton server, but it looks like I can only receive the logits and then compute the entities locally. Is there a way to "send" the overloaded TokenClassificationPipeline to the Triton server so that the Triton inference call directly returns the dictionary of entities? For concreteness, here's a rough sketch of what I'm imagining: a Triton Python backend `model.py` that runs the whole pipeline server-side and returns the entities as JSON (the model path, tensor names, and aggregation strategy below are just placeholders, not anything from this repo):
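
```python
# model.py — sketch of a Triton Python backend model that wraps a full
# Hugging Face NER pipeline, so entity post-processing happens server-side.
# Tensor names ("TEXT", "ENTITIES") and the model path are hypothetical.
import json

import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import pipeline


class TritonPythonModel:
    def initialize(self, args):
        # Load the complete pipeline (tokenizer + model + entity aggregation)
        # inside the backend instead of serving only the raw model.
        self.ner = pipeline(
            "ner",
            model="/models/ner/1/hf_model",
            aggregation_strategy="simple",
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # "TEXT" would be a BYTES input tensor holding UTF-8 strings.
            texts = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            texts = [t.decode("utf-8") for t in texts.flatten()]

            # Run the full pipeline; this already returns entity dictionaries.
            entities = self.ner(texts)

            # Serialize to JSON so the client gets entities, not logits.
            # default=str handles numpy float32 scores in the output.
            out = np.array(
                [json.dumps(e, default=str) for e in entities], dtype=object
            )
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[pb_utils.Tensor("ENTITIES", out)]
                )
            )
        return responses
```

Is something along these lines the intended way to do it, or is there a better way to reuse the overloaded pipeline from this repo on Triton?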