This PR introduces ONNX support to the following annotators (see the loading sketch after this list):

- InstructorEmbeddings
- DistilBertForZeroShotClassification
- RobertaForZeroShotClassification
- BartTransformer
- GPT2
- XlmRobertaForZeroShotClassification
- HubertForCTC
- Wav2Vec2ForCTC
- BartForZeroShotClassification
- DebertaForZeroShotClassification
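For reference, a minimal PySpark sketch of loading an ONNX-exported model into one of these annotators might look like the snippet below. The export folder path and candidate labels are illustrative assumptions, not part of this PR:

```python
import sparknlp
from sparknlp.annotator import DistilBertForZeroShotClassification

spark = sparknlp.start()

# "./onnx_model" is a placeholder path to a model exported to ONNX
# (for example with Hugging Face Optimum); labels are only for illustration.
zero_shot_classifier = (
    DistilBertForZeroShotClassification
    .loadSavedModel("./onnx_model", spark)
    .setInputCols(["document", "token"])
    .setOutputCol("class")
    .setCandidateLabels(["sports", "politics", "technology"])
)
```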
This PR introduces OpenVINO support to the following annotators:

- WhisperForCTC
- AlbertForSequenceClassification
- AlbertForTokenClassification
- AlbertForQuestionAnswering
- BartForZeroShotClassification
- BertForQuestionAnswering
- BertForSequenceClassification
- BertForTokenClassification
- BertForZeroShotClassification
This PR introduces ONNX support to InstructorEmbeddings. We use a slightly different approach than the TensorFlow implementation, but it is similar to what the Python Instructor library does: they take the average of the token embeddings and apply a linear activation to the output tensor before normalizing. We do the same, except we skip the linear activation. (Our average-pooling implementation also produces slightly different results than theirs.)
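A minimal numpy sketch of the pooling described above (names and shapes are illustrative; `token_embeddings` stands for the last hidden state returned by the ONNX session):

```python
import numpy as np

def pool_and_normalize(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling over token embeddings followed by L2 normalization.

    token_embeddings: (batch, seq_len, hidden) last hidden state from the ONNX session.
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding.
    """
    mask = attention_mask[..., np.newaxis].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                         # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                         # avoid divide-by-zero
    mean_pooled = summed / counts
    # Unlike the Python Instructor library, no linear (dense) layer is applied here.
    norms = np.linalg.norm(mean_pooled, axis=1, keepdims=True)
    return mean_pooled / np.clip(norms, 1e-9, None)
```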