elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Add support for DeBERTa-V2 tokenizer #717

Closed maxhniebergall closed 1 month ago

maxhniebergall commented 3 months ago

This PR adds support for uploading models based on DeBERTa-V2 and V3.

Unfortunately, there is currently no way to test this, due to an incompatibility between the Hugging Face DeBERTa model and our PyTorch implementation: https://github.com/huggingface/transformers/issues/20815