ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Refactor SentencePieceTokenizer #4032

Open mhabedank opened 1 month ago

mhabedank commented 1 month ago

The SentencePieceTokenizer currently depends on torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.
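
One possible direction, as a rough sketch only: call the standalone `sentencepiece` Python package directly instead of going through torchtext. The class name `SentencePieceTokenizerSketch`, its constructor arguments, and the injectable `processor` parameter are hypothetical and do not reflect Ludwig's actual tokenizer interface. The sketch also assumes the caller already has a local SentencePiece model file.

```python
from typing import Any, List, Optional


class SentencePieceTokenizerSketch:
    """Hypothetical torchtext-free tokenizer; not Ludwig's actual class."""

    def __init__(self, model_path: Optional[str] = None, processor: Any = None):
        # `processor` is an assumption added for testability: any object with an
        # `encode(text, out_type=str)` method can be injected.
        if processor is None:
            if model_path is None:
                raise ValueError("Provide either model_path or processor.")
            # Imported lazily so sentencepiece is only required when a model
            # file is actually loaded.
            import sentencepiece as spm

            processor = spm.SentencePieceProcessor(model_file=model_path)
        self._processor = processor

    def __call__(self, text: str) -> List[str]:
        # out_type=str asks sentencepiece for subword pieces rather than ids.
        return self._processor.encode(text, out_type=str)
```

Usage would look roughly like `SentencePieceTokenizerSketch(model_path="path/to/spm.model")("some text")`, where the path is a placeholder. How the model file is obtained and whether the tokenizer must stay TorchScript-compatible are open questions this sketch does not address.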