praeclarum / transformers-js

Browser-compatible JS library for running language models
MIT License

Tokenizer vs SentencePiece: Implementation Similarity and Converting sentencepiece.model to JSON #7

Closed (tylike closed 4 months ago)

tylike commented 1 year ago

Hi, is the tokenizer implementation the same as Google's SentencePiece? For example, will the same input produce the same output when calling encode? If so, how can I convert a sentencepiece.model file to a JSON file?

Thank you.
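
For context on the conversion part of the question, here is a minimal sketch of dumping a model's vocabulary (pieces and scores) to JSON, assuming the official `sentencepiece` Python package is installed. The helper names `dump_vocab` and `convert` and the JSON layout are hypothetical, chosen only for illustration; this is not necessarily the format this library expects, and it does not carry over the model type or normalization rules, so it says nothing about whether encode would give identical output.

```python
import json


def dump_vocab(sp, out_path):
    """Write the pieces and scores of a loaded model to a JSON file.

    `sp` is assumed to behave like sentencepiece.SentencePieceProcessor,
    i.e. to provide get_piece_size(), id_to_piece() and get_score().
    The {"vocab": [...]} layout is an assumption, not a known target format.
    """
    vocab = [
        {"id": i, "piece": sp.id_to_piece(i), "score": sp.get_score(i)}
        for i in range(sp.get_piece_size())
    ]
    with open(out_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps the "▁" word-boundary marker readable
        json.dump({"vocab": vocab}, f, ensure_ascii=False, indent=2)
    return vocab


def convert(model_path, out_path):
    """Load a .model file with the sentencepiece package and dump its vocabulary."""
    import sentencepiece as spm  # imported lazily; third-party dependency

    sp = spm.SentencePieceProcessor(model_file=model_path)
    return dump_vocab(sp, out_path)
```

Calling `convert("sentencepiece.model", "sentencepiece.json")` would then write one entry per piece, in id order.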