ml-explore / mlx-data

Efficient framework-agnostic data loading
MIT License

Fix docs re: tokenization #21

Closed andersonbcdefg closed 10 months ago

andersonbcdefg commented 10 months ago

The tokenization example assumes that mlx.data.tokenizer_helpers.read_trie_from_spm returns only a CharTrie; however, it also returns weights. I updated the docs to reflect that it returns a tuple. Feel free to remove the part where I pass the weights along to the Tokenizer; I'm not sure that's correct.
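
For illustration, here is a minimal sketch of the difference between treating the return value as a single object and unpacking it as a tuple. It does not import mlx-data: read_trie_from_spm_stub is a hypothetical stand-in, and the dict and list it returns are placeholders for the real CharTrie and weights. Whether the weights should then be forwarded to the Tokenizer is left open, as noted above.

```python
# Hypothetical stand-in for mlx.data.tokenizer_helpers.read_trie_from_spm.
# Assumption (from this PR): the real helper returns a (trie, weights) tuple.
def read_trie_from_spm_stub(spm_file):
    trie = {"hello": 0, "world": 1}  # placeholder for a CharTrie
    weights = [-1.5, -2.0]           # placeholder per-token weights
    return trie, weights


# Pattern the old docs implied: the whole return value is used as the trie.
# Here `result` is actually a tuple, so passing it on as a trie would be wrong.
result = read_trie_from_spm_stub("tokenizer.model")
result_is_tuple = isinstance(result, tuple)

# Updated pattern: unpack both values explicitly.
trie, weights = read_trie_from_spm_stub("tokenizer.model")
```

With the unpacked form, trie can be handed to whatever expects a CharTrie, and weights can be used or ignored separately.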