Closed aaron-sandoval closed 5 months ago
I couldn't come up with a better name for MazeTokenizer2. The best idea that Claude had IMO was MazeTokenizerPlus, but this seemed no better and no worse to me. Bigger name changes like MazePreprocessor seemed to obfuscate that this is a direct replacement of MazeTokenizer. Lmk if you have a better idea.
maze_dataset._load_tokenizer_element for saving/reading is at least a little bit wrong. It should probably use zanj.loading.load_item_recursive to properly leverage zanj and ensure scalability to non-primitive args.
This PR deprecates MazeTokenizer and adds a substitute architecture which is more powerful and easier to build upon to create new tokenizers.
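For readers unfamiliar with the deprecation pattern, here is a minimal sketch of how an old class can keep working while steering callers to its replacement. It is not the code in this PR: OldTokenizer and NewTokenizer are hypothetical stand-ins, and having the old class subclass the new one is an assumption made only to keep the example short.

```python
import warnings


class NewTokenizer:
    """Hypothetical stand-in for the replacement tokenizer."""

    def tokenize(self, text: str) -> list[str]:
        # Placeholder behavior: split on whitespace.
        return text.split()


class OldTokenizer(NewTokenizer):
    """Hypothetical stand-in for the deprecated tokenizer."""

    def __init__(self) -> None:
        # Warn at construction time; stacklevel=2 attributes the warning
        # to the caller's line rather than to this __init__.
        warnings.warn(
            "OldTokenizer is deprecated; use NewTokenizer instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__()
```

Existing call sites keep working, and the warning shows up in test runs or when Python is started with -W default, which gives downstream users time to migrate.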