pkuzqh / Recoder

MIT License
Which pkl is used for tokenization? #21

Open guoweijun137 opened 6 months ago

guoweijun137 commented 6 months ago

Hello! I found your work exceptionally insightful and engaging. I noticed that there are three pickle files in your project: char_voc.pkl, code_voc.pkl, and nl_voc.pkl. Which of these files is used for tokenization in the code readers?

pkuzqh commented 4 months ago

char_voc.pkl and nl_voc.pkl are used to tokenize code for code readers.
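
For readers who want to inspect these files, here is a minimal sketch of loading a pickled vocabulary and mapping tokens to ids. It is not taken from the Recoder source: it assumes each .pkl holds a plain dict of token to integer id, and the helper names `load_vocab` and `encode` and the `<unk>` token are hypothetical. Check the actual structure of the files before relying on it.

```python
import pickle


def load_vocab(path):
    """Load a pickled vocabulary.

    Assumption: the file contains a dict mapping token (str) -> id (int).
    Only unpickle files from a source you trust.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def encode(tokens, vocab, unk_token="<unk>"):
    """Map tokens to ids, falling back to the unknown-token id.

    `unk_token` is a hypothetical name; if it is absent from the
    vocabulary, 0 is used as the fallback id.
    """
    unk_id = vocab.get(unk_token, 0)
    return [vocab.get(tok, unk_id) for tok in tokens]
```

Under these assumptions, `encode(tokens, load_vocab("nl_voc.pkl"))` would give word-level ids, and the same call with `char_voc.pkl` on a list of characters would give character-level ids.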