facebookresearch / fastText

Library for fast text representation and classification.
https://fasttext.cc/
MIT License
25.85k stars 4.71k forks

Parametrizing MAX_VOCAB_SIZE #1218

Open aheyman11 opened 3 years ago

aheyman11 commented 3 years ago

fastText has a hard-coded MAX_VOCAB_SIZE constant, which can be a serious limitation for users whose corpus contains a very large number of distinct words. It would be great to parametrize this value so that users who can give fastText more memory can learn a larger vocabulary.
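
To make the request concrete, here is a minimal Python sketch of a vocabulary whose cap is a constructor argument rather than a compile-time constant. It is not fastText's code: the class name `CappedVocab`, the `max_vocab_size` and `load_factor` parameters, and the pruning rule (drop the rarest words once the table passes a fill level) are all assumptions made for illustration.

```python
from collections import Counter


class CappedVocab:
    """Toy vocabulary with a configurable size cap (hypothetical, not fastText's API)."""

    def __init__(self, max_vocab_size, load_factor=0.75):
        if max_vocab_size <= 0:
            raise ValueError("max_vocab_size must be positive")
        self.max_vocab_size = max_vocab_size
        self.load_factor = load_factor  # assumed fill level that triggers pruning
        self.counts = Counter()
        self.min_count = 0  # words seen this many times or fewer have been pruned

    def add(self, word):
        self.counts[word] += 1
        # Once the table is too full, raise the frequency threshold and
        # drop every word at or below it, repeating until the table fits.
        while len(self.counts) > self.load_factor * self.max_vocab_size:
            self.min_count += 1
            self.counts = Counter(
                {w: c for w, c in self.counts.items() if c > self.min_count}
            )

    def __len__(self):
        return len(self.counts)


words = ["a", "a", "b", "c", "d"]
small = CappedVocab(max_vocab_size=4)    # tight cap: rare words get pruned
large = CappedVocab(max_vocab_size=100)  # generous cap: every word is kept
for w in words:
    small.add(w)
    large.add(w)
print(len(small), len(large))
```

With the tight cap, the single-occurrence words are discarded as soon as the table fills up, while the larger cap retains all of them. That is the trade-off a user-settable limit would expose: more memory in exchange for a larger learned vocabulary.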