Open Samyak2 opened 2 years ago
The source I was using had only 5000 words (for free). Added 500-5000 words lists in 7c049c5acdf736958f4db155811edfaa9c9cdb8c, which is coming in v0.4.0.
I think the word list size is misleading - textgen.rs only looks at words between 2 and 8 characters. There are 927 words in the 5000 wordlist, for example, that didn't meet these criteria (925/927 were longer than 8 chars). Maybe you could allow word size preference to be specified as a parameter (but default to between 2 and 8)?
Good catch! The 2 to 8 chars filter was quite arbitrary. --min-length and --max-length flags to specify this would be nice, although that will require a bit of work to make the RawWordSelector store the ToipeConfig too.
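To make the suggestion concrete, here is a minimal sketch of a length filter with configurable bounds. It is not toipe's actual code: `LengthBounds` and `partition_words` are hypothetical names, the bounds are assumed to be inclusive, and the defaults simply mirror the 2 to 8 range mentioned above.

```rust
/// Hypothetical length bounds for word selection (not part of toipe).
/// Both ends are assumed to be inclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LengthBounds {
    pub min: usize,
    pub max: usize,
}

impl Default for LengthBounds {
    /// Defaults mirror the 2 to 8 character filter discussed in this thread.
    fn default() -> Self {
        Self { min: 2, max: 8 }
    }
}

impl LengthBounds {
    /// Returns true when the word's character count lies within [min, max].
    pub fn accepts(&self, word: &str) -> bool {
        let len = word.chars().count();
        len >= self.min && len <= self.max
    }
}

/// Splits a newline-separated word list into (kept, rejected) words,
/// skipping blank lines.
pub fn partition_words<'a>(
    list: &'a str,
    bounds: LengthBounds,
) -> (Vec<&'a str>, Vec<&'a str>) {
    list.lines()
        .map(str::trim)
        .filter(|w| !w.is_empty())
        .partition(|w| bounds.accepts(w))
}
```

Calling `partition_words` with `LengthBounds::default()` would also give a quick way to count how many words of a list are silently dropped, like the 927 reported for the 5000 word list.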
What and why?
Currently, the only built-in word list is the top 250 words list. This is very limiting, as words will often repeat within the same line and multiple times throughout a test.
It would be nice to have these word lists too:
How?
More info about the existing word list: https://docs.rs/toipe/latest/toipe/wordlists/constant.TOP_250.html
The word list needs to be added in this directory: https://github.com/Samyak2/toipe/tree/main/src/word_lists
and it needs to be listed here: https://github.com/Samyak2/toipe/blob/main/src/wordlists.rs
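As a rough illustration of the two steps above (add the list, then list it), here is a sketch of how a built-in list could be declared and looked up by name. Everything in it is an assumption for illustration: `SAMPLE_TOP_5`, `BuiltInList`, and the name "sample5" are hypothetical and do not reflect the actual contents of wordlists.rs.

```rust
/// Stand-in for an embedded word list, one word per line. A real list would
/// more likely be loaded from a file in the word list directory at compile
/// time (for example with `include_str!`); it is inlined here so the sketch
/// is self-contained.
pub const SAMPLE_TOP_5: &str = "the\nof\nand\nto\nin";

/// Hypothetical registry of built-in lists. Adding a new list would mean
/// adding a variant and a match arm in each method below.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuiltInList {
    SampleTop5,
}

impl BuiltInList {
    /// Maps a user-facing name to a built-in list, if one exists.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "sample5" => Some(Self::SampleTop5),
            _ => None,
        }
    }

    /// Returns the raw, newline-separated contents of the list.
    pub fn contents(self) -> &'static str {
        match self {
            Self::SampleTop5 => SAMPLE_TOP_5,
        }
    }

    /// Iterates over the individual words of the list.
    pub fn words(self) -> impl Iterator<Item = &'static str> {
        self.contents().lines()
    }
}
```

Keeping the lookup in one `match` means the compiler flags any variant that has been added without its contents being listed.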