Open dax-westerman opened 1 month ago
The terms are not enumerated from 0, but we can do that. Right now they are grouped in chunks of 1,000, but it's more than fine to enumerate them if we have a system.
As for the second point, I should have left that as a separate "idea" rather than including it as part of this effort, so sorry for any confusion. It was an area of exploration I'd wanted to include as a potential means of managing the dictionary, one that would also provide a method for validation. I'm going to strike it through to keep the issue clean :)
Thanks!
Need to remove the comma entry from res/dicts/dict.txt.
This involves the following steps:
A more automated mechanism might leverage a pandas DataFrame, in order to avoid manual manipulation:
- A common load method (pandas.read_csv)
- A common means of updating using DataFrame/Series methods
- A common means of validating using DataFrame masks and a framework to eval
- A common means to persist the output
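
To make the idea concrete, here is a rough sketch of what the load / validate / update / persist flow could look like. It rests on assumptions that are not confirmed anywhere in this issue: that dict.txt holds one term per line, that no term contains a tab, and that the "comma entry" is a line consisting only of a comma. The function names (load_terms, comma_mask, drop_comma_entries, save_terms) are hypothetical, not existing project code.

```python
import csv
import io

import pandas as pd


def load_terms(source) -> pd.DataFrame:
    """Load one term per line into a single-column DataFrame."""
    return pd.read_csv(
        source,
        sep="\t",                # assumption: terms never contain a tab
        header=None,
        names=["term"],
        dtype=str,
        quoting=csv.QUOTE_NONE,  # treat quote characters as literal text
        na_filter=False,         # keep words such as "NA" or "null" as strings
    )


def comma_mask(terms: pd.DataFrame) -> pd.Series:
    """Boolean mask: True where the entry is nothing but a comma."""
    return terms["term"].str.strip() == ","


def drop_comma_entries(terms: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the terms without the rows flagged by comma_mask."""
    return terms.loc[~comma_mask(terms)].reset_index(drop=True)


def save_terms(terms: pd.DataFrame, target) -> None:
    """Write one term per line to an open text stream."""
    # Plain line-by-line output sidesteps CSV quoting rules entirely.
    target.write("\n".join(terms["term"]) + "\n")
```

Usage would be along the lines of opening res/dicts/dict.txt, passing it to load_terms, checking comma_mask(terms).sum() to see how many entries are flagged, and then writing the result of drop_comma_entries back with save_terms. Other validation rules could be added as further masks and combined with `|`.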