All infrastructure and work related to dictionaries. In the medium term this delivers miscellaneous dictionaries such as VRI and DPD. Long term: an Enhanced Digital Pāli-LanguageX dictionary.
Formal data integrity tests for the information contained in
Pāli English Dictionary-full.csv and Pāli English Dictionary-vocab.csv
A few thoughts. It must be possible to:
run offline, so data can be cleaned before uploading to GitHub
add new tests quite easily, as this happens on a regular basis. Being able to use regex would help, as I understand it.
run hard tests, which fail, and soft tests, which just show results
keep exclusion lists for known irregular words
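To make the requirements above concrete, here is a minimal sketch of how such a runner could look. It is only an illustration under assumptions: the column names (`headword`, `pos`, `meaning`), the two sample rules, the sample rows and the `Rule` / `run_rules` names are all hypothetical and not taken from the real CSV files or from the existing Anki filters.

```python
import csv
import io
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One integrity test: flags rows where `column` matches `pattern`."""
    name: str
    column: str          # hypothetical column name
    pattern: str         # regex describing an *invalid* value
    hard: bool = True    # hard rules should fail the run; soft rules only report
    exclude: set = field(default_factory=set)  # known irregular headwords


def run_rules(rows, rules, key="headword"):
    """Return (hard_hits, soft_hits), each a list of (rule name, headword)."""
    hard_hits, soft_hits = [], []
    for row in rows:
        for rule in rules:
            if row[key] in rule.exclude:
                continue  # skip words on this rule's exclusion list
            value = row.get(rule.column) or ""
            if re.search(rule.pattern, value):
                target = hard_hits if rule.hard else soft_hits
                target.append((rule.name, row[key]))
    return hard_hits, soft_hits


# Made-up sample data standing in for one of the CSV files.
SAMPLE = (
    "headword,pos,meaning\n"
    "dhamma,masc,nature\n"
    "buddha,,awakened\n"
    "atthi,verb,Is\n"
    "ti,,thus\n"
)

# Hypothetical rules: one hard test with an exclusion list, one soft test.
RULES = [
    Rule("pos is empty", "pos", r"^\s*$", hard=True, exclude={"ti"}),
    Rule("meaning has capital letter", "meaning", r"[A-Z]", hard=False),
]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hard_hits, soft_hits = run_rules(rows, RULES)

for name, word in soft_hits:
    print(f"SOFT {name}: {word}")
for name, word in hard_hits:
    print(f"HARD {name}: {word}")
# A real runner could exit with a non-zero status when hard_hits is non-empty.
```

Adding a test then means adding one `Rule` entry, and everything runs locally without network access. To run against the real data, the `io.StringIO(SAMPLE)` source would be replaced by an opened file such as `Pāli English Dictionary-full.csv`, with the delimiter and column names adjusted to whatever the files actually use.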
@parthopdas wrote an Anki filter export. This is currently where the data gets reviewed on a daily basis and where manual (i.e. fallible) data integrity tests are run, 162 in total at the moment. Here is that file as a reference.
saved_filters.json.zip