To improve the results of itwêwina search, one useful dimension is whether a potential search target occurs in any or all of the glossaries of the introductory Cree textbooks (Okimâsis and Ratt) or courses (NS152).
This information is now available in the ALTLab repo, in:
crk/generated/crk_glossaries_aggregate_vocab.tsv
This file was created with the script crk-aggregate-core-vocab.sh, as follows (when run at the root directory of the ALTLab repo):
crk/bin/crk-aggregate-core-vocab.sh /Users/arppe/altlab2/crk/ | sort -k1,1nr -k2,2 > crk/generated/crk_glossaries_aggregate_vocab.tsv
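The sort keys in the command above order rows by the first column as a number, highest first, and break ties on the second column. The following is a minimal sketch of that behaviour on made-up rows. It assumes a two-column layout (number of glossaries containing the lemma, then the lemma), which is inferred from the sort keys and is not a confirmed description of the generated file.

```shell
# Made-up sample rows: <number of glossaries> TAB <lemma>.
# The two-column layout is an assumption, not the confirmed file format.
sorted=$(printf '1\tnipiy\n3\tmaskwa\n3\tatim\n2\tminos\n' | sort -k1,1nr -k2,2)

# Rows with the highest count come first; equal counts are ordered by lemma.
printf '%s\n' "$sorted"
```

If the assumption holds, lemmas shared by all glossaries end up at the top of the file, which makes the core vocabulary easy to inspect with head.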
Next, we should consider how to incorporate this file alongside the corpus-based lemma counts (which we have from the A-W and B corpora) and the dictionary-based mean morpheme frequencies (which need to be updated as well).
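As a starting point for that discussion, here is a rough sketch of lining up glossary hits next to corpus lemma counts. Everything in it is hypothetical: the file names, the sample numbers, and the assumption that both inputs are TAB-separated rows of a number followed by a lemma. It only places the two signals side by side and does not propose any weighting; the morpheme frequencies are left out.

```shell
# Hypothetical inputs, both assumed to be TAB-separated <number> <lemma> rows.
tmp=$(mktemp -d)
printf '3\tatim\n2\tminos\n' > "$tmp/glossaries.tsv"   # glossary hits (made up)
printf '120\tatim\n45\tnipiy\n' > "$tmp/corpus.tsv"    # corpus lemma counts (made up)

# First pass stores glossary hits by lemma; second pass prints each corpus row
# as <lemma> <corpus count> <glossary hits>, using 0 when no glossary has it.
merged=$(awk -F '\t' -v OFS='\t' '
  NR == FNR { hits[$2] = $1; next }
  { print $2, $1, (($2 in hits) ? hits[$2] : 0) }
' "$tmp/glossaries.tsv" "$tmp/corpus.tsv")

printf '%s\n' "$merged"
rm -r "$tmp"
```

Note that this sketch drops lemmas that appear only in the glossaries (minos in the sample), so a real merge would need to decide how to treat them.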