Open tomachalek opened 1 year ago
This is related to CNC's preflight subcorpora. It would allow for better random samples than the current solution, where we just take the first N tokens of a corpus.
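To illustrate the difference, here is a minimal sketch of what a randomized selection could look like, as opposed to taking a prefix of the corpus. It is not the actual implementation: the function name `sample_documents`, the `doc_sizes` list (token count per document), and the choice to sample whole documents rather than individual tokens are all assumptions made for the example.

```python
import random


def sample_documents(doc_sizes, n_tokens, seed=None):
    """Pick random documents until roughly n_tokens tokens are covered.

    doc_sizes: hypothetical list with the token count of each document.
    Returns the chosen document indices in corpus order.
    """
    rng = random.Random(seed)  # a seed makes the sample reproducible
    order = list(range(len(doc_sizes)))
    rng.shuffle(order)  # random order instead of corpus order

    picked, total = [], 0
    for idx in order:
        if total >= n_tokens:
            break
        picked.append(idx)
        total += doc_sizes[idx]
    return sorted(picked)
```

Taking the first N tokens only ever covers the start of the corpus, so the result depends on how the corpus happens to be ordered. Shuffling first spreads the sample over the whole corpus, and a fixed seed would keep the preflight subcorpus stable between runs.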