Closed by ajenhl 10 years ago
To support running TACL on the (extracted) Pagel Tibetan corpus documents, add a suitable tokenizer (whitespace-separated tokens) and a means of specifying that it should be used when generating n-grams and producing reports.
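A minimal sketch of what this request might look like, assuming a small registry of named tokenizers from which the n-gram code picks one. The names `TOKENIZERS`, `tokenize`, and `ngrams`, and the `"char"` and `"whitespace"` keys, are hypothetical illustrations and not TACL's actual API; the `"char"` entry is only a stand-in for whatever tokenizer is already in use.

```python
import re
from collections import Counter

# Hypothetical registry: tokenizer name -> (token pattern, joiner used to
# rebuild an n-gram string from its tokens). Not TACL's real API.
TOKENIZERS = {
    # Stand-in for an existing per-character tokenizer.
    "char": (re.compile(r"\S"), ""),
    # Requested tokenizer: each run of non-whitespace is one token.
    "whitespace": (re.compile(r"\S+"), " "),
}


def tokenize(text, tokenizer="whitespace"):
    """Return the list of tokens in text using the named tokenizer."""
    pattern, _ = TOKENIZERS[tokenizer]
    return pattern.findall(text)


def ngrams(text, size, tokenizer="whitespace"):
    """Count the n-grams of the given size in text.

    Tokens are joined with the tokenizer's joiner, so whitespace-separated
    tokens stay distinguishable inside each n-gram string.
    """
    pattern, joiner = TOKENIZERS[tokenizer]
    tokens = pattern.findall(text)
    return Counter(
        joiner.join(tokens[i:i + size])
        for i in range(len(tokens) - size + 1)
    )


if __name__ == "__main__":
    # Placeholder text, not taken from the corpus.
    print(ngrams("a b c a b", 2, tokenizer="whitespace"))
```

Passing the tokenizer by name keeps the choice in one place, so the same setting could be threaded through both n-gram generation and reporting, for example from a command-line option.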