-
(Collected by @LanguageTool-AS)
- [ ] hyphenated compounds with numbers on the left side are false positives: `2-cd-single`, `20-minuten pauze`. Many cases with `3D-` (see full results in nightly tes…
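
One way to picture the tokens described above is a small pre-filter that recognises compounds whose left side is a number. This is only an illustrative sketch in Python, not LanguageTool's actual rule code: the pattern and the helper name `has_numeric_prefix` are assumptions made for this example, and the optional capital letter is there only to cover the `3D-` cases.

```python
import re

# Hypothetical pattern (an assumption, not taken from LanguageTool):
# one or more digits, optionally one capital letter (as in "3D"),
# then a hyphen followed by at least one non-space character.
NUMERIC_PREFIX = re.compile(r"^\d+[A-Z]?-\S")


def has_numeric_prefix(token: str) -> bool:
    """Return True if the token looks like a number-led hyphenated compound."""
    return NUMERIC_PREFIX.match(token) is not None


if __name__ == "__main__":
    for token in ["2-cd-single", "20-minuten", "3D-printer", "cd-single"]:
        print(token, has_numeric_prefix(token))
```

A check like this could be used to skip such tokens before a compound rule runs. Whether that is the right fix depends on how the rule tokenizes its input.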
-
_The original plan was to lift the regexes directly, but I’d forgotten that Standard Ebooks is a GPL3 codebase, whereas this project is MIT. Obviously we can’t copy everything over directly, so the new plan is th…
-
Hi @amir-zeldes,
So @strasss found some sort of structure that's pretty common in colloquial speech, which we weren't sure what to do with.
I think Omer eventually opted for compound, but since I…
-
Sometimes method names contain underscores to separate words (`to_string`) and sometimes not (`tolist`). This regularly catches me off guard. Particularly annoying are the cases where the same prefix (e.g. `t…
-
The [whitelist](https://github.com/sanskrit-lexicon/hwnorm1/tree/master/ejf/hwnorm1c/whitelist) directory contains work aimed at identifying headwords of the various Sanskrit dictionaries that _may_ h…
-
The current understanding at W3C is that Khmer text behaves like Thai when lines are wrapped. See http://w3c.github.io/i18n-drafts/articles/typography/linebreak.en#sec_se_asia for a very high-level su…
r12a updated
4 months ago
-
The current understanding at W3C is that Lao behaves like Thai when lines are wrapped. See http://w3c.github.io/i18n-drafts/articles/typography/linebreak.en#sec_se_asia for a very high-level summary.
…
r12a updated
4 months ago
-
Working on a Danish spelling dictionary, I am attempting to circumvent the default handling of hyphens. Like German, Danish is a language that allows a lot of word compounding. The project has defined…
-
> Netzwerkkennen, Netzwerkgehen
and many others are accepted by our speller. The idea was to extend `ProhibitedCompoundRule` with code like this, which checks whether the "compound" never occurs in ngram data…
-
I am training an MFA model on my own datasets: I have 3000 WAV files per person and data from 55 people. I use the command below to train and align:
mfa train -j 32 -o model_1116.zip datasets/audios tools/al…