-
@kleinay
Shany mentioned that we get lower scores on entity recognition, probably because we group noun compounds.
I think we should:
1. Get her evaluation code somewhere on GitHub (probably first …
-
```
What steps will reproduce the problem?
1. Use a token filter that contains some set of words
2. Use an accepted word list that contains a disjoint set of words
3. Run any semantic space main and s…
-
- https://polyglot.readthedocs.io/en/latest/MorphologicalAnalysis.html
- http://www.dartmouth.edu/~deutsch/Grammatik/Wortbildung/Wortbildung.html
- https://coerll.utexas.edu/gg/gr/mis_02.html
> C…
-
In words like လျော့ "to lessen, diminish" and လျာ "to gauge, size up", the /l/ sound is merged or reduced, such that the words are pronounced /jɔ̰/ and /jà/ respectively, instead of /ljɔ̰/ and /ljà/…
-
- [ ] integrate "family" - list of words with the same prefix root combination, or same first word in compounds.
**Words belong to the same family if all the following columns are identical**
1.…
-
Hi @adbar ,
Recently I started noticing that some inflected words are not correctly lemmatized.
However, when adding German to the list of languages that are processed by the affix decomposition s…
-
I tried using the vocab builder on the German Wikipedia, but some words aren't accurately split into their subwords; for example, "eintausendneunhundertneunzig" is treated as a single subword, a…