-
I'm trying to write a rule that corefers surnames with full instances of those names. The question is: in the entities file, is it possible to indicate the lemma of the entity, or just the form?
`…
-
There are two possibilities for dealing with multiword expressions, which basically depend on the tokenisation scheme. I think that they are both more or less compatible.
The first is to lexicalise …
-
cf. Inuktitut example in data/gdrive
-
Dear all,
I just tried to install the Elasticsearch plugin on my local machine for testing.
I do not have much experience, so please be patient.
The command in point 6 is failing:
bundle exec rake …
-
I know this could be very tricky to implement correctly, but this case is very common in Slavic names: https://www.kmu.gov.ua/en/team
```In [64]:
HumanName('Ivanov Ivan Ivanovich')
Out[64]:
In [6…
-
We want to be able to parse any language supported by LinkGrammar, starting with English, and to make this available both internally in the Aigents framework and via the Aigents Language API.
**Specs:**
1.…
-
I have training data where each token is a word, and I've already extracted a few features such as NER, POS and CHUNK for each token. But I have a problem when I try to extract character n-grams feature…
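For illustration, character n-grams for a single token could be extracted roughly as follows. This is a minimal sketch, not the poster's code: the function names `char_ngrams` and `token_features`, the `<`/`>` boundary padding, and the feature-key format are all assumptions made for the example.

```python
def char_ngrams(token, n_min=2, n_max=3, pad=True):
    """Return the character n-grams of one token for every n in [n_min, n_max].

    With pad=True the token is wrapped in '<' and '>' (an assumed convention)
    so that prefixes and suffixes yield distinct n-grams.
    """
    text = f"<{token}>" if pad else token
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the (padded) token.
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def token_features(token, ner, pos, chunk):
    """Combine already-extracted tags with character n-gram indicators."""
    feats = {"word": token, "ner": ner, "pos": pos, "chunk": chunk}
    for gram in char_ngrams(token):
        # Hypothetical key format: one boolean feature per n-gram.
        feats[f"char_ngram={gram}"] = True
    return feats
```

Because the n-grams are computed per token, they can sit in the same feature dictionary as the NER, POS and CHUNK values without changing the word-level tokenisation.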
-
In the current code, quotation marks are removed from the sentence.
Actually, they are converted to whitespace. This means they serve as a word separator.
For example:
**This"is a test"**
is converted…
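The behaviour described above (quotation marks replaced by whitespace, so that they act as word separators) can be sketched as follows. This is a minimal illustration, not the project's actual code; the function names `quotes_to_whitespace` and `tokenize` are hypothetical, and only the straight double quote is handled.

```python
def quotes_to_whitespace(sentence):
    """Replace every straight double quote with a single space."""
    return sentence.replace('"', " ")


def tokenize(sentence):
    """Split on whitespace after the quotes have been replaced."""
    return quotes_to_whitespace(sentence).split()


# The quote between 'This' and 'is' becomes a separator in this sketch.
tokens = tokenize('This"is a test"')
print(tokens)  # ['This', 'is', 'a', 'test']
```

In this sketch the quote between `This` and `is` splits them into two words even though the input contains no space there.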
-
Sometimes it is desirable to be able to say that a token is in a language different from the main language of the file, and to specify the foreign language. Some corpora have occasional code switching…
-
UD documentation states:
"The indirect object of a verb is any nominal phrase that is a core argument of the verb but is not its subject or direct object."
One example of this is:
```
төрт екіге қ…