-
Hello,
As I approach the end of my effort to add support for French, I notice that coreference resolution performance is significantly hampered by the performance of the spaCy mod…
-
## How to reproduce the behaviour
The Danish transformer model strips accents, so meaningfully different words end up with the same wordpieces.
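To illustrate the kind of collision described, here is a minimal sketch that does not involve spaCy at all. It assumes the accent stripping works the way BERT-style basic tokenizers usually do (NFD decomposition, then dropping combining marks); that is an assumption, not something confirmed for this model. The helper name `strip_accents` and the word pair "hår"/"har" are chosen for illustration only.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after NFD decomposition (assumed BERT-style stripping)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# "hår" (hair) and "har" (has) are different Danish words, but they become
# identical once the ring above "å" is removed as a combining mark.
for word in ["hår", "har"]:
    print(word, "->", strip_accents(word))
```

Under this assumption only letters with a canonical decomposition are affected: "å" collapses to "a", while "ø" and "æ" pass through unchanged.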
```
>>> import spacy
>>> nlp = spacy.load('da_core_news_tr…
-
Remove the opening parentheses in the participle and the genitives, and the exclamation mark in the lemma
```
concustōdĭo=cōncūstōdĭo|audio|cōncūstōdīv|cōncūstōdī(|is, ire, iui, itum|2
Aether2…
-
spaCy 2.3, `en-core-web-lg`
“I can't go”: (orth / lemma)
```
I / -PRON-
ca / can
n't / not # HERE
go / go
```
spaCy 3.0.2, `en-core-web-lg`
```
I / I
ca / ca # HERE
n't / n't # HERE…
-
## The problem
Currently, the lemmatised string of an idiom contains whitespace (except for hyphenated idioms), like so:
```
tokenisation: ['You', 'are', 'down to earth', '.']
lemmatisation: […
-
I was doing some basic parsing tests and found that a very mundane word was lemmatised incorrectly. The Dutch word eten ("to eat") is incorrectly lemmatised as "emmen" when given in its singular form.…
-
See also #26, putting here for more focused discussion. We may need to use different logging mechanisms to show information to users and to store for later debugging.
- What do we need to log and …
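
One possible way to separate user-facing messages from debugging records is sketched below with Python's standard `logging` module: a console handler that shows `INFO` and above, and a file handler that keeps everything from `DEBUG` up. The function name `build_logger`, the formats, and the choice of levels are assumptions for illustration, not a decision made in this discussion.

```python
import logging


def build_logger(name: str, debug_log_path: str) -> logging.Logger:
    """Return a logger that prints INFO+ to the console and writes DEBUG+ to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let handlers decide what to keep
    logger.propagate = False  # avoid duplicate output via the root logger

    # User-facing output: short messages, INFO and above, on stderr.
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    console.setFormatter(logging.Formatter("%(message)s"))

    # Debugging output: everything, with timestamps, stored for later.
    debug_file = logging.FileHandler(debug_log_path, encoding="utf-8")
    debug_file.setLevel(logging.DEBUG)
    debug_file.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )

    logger.addHandler(console)
    logger.addHandler(debug_file)
    return logger
```

With this setup, `logger.info(...)` reaches both the user and the file, while `logger.debug(...)` is stored in the file only.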
-
Lemmas for hyphenated words replace the hyphen with a "v".
For example, "gozdno-lesen" --> "gozdnovlesen" (see document ParlaMint-SI_2020-06-15-SDZ8-Redna-18).
This is a common occurrence in the Slovenian…
-
Above the first Arabic lemmatisation example:
"supplied by an term elements" → "a term element", I presume.
-
Idea: add a configurable text-cleaning node.
This node would also have predefined templates for cleaning tweets, Facebook feeds, app reviews, etc.
For details, refer to https://github.com/lalitpagaria/obsei/issues…
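
A rough sketch of what such a node could look like, assuming a cleaner is just an ordered list of string-to-string steps and a template is a named list of those steps. All names here (`TextCleaner`, `TEMPLATES`, the individual step functions) are hypothetical and not taken from the project.

```python
import re
from typing import Callable, Dict, List

CleanStep = Callable[[str], str]


def remove_urls(text: str) -> str:
    return re.sub(r"https?://\S+", "", text)


def remove_mentions(text: str) -> str:
    return re.sub(r"@\w+", "", text)


def collapse_whitespace(text: str) -> str:
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical predefined templates: each is an ordered list of steps.
TEMPLATES: Dict[str, List[CleanStep]] = {
    "tweet": [remove_urls, remove_mentions, collapse_whitespace],
    "app_review": [remove_urls, collapse_whitespace],
}


class TextCleaner:
    """Applies a configurable sequence of cleaning steps to a text."""

    def __init__(self, steps: List[CleanStep]) -> None:
        self.steps = list(steps)

    @classmethod
    def from_template(cls, name: str) -> "TextCleaner":
        return cls(TEMPLATES[name])

    def clean(self, text: str) -> str:
        for step in self.steps:
            text = step(text)
        return text


cleaner = TextCleaner.from_template("tweet")
print(cleaner.clean("@bob check https://example.com  now!"))
```

Keeping steps as plain functions means a custom configuration is just another list, for example `TextCleaner([remove_urls])`, without needing a new template.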