-
Hi,
According to the [UD annotation guidelines](https://universaldependencies.org/u/dep/flat.html), `flat` seems to be the most apt relation for handling place names like "Rio de Janeiro", "Sao Paulo", …
-
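## In a nutshell
- Because a Transformer's input length is fixed, it can only represent dependencies within that fixed length; by reusing hidden states recurrently from segment to segment, the model can embed relationships that span multiple segments
- Relative positional encodings are used so that multiple segments can be handled
- Achieves SOTA as a language model on several datasets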
…
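
The segment-level recurrence summarized above can be sketched as follows. This is a minimal single-head, single-layer illustration in NumPy, not the paper's implementation: the function names, dimensions, and random weights are assumptions made for the example, and the relative positional encodings are left out entirely. The only point shown is that keys and values are computed over the cached states of the previous segment concatenated with the current segment, while queries come from the current segment alone.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend_with_memory(h, mem, w_q, w_k, w_v):
    """Single-head causal attention over [cached memory; current segment]."""
    seg_len = h.shape[0]
    # Keys and values also see the cached states of the previous segment, if any
    context = h if mem is None else np.concatenate([mem, h], axis=0)
    mem_len = context.shape[0] - seg_len
    q, k, v = h @ w_q, context @ w_k, context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Position i may attend to every memory slot and to segment positions <= i
    future = np.triu(np.ones((seg_len, seg_len), dtype=bool), k=1)
    mask = np.concatenate([np.zeros((seg_len, mem_len), dtype=bool), future], axis=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v

# Hypothetical toy setup: random weights and three segments of length 4
rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
segments = [rng.normal(size=(4, d)) for _ in range(3)]

mem = None
outputs = []
for seg in segments:
    outputs.append(attend_with_memory(seg, mem, w_q, w_k, w_v))
    # Cache this segment's states for the next step; in a trained model
    # they would be treated as constants (no gradient flows into them)
    mem = seg
```

With a single layer, the cached "hidden states" are simply the layer inputs; in a deeper model each layer would keep its own cache.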
-
### Metadata
- Authors: Marek Rei and Helen Yannakoudakis
- Organization: University of Cambridge
- Conference: BEA@EMNLP 2017
- Link: http://www.aclweb.org/anthology/W17-5004
-
## Resources
- Natural Language Toolkit http://www.nltk.org
- Universal Dependency v2 http://universaldependencies.org
- CoNLL format
The downloaded data was split into individual sentences and saved as "../auto/univ_dep_train/*.txt".
train[:1000…
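
The sentence-level split mentioned above is not shown in the post. A minimal sketch, assuming sentences are separated by blank lines as is usual in CoNLL-style files; the function name and the sample rows are hypothetical, and writing one file per sentence is left out:

```python
def split_conll_sentences(text):
    """Split CoNLL-formatted text into sentences at blank lines."""
    sentences, current = [], []
    for line in text.splitlines():
        if line.strip():
            # Non-empty line: one token row of the current sentence
            current.append(line)
        elif current:
            # Blank line: the current sentence is complete
            sentences.append(current)
            current = []
    if current:
        # Last sentence when the text has no trailing blank line
        sentences.append(current)
    return sentences

# Hypothetical two-sentence sample with tab-separated columns
sample = "1\tHello\thello\tINTJ\n2\tworld\tworld\tNOUN\n\n1\tBye\tbye\tINTJ\n"
sentences = split_conll_sentences(sample)
```

Each element of `sentences` is a list of token rows, which could then be written to its own file.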
-
According to the standard, should space-like characters such as [zero width space (U+200B)](https://util.unicode.org/UnicodeJsps/character.jsp?a=200b) be included in the tokens or skipped like the norm…
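
For context on why the question comes up: in Python, U+200B is not treated as whitespace, so a plain whitespace split keeps it inside the token. The sketch below shows that behaviour and one possible preprocessing policy, dropping characters in the Unicode general category Cf before splitting. That policy is an assumption for illustration, not what any standard prescribes, and the helper name is hypothetical.

```python
import unicodedata

ZWSP = "\u200b"
text = f"foo{ZWSP}bar baz"

category = unicodedata.category(ZWSP)  # general category of U+200B
is_space = ZWSP.isspace()              # whether Python treats it as whitespace
naive_tokens = text.split()            # whitespace split keeps U+200B in the token

def strip_format_chars(s):
    """Drop characters in Unicode general category Cf (format), e.g. U+200B."""
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")

clean_tokens = strip_format_chars(text).split()
```

Category Cf also contains the zero width joiner and non-joiner, which carry meaning in some scripts, so a narrower filter may be preferable for those languages.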
-
I tried:
```
from nltk.tokenize import word_tokenize

# Sample input mixing letters, digits, punctuation, and braces
a = "g, a, b, c, 123, g32,12 123121 {1}"
word_tokenize(a)
```
**Output I am getting:**
['g', ',', 'a', ',', 'b', ',', 'c', ',', '123', ',', 'g…
-
I'm trying to get my head around what this project actually does. I'm mostly interested in how you parsed the C&C Tools Marked file, added a grammar, and parsed logical forms.
-
When I try to detect pronouns, they seem to be tagged as "Pers", which I guess stands for "person".
Example list of Pers:
```
Pers ben
Pers Sen
Pers ben
Pers ben
Pers Sen
Pers sen
Pers ben
Pe…
```
-
The question is one of the four traditional sentence types, where we have _elicitation of information_.
In [POS PART](https://universaldependencies.org/u/pos/all.html#al-u-pos/PART),
the examples i…
-
### Describe the bug
I ran into two `flair`-related issues while using the Word Swap by Inflections transformation. The first one required a `flair` update, and the second required a small `textatt…