-
Hi,
In [this thread](https://github.com/taishi-i/nagisa/issues/6) they discuss a comparison between several tokenizers. The conclusion drawn is the following:
> F1-scores
> KyTea > nagisa…
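For context, the F1 in such comparisons is usually word-segmentation F1, where a predicted token counts as correct only if both of its boundaries match the gold segmentation exactly. A minimal sketch of that metric (not the evaluation script used in the linked thread):

```python
def segmentation_f1(gold_tokens, pred_tokens):
    """Word-segmentation F1: a predicted token is a true positive
    only if its (start, end) span matches a gold token exactly."""
    def spans(tokens):
        out, pos = set(), 0
        for tok in tokens:
            out.add((pos, pos + len(tok)))
            pos += len(tok)
        return out

    gold, pred = spans(gold_tokens), spans(pred_tokens)
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Both tokenizations must cover the same underlying string, since the spans are computed from cumulative token lengths.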
-
Hi, I am trying to get sentence embeddings for Japanese via the Docker image. However, the output is **empty** since MeCab is not installed.
**Output in python**:
{'content': JAPANESE_SENTENCE, 'embeddi…
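One way to surface this failure earlier is to check explicitly whether the MeCab Python bindings can be imported before embedding (a sketch; the exact package inside the image may differ, e.g. `mecab-python3` or `natto-py`):

```python
def mecab_available():
    """Return True if the MeCab Python bindings can be imported."""
    try:
        import MeCab  # provided by e.g. the mecab-python3 package
    except ImportError:
        return False
    return True
```

A check like this could raise a clear error message instead of silently returning an empty embedding.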
-
Could you include some notes briefly comparing this to other parsers like MeCab? MeCab includes a comparison to other tokenizers/parsers. I think users would greatly benefit from knowing things like pa…
-
**Describe the bug**
CUDA memory fills up during the first training epoch of a Dutch NER model. The bug has occurred since `flair-0.4.4`; training works as intended in `flair-0.4.3`.
**To Reprodu…
-
Make `WordTokenizer('KyTea')` and `WordTokenizer('kytea')` the same.
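A straightforward way to achieve this is to normalize the requested name before looking it up in the registry (a sketch with an illustrative registry, not the package's actual implementation):

```python
# Illustrative registry mapping normalized names to tokenizer classes.
_TOKENIZERS = {
    "mecab": "MeCabTokenizer",
    "kytea": "KyTeaTokenizer",
}

def resolve_tokenizer(name):
    """Look up a tokenizer case-insensitively, so that
    'KyTea' and 'kytea' resolve to the same backend."""
    key = name.lower()
    if key not in _TOKENIZERS:
        raise ValueError(f"unknown tokenizer: {name!r}")
    return _TOKENIZERS[key]
```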
-
Hi,
I like this sentence tokenizer, but I use this package only as a sentence tokenizer.
Therefore I would prefer not to install the required packages ['natto-py', 'kytea', 'sentencepiece'].
I would like to …
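A common way to make such dependencies optional is to import the heavy backends lazily, so they are only required when the corresponding tokenizer is actually used (a sketch with illustrative names, not the package's real code):

```python
def load_mecab_backend():
    """Import the MeCab backend only when a caller requests it,
    so the package works without natto-py for sentence splitting."""
    try:
        import natto  # natto-py; only needed for the MeCab backend
    except ImportError as exc:
        raise ImportError(
            "This tokenizer backend needs the optional dependency "
            "'natto-py'; install it with: pip install natto-py"
        ) from exc
    return natto
```

With this pattern, sentence tokenization alone never touches the optional packages.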
-
It's hard to build fugashi on Windows because
- `long_description` in setup.py doesn't specify an encoding, so parsing setup.py fails
- missing `include_dirs` for `Extension` prevents finding mecab.h
- Una…
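The first two points could be addressed along these lines (a sketch only, not fugashi's actual setup.py; the include path below is illustrative):

```python
from setuptools import setup, Extension

# Read long_description with an explicit encoding so parsing does not
# depend on the platform default (cp932 on Windows).
with open("README.md", encoding="utf-8") as f:
    long_description = f.read()

ext = Extension(
    "fugashi",
    sources=["fugashi.c"],
    include_dirs=["/usr/local/include"],  # directory containing mecab.h
)

if __name__ == "__main__":
    setup(
        name="fugashi",
        long_description=long_description,
        ext_modules=[ext],
    )
```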
-
I have a few questions about evaluation:
1. Did you use `-tok intl` for all directions (en-fr, fr-en, ja-en, en-ja), or just for en-ja?
2. The instructions specify to run:
kytea -m /pa…
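For reference on question 1, a sacrebleu invocation with international tokenization might look like the following (file names are placeholders, not taken from the original instructions):

```shell
# Score detokenized hypotheses against a reference using sacrebleu's
# built-in international tokenization; file names are placeholders.
sacrebleu reference.txt -i hypothesis.txt --tokenize intl
```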
-
## Problem
Building from source fails.
## How to reproduce it
```
$ autoreconf -i
$ ./configure
$ LANG=C make
```
Environments:
* Debian unstable
* g++ 7.3.0
```
$ g++ -v
Using …
-
Please help add `test do` test blocks to the formulae missing them. If you see something on the list that now _does_ have a test block, just comment below, and one of the maintainers will check the bo…
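As an illustration, a minimal `test do` block might look like the following (a sketch only; the formula name, flags, and expected output would need to match the actual tool):

```ruby
class Example < Formula
  # ... (url, sha256, install, etc. omitted)

  # Minimal smoke test: run the installed binary and check that it
  # produces recognizable output. Exact flags depend on the tool.
  test do
    assert_match "usage", shell_output("#{bin}/example --help")
  end
end
```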