-
Hello all,
The segmentation pipeline seems to throw IndexOutOfBoundsException more often in the latest 3.8.0 version. I got an IndexOutOfBoundsException on some sentences, especially ones with unusual cha…
-
Hi folks,
I like this cool segmenter for its quality and speed, but something seems a bit off.
```python
from syntok.segmenter import analyze
text='''Alexandri Aetoli Testimonia et Fragmenta. Studi …
-
Related to https://github.com/meilisearch/MeiliSearch/issues/1331
-
https://github.com/nipunsadvilkar/pySBD
The benchmark shows that its accuracy significantly outperforms spaCy's:
https://github.com/nipunsadvilkar/pySBD/blob/master/artifacts/pysbd_poster.png
…
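The kind of rule-based boundary disambiguation a tool like pySBD performs can be illustrated with a minimal sketch (a toy abbreviation list and a single regex rule — far simpler than pySBD's actual, language-aware rule set):

```python
import re

# Toy abbreviation list; a real rule-based segmenter uses a much larger,
# language-specific set.
ABBREVIATIONS = {"dr", "mr", "mrs", "etc"}

def naive_sentences(text):
    """Split at '.', '!' or '?' followed by whitespace and an uppercase
    letter, unless the period belongs to a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r'[.!?](?=\s+[A-Z])', text):
        # Word immediately before the punctuation mark.
        prev = re.search(r'(\S+)$', text[start:match.start()])
        word = prev.group(1).lower().rstrip('.') if prev else ''
        if word in ABBREVIATIONS:
            continue  # e.g. "Dr." does not end a sentence
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(naive_sentences("Dr. Smith arrived. He was late."))
# → ['Dr. Smith arrived.', 'He was late.']
```

The hard part, and what the benchmark above measures, is exactly the cases a naive splitter gets wrong: abbreviations, ellipses, decimal numbers, and quoted sentence-final punctuation.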
-
Hello Florian,
Thank you for developing such a powerful NLP library. I gotta say, I have tried all the NLP libraries for sentence tokenization and none of them even comes close to your creation.
…
-
The following code raises an error:
```
from amseg.amharicSegmenter import AmharicSegmenter
sent_punct = []
word_punct = []
segmenter = AmharicSegmenter(sent_punct, word_punct)
words = segmenter.am…
-
I am trying to build tensorflow-text in a conda environment with the Apple Machine Learning version of TensorFlow
on Big Sur on an Intel iMac with an AMD Radeon GPU.
I am using bazelisk
E.g.
$ bazelis…
-
Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposes its capabilities. Drew Farris, Tom Morton and I have code that does:
- …
-
This error also appears in the original log, which was produced during ParlaMint sample creation. @TomazErjavec reported by email (2021-04-20):
> There are still some mistakes in CoNLL-U, ... Als…
-
Actually, the LSTM segmenter may be focused on East Asian languages such as Thai. But I am not sure about Japanese and Chinese. Since UAX #29 doesn't define word segmentation for Chinese and Japanese, ICU uses…
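For languages written without spaces, ICU's word breaking is dictionary-based. That general approach (not ICU's actual data or algorithm, which also weighs word frequencies) can be sketched as a greedy longest-match over a lexicon — here a toy two-entry example:

```python
# Greedy longest-match ("maximal munch") word segmentation over a toy
# lexicon. Real CJK segmenters use large dictionaries plus frequency or
# model scores to resolve ambiguous boundaries.
LEXICON = {"北京", "大学", "北京大学", "生"}

def segment(text, lexicon=LEXICON, max_len=4):
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in lexicon or length == 1:
                words.append(candidate)
                i += length
                break
    return words

print(segment("北京大学生"))
# → ['北京大学', '生']
```

Note that greedy matching commits to the longest entry even when a different split might be intended — which is exactly why production segmenters add frequency information or a learned model on top of the dictionary.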