-
sbd has a bug when a sentence ends with a capitalized word (e.g., a proper noun). Once this is fixed, we should revert our regex.
https://github.com/Tessmore/sbd/issues/44
-
**Current Circumstances:**
I've integrated the Stanford NER functionality into a Java service that handles tens of millions of calls per day. I deployed it in 3 clusters and allocated 10 GB of memory per pod.…
-
I am currently using Ragas to evaluate my RAG application, which is built with LlamaIndex. I've encountered a few issues in the generated results:
1- When generating queries using `TestsetGenera…
-
## How to reproduce the behaviour
[Colab notebook demonstrating problem](https://colab.research.google.com/drive/14FFYKqjRVRbN7aAVmHUYEao9CwahY0We?usp=sharing)
When parsing a sentence that con…
-
A [user reported](https://community.opentargets.org/t/html-markup-included-in-text-mining-results/1307) that there are leftover HTML tags in sentences. Beyond being a cosmetic problem, apparently this…
-
time env LC_ALL=en_US.UTF-8 \
~/factored-segmenter/src/bin/Release/netcoreapp3.1/linux-x64/publish/factored-segmenter train \
--model ~/factored-segmenter/out/enu.deu.generalnn.joint.segment…
-
This just popped up for me for the first time. Running the `recognize` function (with whisper.cpp, built with OpenBLAS, on CPU) on what is, as far as I know, not a pathological audio sample (it's an a…
-
Since `Intl.Segmenter` ([link](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Segmenter)) is available to most users, I think it's better to mention what…
-
Hello,
I have been using pragmatic segmenter by following the steps below:
sudo apt-get install ruby-full
gem install pragmatic_segmenter
After installing pragmatic_segmenter, I got this:
…
-
Hi,
We used your segmenter on a very large corpus (a wiki dump) of 320 MB. It is written in Kazakh, but the segmenter ends up segmenting a very, very long sentence. Because of a very long …