-
We need to develop some text segmentation techniques.
- sentence
- word
- word count
- character count
language
- [ ] English
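
A minimal sketch of the four items listed above for English, using only Python's standard `re` module. The function names and the two regular expressions are illustrative assumptions, not an existing API. The naive sentence rule will mis-split abbreviations such as "Dr." and would need refinement for real data.

```python
import re

# Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")
# Assumed rule: a word is a run of ASCII letters/digits, allowing internal apostrophes.
_WORD = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def split_sentences(text):
    """Split text into sentences at the assumed boundary rule."""
    return [s for s in _SENTENCE_BOUNDARY.split(text.strip()) if s]


def split_words(text):
    """Return the words in text, dropping punctuation."""
    return _WORD.findall(text)


def word_count(text):
    """Number of words according to split_words."""
    return len(split_words(text))


def character_count(text, include_whitespace=True):
    """Number of characters, optionally ignoring whitespace."""
    if include_whitespace:
        return len(text)
    return sum(1 for ch in text if not ch.isspace())
```

Keeping the boundary rules in module-level patterns makes it straightforward to swap in language-specific rules later.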
-
Thank you for making the WebNLG dataset with the alignment available!
We would like to align sentences in the `original text` and the triples in `sortedtripleset`.
**Is there a function/procedur…
-
```
What steps will reproduce the problem?
Segment the following Text in Ratel:
This is the first sentence.This is the second sentence.
Using the following rules:
Before break: \.
After break: ()
W…
```
-
Often, users may want to apply this sentence segmenter within larger pipelines. For example, one use case is segmenting document-level data into sentence-level segments that can be easily translated by senten…
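
A rough sketch of that pipeline use case: documents are split into sentence-level rows that carry a document ID and a sentence index, so that per-sentence outputs can be regrouped afterwards. The `naive_segment` function is a hypothetical stand-in for the real segmenter, and all names here are assumptions made for illustration.

```python
import re


def naive_segment(text):
    """Stand-in segmenter (assumption): split after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def documents_to_sentences(docs, segment=naive_segment):
    """Flatten {doc_id: text} into (doc_id, sentence_index, sentence) rows."""
    rows = []
    for doc_id, text in docs.items():
        for idx, sentence in enumerate(segment(text)):
            rows.append((doc_id, idx, sentence))
    return rows


def sentences_to_documents(rows):
    """Regroup rows into {doc_id: text}, joining sentences with single spaces."""
    grouped = {}
    for doc_id, idx, sentence in sorted(rows, key=lambda r: (r[0], r[1])):
        grouped.setdefault(doc_id, []).append(sentence)
    return {doc_id: " ".join(sentences) for doc_id, sentences in grouped.items()}
```

Passing the segmenter as a parameter lets the same bookkeeping wrap any callable that maps a string to a list of sentences.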
-
Hi, we would like to know more about how decisions for sentence segmentation for Old Church Slavonic were made.
Unfortunately, this link is broken: http://folk.uio.no/daghaug/syntactic_guidelines.pdf
…
-
The version of HKCanCor published on [HuggingFace](https://huggingface.co/datasets/nanyang-technological-university-singapore/hkcancor/tree/main) by NTU is different from the version offered by this l…
-
# Task Name
Dialect Segmentation
## Task Objective
This task aims to identify and differentiate dialects from audio samples from various regions of the United States. Regardless of the countr…
-
**Reported by vgjh2005 on 2014-04-15 10:16**
Hi
Please add support for Chinese word segmentation. Separating words in English is very simple, but Chinese is a very complex language. It is very d…
-
Hi there,
I have trained a transformer that gives very good and precise results for most sentences in my problem (Sanskrit word segmentation). However, in about 10% of the sentences it s…
-
Make it possible to do sentence segmentation and tokenization using MASC, e.g. along the lines of: https://github.com/scalanlp/chalk/wiki/Chalk-command-line-tutorial