-
**Is your feature request related to a problem? Please describe.**
Yes and no, it is somewhat of a niche case. The current implementation of document processing in LangChain4J, specifically in handli…
-
Often we split up sentences after aligning into `aligned_subsegments`.
In the end we could run the pipeline again on these smaller subsegments now, get a new trellis shape and assign a new ratio base…
-
I understand how difficult it is to split sentences that contain abbreviations and that adding abbreviations can have pitfalls, as it is nicely explained in #2154. However, I have stumbled upon some c…
-
Hi Greg,
Thanks a lot for your work!
I want to share a more optimized version of your function `combine_sentences` from the [tutorial about text splitting](https://github.com/FullStackRetrieva…
-
Hi team,
We're using this to split text into sentences, but we found that some characters are replaced after splitting,
e.g. ASCII 32 and 160.
How can I keep the original characters? I need to do som…
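
To illustrate the kind of behavior being asked for: one way to keep every original character is to have the splitter return offsets into the input and slice the original string, instead of rebuilding sentences from normalized tokens. This is a minimal sketch, assuming a naive regex boundary rule rather than this library's actual splitter; `split_with_spans` is a hypothetical helper, not an existing API.

```python
import re

# Hypothetical helper, not part of any library. The boundary rule is a
# deliberately naive assumption: '.', '!' or '?' followed by whitespace.
# In Python 3 str patterns, \s also matches U+00A0 (code point 160).
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_with_spans(text):
    """Return (start, end) offsets of each sentence in the original text."""
    spans = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        spans.append((start, match.start()))
        start = match.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


# The first sentence contains a non-breaking space (code point 160).
text = "Price:\u00a0100 EUR. Next sentence here."
# Slicing the original string keeps characters exactly as they were typed.
sentences = [text[s:e] for s, e in split_with_spans(text)]
```

Because the sentences are slices of the input, code point 160 inside a sentence is never rewritten to 32, and the offsets can be used to map results back to the source text.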
-
Is there any way to adjust the tokenizer parameters that control how the tokenizer(?) divides the sentences? May I ask how sentence-splitting is done when the program is configured to (being~by) feed generator it…
-
Exception: Still failed after 3 attempts: JSON parsing still failed after 3 attempts: ❎ API response error: Missing required key: `split` Please check your network connection or API key or `output/gpt…
-
The official example to calculate faithfulness on a single sample straight from the website doc doesn't work:
```
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import Langchain…
-
Occasionally, Google Cloud TTS returns the following error:
> Some sentences generate audio that is too long. Consider splitting up long sentences with sentence breaking punctuation (e.g. periods),…
-
Is there any way to diarize only, without splitting the text into sentences? I just tried to erase the code for audio splitting, but this didn't work. I am a noob at coding and don't know what I am doing…