-
Hello, I am writing because I have been using your Statistical Chunker to perform semantic chunking on a large (but not too large) text dataset. The quality of the produced chunks is rather good and t…
-
With multi-line strings in text marks, I'm trying to find a way to turn a label coming from the data values like:
"National Museum of Art and Charts"
into a 3 element string array l…
atiro updated
2 months ago
-
### Summary 💡
## Dynamic Data Ingestion using Langchain Text Splitters and LLM
Implement a system where a Large Language Model (LLM) dynamically selects appropriate text splitters from Langchain's…
-
**Proposal**: Assess critical variables affecting semantic search quality on legal corpus before implementing semantic search on CL.
**Key Variables**:
- Embedding model
- Chunking strategy
- I…
-
### Is there an existing issue for the same feature request?
- [X] I have checked the existing issues.
### Is your feature request related to a problem?
```Markdown
The context accuracy of chunk di…
-
When scraping the ranking of movies on Douban, the message "IE 11 is not supported. For an optimal experience, visit our site on another browser" appears. I also encountered the same problem when scra…
-
**Is your feature request related to a problem? Please describe.**
I would like an optional flag for chunking strategy 'hi-res' so that tables are extracted but not images. Image text extracted as …
-
The **JSON** file that I'm `/upsert`'ing contains data like the following:
```
{
"id": "Series",
"text": " Series[f,x,Subscript[x, 0],n] generates a power series expansion for f a…
-
## Background
This library is written, but it would be really cool if a website existed where users could test it out first.
## Acceptance Criteria
### Scenario: Users can perform text chunking on th…
-
Hi, there seems to be a loss of words/sentences when chunking text into paragraphs. Any suggestions on how to solve it?
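
One way to narrow this down is to check whether the splitting step itself drops text. The following is a minimal sketch in plain Python, not the library's own code: `split_paragraphs` and `is_lossless` are hypothetical helper names, and splitting on blank lines is an assumed definition of a paragraph. It keeps each separator attached to the preceding chunk, so joining the chunks should reproduce the input exactly.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines, keeping each separator with the preceding chunk.

    Hypothetical helper for illustration; not part of any library.
    """
    # A capturing group makes re.split return the separators as well,
    # so the result alternates: body, separator, body, separator, ..., body.
    parts = re.split(r"(\n\s*\n)", text)
    chunks = []
    for i in range(0, len(parts), 2):
        body = parts[i]
        sep = parts[i + 1] if i + 1 < len(parts) else ""
        if body or sep:
            chunks.append(body + sep)
    return chunks


def is_lossless(text: str, chunks: list[str]) -> bool:
    """True if concatenating the chunks gives back the original text."""
    return "".join(chunks) == text


sample = "First paragraph.\n\nSecond paragraph,\nstill second.\n\n\nThird."
chunks = split_paragraphs(sample)
print(len(chunks), is_lossless(sample, chunks))
```

Running the same `is_lossless` check against the output of the chunker in question (after accounting for any whitespace it strips on purpose) may show whether the missing words are lost during splitting or at a later step.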