-
Hello,
I am using the wtpsplit library to split text into segments. However, I would like to have the ability to limit the maximum length of each segment to a specific number of characters, such as…
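One possible workaround, sketched here as a post-processing step rather than a wtpsplit feature: take the list of segments the splitter returns and break up any that exceed the limit. The helper name `enforce_max_length` and the `max_chars` parameter are hypothetical and not part of the wtpsplit API. The sketch assumes that cutting at the last space inside the limit is acceptable, with a hard cut as a fallback.

```python
def enforce_max_length(segments, max_chars):
    """Return segments where none is longer than max_chars characters.

    Hypothetical helper, not part of wtpsplit: it only post-processes
    a list of strings produced by any splitter.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    result = []
    for segment in segments:
        remaining = segment.strip()
        while len(remaining) > max_chars:
            # Prefer the last space that keeps the piece within the limit.
            cut = remaining.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                # No usable space: fall back to a hard cut at the limit.
                cut = max_chars
            result.append(remaining[:cut].rstrip())
            remaining = remaining[cut:].lstrip()
        if remaining:
            result.append(remaining)
    return result
```

Calling it on the splitter's output, for example `enforce_max_length(segments, 200)`, leaves short segments untouched and only divides the oversized ones, so the original boundaries are preserved wherever they already fit.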
-
**Proposal**: Assess critical variables affecting semantic search quality on legal corpus before implementing semantic search on CL.
**Key Variables**:
- Embedding model
- Chunking strategy
- I…
-
Tools explored:
1. PyMuPDF
2. DiT: https://huggingface.co/spaces/nielsr/dit-document-layout-analysis
3. Nougat: https://huggingface.co/spaces/ysharma/nougat
4. AXAParser
1, 2, and 4 have document…
-
LlamaIndex's `SemanticSplitterNodeParser` can sometimes produce chunks that are too large for the embedding model. Unfortunately, there is no max length option for the semantic chunking to avoid this i…
-
I do not see this in my manager list, even though I updated everything recently.
I see the other nodes from Salt listed below, but not this one.
[SaltAI-Open-Resources](https://github.com/get-salt-AI/SaltAI)
[SaltAI…
-
**Describe the Feature**
Most service APIs now support enforcing schema outputs through function calling, JSON mode, or structured generation.
It would be really useful to have an option that would …
-
I'm working through the llama_docs_bot files and there is an issue with the InstructorEmbeddings class that relies on BaseEmbedding:
Running the following:
```
# set the batch size to 1 to avoi…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I am working on building an orchestrator for a chat system using LlamaIndex Query …
-
Hey,
I would like to add the functionality of the CitationQueryEngine into my next application. I am planning on porting it myself if there is no current plan on porting it to TS.
Is this planned?…