-
2600_yay [brings attention](https://discord.com/channels/955819881989808128/1059573570122023022/1275988510716985415) to a tool that can be used to pull data from the SEC
They link to [this reposit…
-
Looks like it's doing that thing again where it's not creating projects or downloading asset packages. It's been stuck at that part and never finishes. Tried five or six times now. I've tried deleting the…
-
Now that we have a validation framework for Ex. 21 extraction, try these simple improvements and re-evaluate performance.
```[tasklist]
### Tasks
- [ ] Investigate CorpWatch dataset to see if we …
-
Currently the TARA madis script needs nonempty plaintexts. This means that we can perform TARA project reference extraction for only a fraction of all documents, as we have about 10M documents with …
-
## Background
As a result of the recent removal of the CV option, it came to light that some teams were using other assessment options that did not take advantage of the self-extraction and word count…
-
[👍] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question.
**Your Question**
During testset generation using langchain docs, is on…
-
### Issue: Comparing GROBID and Docling for Parsing Scholarly Publications
#### **My Use Case**
We need to parse and extract all relevant information from thousands of scholarly publications, such…
-
Many of the documents need to be translated prior to automated entity extraction. Current workarounds only work with text searches, not entity or association searches. This also limits the effectiveness of …
-
Hi everyone,
Has anyone tried fine-tuning DonUT for key information extraction on a corpus with documents half-digital and half-handwritten? Specifically, I am wondering if anyone has any evidence …
-
Document the following features. Some of this documentation may need to be in the use cases in langchain extraction.
- [ ] Retrieval Mode
- [ ] Brute Force Extraction
- [ ] Deduplication -- how i…