-
Hello,
I am facing an issue with DSPy when using a custom LM. The LM is Mistral Instruct v0.2 7B, deployed on a local inference server in LM Studio. According to LM Studio this is how the model is cal…
-
It should be possible to filter documents by the presence of a query word in their text content.
For this, it's necessary to implement:
1. Text-layer extraction from PDF and other text-based documents.
…
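
The filtering step described above could look roughly like the sketch below, assuming the text layer has already been extracted into a plain string for each document. The `Document` record and the `filter_by_query_word` helper are hypothetical names used only for illustration, and whole-word, case-insensitive matching is an assumption rather than a stated requirement.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record: `text` holds the already-extracted text layer.
    name: str
    text: str


def filter_by_query_word(documents, query_word):
    """Return the documents whose text contains query_word as a whole word.

    Matching is case-insensitive; this is an assumption, not a requirement
    taken from the issue.
    """
    pattern = re.compile(r"\b" + re.escape(query_word) + r"\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc.text)]


docs = [
    Document("invoice.pdf", "Invoice for consulting services"),
    Document("notes.txt", "Meeting notes: discuss invoices next week"),
    Document("photo.png", ""),  # no text layer, so it can never match
]

matches = filter_by_query_word(docs, "invoice")
print([doc.name for doc in matches])  # prints ['invoice.pdf']
```

Whole-word matching means `invoices` does not match the query `invoice`; a real implementation might prefer stemming or a full-text index instead.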
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
## Project Goal
ChatPDF is an AI-powered tool designed to revolutionize document analysis by combining contextual understanding, multilanguage support, data visualization, and smart recommendation fe…
-
I'm running thepipe locally to extract some page URLs for processing with GPT-4o, and it seems that the image generated for each page only captures the content above the fold (see example below). Is th…
-
### No version of the `Community Guidelines` of service `Indiatimes` is recorded anymore since 2 May 2024 at 9:08:00 UTC
The source documents have been recorded in snapshots, but no version can be …
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…
-
Hi,
At present, I have all documents as DOCX (Microsoft Word files) which I convert to PDF in order to run the GROBID XML conversion. Is there any possibility of using DOCX as input?
In case of …
-
This is not the same as adding a WebLoader, but LangChain has a hook for passing in HTML page content along with some other settings. Under certain conditions this could be very useful for customizing the HTML content or…
-
It seems that trafilatura attempts to parse [schema.org](https://schema.org)-compliant structured data [such as microdata](https://github.com/adbar/trafilatura/blob/6a7b58174add10f686e230d5213203ff85b…