-
### Feature request
Allow the `PyPDFLoader` class in [document_loaders/pdf.py](https://github.com/hwchase17/langchain/blob/master/langchain/document_loaders/pdf.py) to accept a bytes object as well.
### Motiv…
-
> Please provide us with the following information:
> ---------------------------------------------------------------
### This issue is for a: (mark with an `x`)
```
- [X ] bug report -> pleas…
```
gw37 updated
6 months ago
-
### Description of the bug
[mscbookin.pdf](https://github.com/user-attachments/files/15982045/mscbookin.pdf)
![mscbookin pdf_0](https://github.com/pymupdf/PyMuPDF/assets/22074904/e42a8126-8f2b-4…
-
Historically, in the containers version of Dangerzone, the conversion has happened in a second container. This was needed because Dangerzone relied on many Linux-native programs for conversion, such as Graphics…
-
Why am I getting this error when I upload a scanned PDF file? Everything was working correctly until I changed some configs to make the Mistral model work on GPU, but I guess that is not related to the pro…
-
## Goal:
Create an interactive PDF viewer that allows users to view the PDF and its parsed text side by side, interact with selectable bounding boxes on the PDF, and obtain JSON outputs for selected…
-
wants:
- much better caching that can be easily defined
- better long term storage solution than storing html blobs in a sqlite db
- be more extensible to support easily adding other parsers (such …
-
### Discussed in https://github.com/pymupdf/PyMuPDF/discussions/3567
Originally posted by **serhii-brovarnyk** June 11, 2024
Hello!
I have a PDF file with only one page I got via another to…
-
If you use Simplified Chinese, some characters are hidden (fig1), but Traditional Chinese renders normally (fig2).
![image](https://user-images.githubusercontent.com/74350447/218245042-cce31d3e…
-
The branch I'm working off is `main-10k-extraction` and [this notebook](https://github.com/catalyst-cooperative/mozilla-sec-eia/blob/main-10k-extraction/notebooks/06-kl-main-10k-extraction.ipynb) cont…