-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Hi,
I am using SimpleDirectoryReader to read documents, then call get_nodes_from_docum…
-
I get garbled characters when parsing a PDF file. The file I use is [this](http://www.aas.net.cn/fileZDHXB/journal/article/zdhxb/2012/8/PDF/20120812.pdf). Could this be an encoding issue?
## Environment…
-
When encrypting PDF files, there is no verification whether the reserved permission bits are passed correctly. This seems to allow for PDF files which do not completely follow the PDF 1.7 specificatio…
-
I am trying to get a reasonably reliable estimate of the number of visual (non-whitespace, non-metadata) characters in PDF files. For this, I use the `extract_text` function.
I stumbled across a situ…
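
A minimal sketch of the counting approach described above, assuming pypdf's `PdfReader` and `extract_text` are the tools in use. The helper names `count_visible_chars` and `count_visible_chars_in_pdf` are hypothetical, and "visible" is approximated here as "not whitespace according to `str.isspace`", which will not catch zero-width or other invisible code points:

```python
def count_visible_chars(text: str) -> int:
    """Count characters that str.isspace() does not treat as whitespace."""
    return sum(1 for ch in text if not ch.isspace())


def count_visible_chars_in_pdf(path: str) -> int:
    """Sum the non-whitespace characters extracted from every page of a PDF."""
    # Imported lazily so the string helper above works without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can yield an empty result for pages without a text layer,
    # so fall back to "" before counting.
    return sum(
        count_visible_chars(page.extract_text() or "") for page in reader.pages
    )
```

The count depends entirely on what the text extraction returns, so it is an estimate of the extracted text and not of what is rendered on the page.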
-
After installing fontype2 with
$ ./configure
$ make
$ sudo make install
and pypdf, reportlab with
$ sudo easy_install pip
$ sudo pip install pypdf
$ sudo pip install reportlab
I'll get an option …
-
We are currently experiencing regular issues with arXiv documents not being available for the Windows CI due to rate limit issues. At the same time, most of these documents are available under permiss…
-
I was trying to use the exact same example mentioned [here](https://pypdf.readthedocs.io/en/latest/user/extract-text.html#example-1-ignore-header-and-footer), but it gives blank output, even though…
-
**Is your feature request related to a problem? Please describe.**
I'm trying to get chainlit working with open interpreter, using the example provided in your documentation
should be possible, but wit…
-
This issue tracks efforts to parse PDFs and any articles/documents relating to this.
Currently 'marker' is used: https://github.com/VikParuchuri/marker
This requires a separate venv and I have do…
-
How do I make a cell with a limited width so that, when the text reaches that width, it continues on a new line just underneath? Like so:
Text - "I like pypdf as it lets me edit and m…
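
One way to approximate the wrapping behaviour asked about, sketched with Python's standard `textwrap` module. This is only an assumption about what is wanted: it limits lines by character count, not by rendered width in points, and the helper name `wrap_cell_text` is hypothetical and not part of any PDF library:

```python
import textwrap


def wrap_cell_text(text: str, max_chars: int) -> list[str]:
    """Split text into lines of at most max_chars characters, breaking at spaces."""
    # textwrap.wrap breaks on whitespace and never splits a word
    # unless the word itself is longer than max_chars.
    return textwrap.wrap(text, width=max_chars)


if __name__ == "__main__":
    for line in wrap_cell_text("I like pypdf as it lets me edit", 12):
        print(line)
```

Each returned line could then be drawn one below the other inside the cell. With a proportional font, the limit would need to be based on measured string width instead of a character count.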