-
![image](https://github.com/pdfminer/pdfminer.six/assets/49911294/e0200827-d1fb-49a3-b5a4-8cfbeee319f1)
![image](https://github.com/pdfminer/pdfminer.six/assets/49911294/52d1561e-8b60-4603-9431-535…
-
```
Hi again,
I would like to update the Pdfminer library to the newest version, but
simply copying the new files into the directory doesn't work.
It gives this error:
Traceback (most recent call l…
-
**fi problem**
The bug occurs when strings such as "fi", "ffi", "fl", or "ff" are present in the text,
e.g. "efficient", "final", "stiff".
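These letter groups match the Unicode ligature code points (U+FB00 to U+FB04), so one possible workaround is to normalize extracted text after the fact. The following is a minimal sketch, assuming the extractor emits ligature code points instead of plain letters; that cause is not confirmed by the report, and `expand_ligatures` is a hypothetical helper, not part of pdfminer.six.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Expand compatibility characters such as U+FB01 ("fi") into plain letters.

    Assumption: the extracted text contains Unicode ligature code points.
    NFKC also rewrites other compatibility characters (for example
    superscript digits), so apply it only where that is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


# U+FB01 is the "fi" ligature, U+FB03 is "ffi", U+FB00 is "ff".
print(expand_ligatures("\ufb01nd"))
print(expand_ligatures("e\ufb03cient"))
print(expand_ligatures("sti\ufb00"))
```

If the extractor drops or misorders the characters rather than emitting ligature code points, normalization will not help and the fix has to happen in the font decoding step.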
**Example with word "find"**:
```
# Previous span
Some climate …
P0L3 updated
4 months ago
-
**Describe the bug**
When running partition on a two-column PDF, text extraction puts characters in the wrong position.
**To Reproduce**
[two_col.pdf](https://github.com/user-attachments/files/16037…
-
[rank0]: Traceback (most recent call last):
[rank0]:   File "/data/chuzuowei/chem/Vary-main/MU-GOT/pdf2txt.py", line 1, in <module>
[rank0]: from PDF_parsing import pdf2md, vllm_got, process_md
[rank0]…
-
**Bug report**
**- A description of the bug**
Bad HTML markup generated while using `pdf2txt.py test.pdf -t html -o test.html`
**- Steps to reproduce the bug.**
1. Use the following documen…
-
- There is no pdf2text.py file.
- The module openpyxl==2.3.5 gives the error "no module openpyxl.style" when trying to create a .xls file, although it worked when I downgraded to 2.3.4.
-
**Bug report**
The new hOCR renderer does not escape characters that need escaping. [This PDF](https://github.com/pdfminer/pdfminer.six/files/10032060/AandP.pdf) contains the string "A&P", which sh…
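For illustration, the kind of escaping the report asks for can be sketched with the standard library. This is not the pdfminer.six hOCR renderer; `hocr_word` and the `ocrx_word` span layout are hypothetical stand-ins used only to show where `html.escape` would be applied.

```python
from html import escape


def hocr_word(text: str) -> str:
    """Wrap a word in a span, escaping &, <, >, and quotes first.

    Hypothetical helper: the real renderer's markup and function names
    may differ.
    """
    return f"<span class='ocrx_word'>{escape(text)}</span>"


print(hocr_word("A&P"))
```

Escaping at the point where text is written into markup keeps the output well-formed regardless of what the PDF contains.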
-
When trying to use docsplit to extract text from some PDFs, I found that some of the text is mixed up; I understand that docsplit is a thin layer over other tools (in fact, pdftotext is the one to blame for mixi…
runa updated
11 years ago