pdf2txt Search Results - Githubissues

360 results for pdf2txt

Best match

pdfminer/pdfminer.six #936

TypeError: 'int' object is not iterable

![image](https://github.com/pdfminer/pdfminer.six/assets/49911294/e0200827-d1fb-49a3-b5a4-8cfbeee319f1) ![image](https://github.com/pdfminer/pdfminer.six/assets/49911294/52d1561e-8b60-4603-9431-535…

GuoQuanhao updated 4 months ago
1
tabulation2010/pdftoref #2

Updating Pdfminerr

``` Hi again, I would like to update the Pdfminerr library to the newest version, but simply replacing the files into the directory doesn't work. It gives this error: Traceback (most recent call l…

GoogleCodeExporter updated 8 years ago
2
pdfminer/pdfminer.six #942

Detection of ligatures - ﬁ problem

**ﬁ problem** Bug occurs when strings such as: "fi", "ffi", "fl", "ff" are present in text: e.g.: "efficient", "final", "stiff" **Example with word "find"**: ``` # Previous span Some climate …

P0L3 updated 4 months ago
1
Unstructured-IO/unstructured #3325

bug/Two Column PDF partition result in incorrect text.

**Describe the bug** When running partition on a two column pdf, text extraction puts characters in the wrong position **To Reproduce** [two_col.pdf](https://github.com/user-attachments/files/16037…

pfcharles updated 4 months ago
3
liunian-Jay/MU-GOT #3

vllm Model architectures not supported

[rank0]: Traceback (most recent call last): [rank0]: File "/data/chuzuowei/chem/Vary-main/MU-GOT/pdf2txt.py", line 1, in [rank0]: from PDF_parsing import pdf2md, vllm_got, process_md [rank0]…

xiaochake updated 1 week ago
1
pdfminer/pdfminer.six #556

Bad HTML markup generated

**Bug report** **- A description of the bug** Bad HTML markup generated while using `pdf2txt.py test.pdf -t html -o test.html` **- Steps to reproduce the bug.** 1. Use the following documen…

andrei-volkau updated 2 years ago
2
iacopomasi/pdftoref #2

Updating Pdfminerr

``` Hi again, I would like to update the Pdfminerr library to the newest version, but simply replacing the files into the directory doesn't work. It gives this error: Traceback (most recent call l…

GoogleCodeExporter updated 9 years ago
2
kjam/data-wrangling-video #1

Missing/not updated files

-There is no pdf2text.py file. -The module openpyxl==2.3.5 gives an error "no module openpyxl.style" when trying to create a .xls file, although it worked when I downgraded it to 2.3.4.

dgarciac updated 8 years ago
1
pdfminer/pdfminer.six #836

New hOCR renderer fails to escape or clean text properly

**Bug report** The new hOCR renderer does not escape characters that need escaping. [This PDF](https://github.com/pdfminer/pdfminer.six/files/10032060/AandP.pdf) contains the string "A&P", which sh…

slbayer updated 3 months ago
4
documentcloud/docsplit #20

Extracting text from PDFs

When trying to use docsplit to extract text from some PDFs I found out that some text is mixed; I understand that docsplit is a thin layer over other tools (in fact, pdftotext is who to blame for mixi…

runa updated 11 years ago
1
