pdf2text Search Results

192 results for pdf2text


CTeX-org/forum #18

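HTML generated by ctex + lwarp shows text in reversed order

## Checks - [x] Searched the issues (including closed ones) ## Build environment - Operating system - [x] Windows 7/8/10 - TeX distribution - [x] TeX Live 2019 pretest ## Problem description HTML generated by ctex + lwarp shows text in reversed order (see footnote 11 of the minimal example); changing to art…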

views63 updated 12 months ago
8
deanmalmgren/textract #229

Getting " failed with exit code 127" on windows 10.

I have heard from different sources that this error is associated with Windows 10. `textract.exceptions.ShellError: The command `pdftotext ../data/input/example_resumes\Brendan_Herger_Resume.pdf -` fail…

arsalan993 updated 11 months ago
29
deepset-ai/haystack #482

Extract passage headers during processing of PDF documents

### What to do When converting PDF documents to txt with either Apache Tika or pdf2text, we have some functionality to split the documents by passages afterwards. It would be beneficial to have per pa…

Timoeller updated 1 year ago
5
madewild/tac #53

Give PDF examples (for pdf2text in s1_convert - module 2)

The first part of the conversion notebook converts PDF to text. Is there a possibility to provide the students with the PDF corpus as well?

devironl updated 1 year ago
8
pymupdf/PyMuPDF #1943

Ghostscript pdf not recognised

_**Please provide all mandatory information!**_ ## Describe the bug (mandatory) I have a flock of PDFs that have the following attributes: Producer: GPL Ghostscript 9.15 PDF Version: 1.4 …

grego1981 updated 1 year ago
5
ruby-gnome/ruby-gnome #1488

Update poppler bindings

- It would be nice to have bindings for pdftohtml, some outdated bindings for Python can be found at https://github.com/mgedmin/pdf2html . Packages such as Nokogiri would enable elegant processing - …

bkmgit updated 1 year ago
3
dnGrep/dnGrep #704

Show the output of PdfToText in the preview pane instead of …

Related: #659 Just a proposal to make the preview pane for PDFs more useful. I know that you can't show the real PDF because there is no mapping between the text you search through and the posit…

Dromantor updated 2 years ago
3
chrismattmann/tika-python #376

Content returns gibberish for some PDFs

Tika works fine for most PDFs – however I have some files, that Tika simply returns gibberish for in the content. Not sure as to why it is, since the `parser` interface doesn't seem to allow for m…

alfonsrv updated 1 year ago
3
cpierce/pdf2text #1

Cannot decode my PDF files though the sample one is fine

I cannot get any decoded text out of either of these: [148154.pdf](https://github.com/cpierce/pdf2text/files/7877397/148154.pdf) [temp.pdf](https://github.com/cpierce/pdf2text/files/7877398/temp.pdf…

MandyShaw updated 2 years ago
1
climatepolicyradar/navigator #347

Pdf2text cli > Create document object from json

The cli developed in #53 will output json documents containing the text for each document. Add the ability to create a Document object from this json. This will be useful when using the corpus of ext…

chrisaballard updated 2 years ago
1
