-
- PHP Version: 8.1.2
- PDFParser Version: 2.7
### Description:
### PDF input
https://cdn.yinyuezhushou.com/static/7d38770d31c3cd66219eaa1b7959e2dd.pdf
### Expected output & actu…
-
```
this is the error.log
Traceback (most recent call last):
File "./peepdf.py", line 541, in <module>
console.cmdloop()
File "/usr/lib/python2.7/cmd.py", line 142, in cmdloop
stop = self.onecmd(…
```
-
The Sample Code in the Readme file indicates that the PDFParser constructor takes a RandomAccessFile and a string.
However, no constructor with this signature is present.
-
This is maybe more of a PHP question than a pdfparser one, but here it goes anyway:
Important part of my code:
```
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile($file_path);
…
```
-
There are individual PDF files that cannot be extracted. For these, Tika, the PDF parser used by Solr, reports an error.
When, using external tools, these files are val…
-
In `ocd_backend.utils.file_parser` we use the python version of Apache Tika as a fallback when the mimetype is not 'application/pdf'. We use `pdfparser.poppler` as first choice since it has a native b…
-
I'm currently doing this:
``` python
fp = open(pdf_fpath, 'rb')
parser = PDFParser(fp)
doc = PDFDocument(parser)
```
With the pdfs I'm working with, PDFDocument is really too slow. Is there an…
-
Long story short, one of the PDF files I'm trying to parse throws an error - it basically cannot be parsed. Thing is that I cannot catch it with the standard try-catch block, because the `parseFile()`…
-
- PHP Version: 8.2
- PDFParser Version: 2.9
### Description:
I want to parse some CVs, and I sometimes get wrong characters.
I would like to try to parse the PDF correctly, and if not possibl…
-
- PHP Version: 7.4
- PDFParser Version: 2.9.0
### Description:
### PDF input
Cannot provide the PDF since it's confidential
### Expected output & actual output
Need to extract table fro…