Closed: markustobler closed this issue 1 year ago
This file does not cause problems in my test environment. I tested it with pdftotext versions 20.09.0 and 0.86.1.
Which version of pdftotext are you using?
I am using "tpwd/ke_search": "^4.5". The error occurs only in the automatic cron job. If I start the scheduler task manually, there is no error. I have no idea how to debug this. Somebody in the TYPO3 Slack channel told me that this error could be a problem with pdftotext, but it could also be something else.
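Since the error shows up only under cron, one thing worth ruling out is that the cron job runs with a different environment (another PATH, another user) and therefore picks up a different pdftotext binary. Nothing in this thread confirms that this is the cause. The following is a minimal sketch in Python; the helper name `describe_tool` is hypothetical, and it assumes the tool prints a version banner when called with `-v`, as the Poppler utilities commonly do.

```python
import os
import shutil
import subprocess


def describe_tool(name="pdftotext", version_flag="-v"):
    """Report which binary `name` resolves to and what it prints as its version.

    Hypothetical helper for comparing the cron environment with a manual run.
    """
    info = {"name": name, "path": None, "version": None,
            "PATH": os.environ.get("PATH", "")}
    path = shutil.which(name)  # resolves the name against the current PATH
    if path is None:
        return info  # the tool is not visible in this environment
    result = subprocess.run([path, version_flag],
                            capture_output=True, text=True, check=False)
    # Some tools print the version banner on stderr, so look at both streams.
    output = (result.stdout + result.stderr).strip()
    info["path"] = path
    info["version"] = output.splitlines()[0] if output else ""
    return info


# Example: call this once from a shell and once from the cron job,
# then compare the two results.
# print(describe_tool("pdftotext"))
# print(describe_tool("pdfinfo"))
```

If the path or version differs between the manual run and the cron run, that would be a concrete lead; if both are identical, the environment is probably not the explanation.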
Errors from pdftotext and pdfinfo will now be logged to the ke_search error log. That should at least make it easier to find the problematic files. The patch is in the current master and will be in version 4.6.0.
I got the following errors in my cronjob:
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Invalid object stream
Syntax Error: Invalid object stream
Syntax Error: Invalid object stream
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
Syntax Error: Marked object is wrong type (boolean)
These errors come from pdftotext, which apparently has a problem reading the PDF files. At the very least, the error handling could be improved so that the ke_search log shows which files are problematic.
An example PDF file which causes an error is attached to the issue.
nvs-seminarprogramm-2023-1.pdf
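Until the log names the affected files, they can be identified outside of ke_search by running pdftotext on each PDF and keeping whatever it prints on stderr. The following is a rough sketch in Python, not how ke_search itself does it; the function name `find_problem_pdfs` and the directory in the usage comment are made up for illustration. It relies only on the standard `pdftotext FILE -` form, which writes the extracted text to stdout.

```python
import subprocess
from pathlib import Path


def find_problem_pdfs(paths, command=("pdftotext",)):
    """Run the extractor on each file and collect its complaints.

    Returns {path: [stderr lines]} for every file that produced stderr
    output or a non-zero exit code. `command` can be overridden, e.g. to
    pass the full path of the binary that the cron job uses.
    Raises FileNotFoundError if the command itself cannot be found.
    """
    problems = {}
    for path in paths:
        # "-" as the output argument sends the extracted text to stdout.
        result = subprocess.run(
            [*command, str(path), "-"],
            capture_output=True, text=True, errors="replace", check=False,
        )
        messages = [line.strip() for line in result.stderr.splitlines()
                    if line.strip()]
        if messages or result.returncode != 0:
            problems[str(path)] = messages or [f"exit code {result.returncode}"]
    return problems


# Example usage (the directory name is an assumption):
# pdfs = sorted(Path("fileadmin").rglob("*.pdf"))
# for pdf, messages in find_problem_pdfs(pdfs).items():
#     print(pdf, len(messages), "message(s), first:", messages[0])
```

Files that produce only warnings may still yield usable text, so a hit in this list marks a file worth inspecting rather than one that necessarily fails indexing.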