Open Laykou opened 2 years ago
@boazsegev For some reason this fix https://github.com/boazsegev/combine_pdf/commit/b966e703fd897ff50832d3823e74791099b82ca3 broke it
Hi @Laykou
Thank you for opening this issue.
Please note my comments: here for issue #185 and here for issue #191.
I usually prefer lax parsers that allow formatting errors to be ignored when possible. However, issue #185 showed that a specific type of error cannot be safely ignored, which required that the parser become more strict.
I strongly suspect, from the description of the issue, that the specific PDF file is malformed.
Testing the PDF @ https://www.datalogics.com/products/pdf-tools/pdf-checker/ fails ... the testing suite doesn't even recognize the file as a PDF, not to mention listing the errors.
I have been authoring and maintaining this gem by myself for over 7 years and have been looking for a new maintainer for over 2 years. The community is enjoying my work, but not really contributing, so... 🤷🏼‍♂️ ... please forgive me for not investing more time and effort to solve this issue.
Kindly, Bo.
Hi @boazsegev, it appears that the stream's Length property can be incorrect in more cases than just when the 'endstream' keyword appears within the content. Either way, preferring one method over the other for advancing the scanner position leads to issues. Many of these issues are acceptable to end users, provided the result looks fine. For example, swallowing the "index is out of range" error would fix the parsing of the attached file, so it could then be combined and the work could get done. Could we swallow the "index is out of range" error and display a warning in this case? Would such a PR make sense?
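To make the proposal concrete, here is a rough sketch of the idea, not the gem's actual parser code. `read_stream_body` is a hypothetical helper: it trusts the declared Length only when that length stays inside the buffer and is followed by `endstream`, and otherwise prints a warning and falls back to scanning for the keyword. Note that the fallback can still cut a stream short when the content itself contains `endstream`.

```ruby
require 'strscan'

# Hypothetical helper for illustration only; it is not part of combine_pdf.
# Returns the stream body that starts at the scanner's current position and
# leaves the scanner just after the `endstream` keyword.
def read_stream_body(scanner, declared_length)
  data = scanner.string
  start = scanner.pos
  expected_end = start + declared_length

  # Trust /Length only if it stays inside the buffer and is followed
  # (after optional whitespace) by the `endstream` keyword.
  if declared_length >= 0 && expected_end <= data.bytesize &&
     data.byteslice(expected_end, 20).to_s.match?(/\A\s*endstream/)
    scanner.pos = expected_end
    scanner.skip(/\s*endstream/)
    return data.byteslice(start, declared_length)
  end

  # Otherwise warn instead of raising, and fall back to a keyword search.
  warn "PDF stream Length (#{declared_length}) looks wrong; scanning for 'endstream'"
  body = scanner.scan_until(/endstream/)
  return nil if body.nil? # keyword missing: give up on this stream

  body.sub(/\r?\n?endstream\z/, '')
end
```

With something like this, a Length that points past the end of the file produces a warning rather than a `RangeError`, and the caller can decide what to do with a `nil` result.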
Do you think this could be fixed in a newer version?
Getting "index out of range (RangeError)" on a user-uploaded PDF in version 1.0.26 as well.
There are some pull requests that could possibly solve this problem, but so far they have not been merged, and the problem still occurs almost a year after the PRs were submitted.
https://github.com/boazsegev/combine_pdf/pull/209 https://github.com/boazsegev/combine_pdf/pull/215
Can you take a look at them?
Hello, getting 'RangeError: index out of range' on version 1.0.23 as well.
When trying to parse this PDF _rose_production_splitpages.pdf (file was removed), we're getting error:
How we call it:
This happens on versions 1.0.21 and 1.0.22, but not on 1.0.20. We now want to move to Ruby 3.1 and need the matrix fix that is in 1.0.22, but we cannot upgrade because of this failing PDF example.