Closed by NickHarnau 1 year ago
@NickHarnau thanks for reporting this issue. It turned out to be related to the table recognition algorithm, which, among other things, analyses all vertical and horizontal lines on the page. This particular document contains a lot of line art, which degraded performance. We have fixed the issue by ignoring all lines located inside a Figure structure element.
The latest dev version of veraPDF (1.23.149), also available at https://verapdf.duallab.com, already includes this fix.
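For readers curious about the idea behind the fix, here is a rough sketch of filtering out line art before table detection. It is not the actual veraPDF code: the `Box` type, the `lines_for_table_detection` function and the use of plain bounding boxes are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical type, not a veraPDF class)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Box") -> bool:
        # True only when `other` lies entirely within this box.
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1)


def lines_for_table_detection(lines, figure_boxes):
    """Keep only the line segments that are not fully inside any Figure box.

    Assumption: each line segment and each Figure structure element is
    represented by its bounding box on the page.
    """
    return [ln for ln in lines
            if not any(fig.contains(ln) for fig in figure_boxes)]


figure = Box(0, 0, 100, 100)
lines = [
    Box(10, 10, 90, 10),    # line art inside the figure: dropped
    Box(0, 200, 300, 200),  # ruling line elsewhere on the page: kept
    Box(50, 50, 150, 50),   # crosses the figure border: kept
]
kept = lines_for_table_detection(lines, [figure])
```

With heavy line art, most segments fall inside a Figure and are dropped early, so the table detector has far fewer lines to analyse.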
Thanks for taking care of it! :) How can I get the latest dev version? The docker-compose file pulls its images from e.g. ghcr.io/verapdf/IMAGE. I tried running it locally in Docker, but it still hits the issue. I assume the images will be updated in the next few days? :)
@NickHarnau Hey, you can download the latest images from here: worker, file-storage and job-service
Hey :) The following PDF took nearly 10 hours to produce a report in Docker: https://www.heilbronn.de/fileadmin/user_upload/DV_Dienstleistungen-Amt62_2021.pdf
Is this just a very complicated PDF, or are there other issues? When I upload the file on duallab it also takes a very long time, and on pdfchecker.nl I even get an error uploading the file.