sumatrapdfreader / sumatrapdf

SumatraPDF reader
http://www.sumatrapdfreader.org
GNU General Public License v3.0

Very slow annotation detection #4562

Open Mitia99 opened 1 month ago

Mitia99 commented 1 month ago

SumatraPDF version: [image]

Describe the bug: Very slow annotation detection

To Reproduce: Steps to reproduce the behavior:

  1. Open attached file
  2. Right click anywhere
  3. Click on Edit annotations

Expected behavior: The Annotations window shouldn't take long to load.

File that reproduces the problem: File sample (143 MB)

Additional context: Processor: Intel Core i5-1130G7; RAM: 16.0 GB at 3733 MHz; 512 GB NVMe Gen 3 SSD; Windows 11 23H2, latest stable.

I understand that the file is large, but 3 min 23 sec is still a long time.

Thanks for investigating,

GitHubRulesOK commented 1 month ago

/Size 61954: that is a lot of objects, and at first glance the file does not look like an original version. The producer is <pdf:Producer>Adobe Acrobat Pro (64-bit) 24 Paper Capture Plug-in with ClearScan</pdf:Producer>. However, there are many dates showing modifications, and the final one seems to be PXC-Ver:10.4.1.389-Date:20240927100123. I therefore suspect the file has been altered by later editing.

Over 1100 fonts ! when only a dozen or so may be expected from a publication however scanning would explain that

47250+ hidden structure elements (good for audio, but no use to the sighted)

However, if I try to annotate with Edge (powered by Acrobat), it crashes, so something seems to be wrong with that file in that respect. MuPDF has trouble with this file because it has to slowly parse so much data, and that affects SumatraPDF.

Mitia99 commented 1 month ago

Over 1100 fonts ! when only a dozen or so may be expected from a publication however scanning would explain that

I'm not sure that it's related to fonts. I have other PDFs with far more fonts, like this one (only two annotations, though): Sample V2.pdf

so something is seemingly wrong with that file in that respect.

Well, I can reproduce the issue with other PDF files.

GitHubRulesOK commented 1 month ago

I did not mean that fonts were the cause in this case. It is a scanned file, so it is not unusual to see one or two fonts per page. My point was only that they all add to the number of objects that have to be processed.

The core of my comment was that Acrobat in Edge crashes when touching annotations.

kjk commented 1 month ago

The file is no longer available.

Mitia99 commented 1 month ago

@kjk

I updated the link, but unfortunately I had already removed all annotations from the file.

That said, the Annotations window is still very slow to load, so I think you'll still be able to investigate.

GitHubRulesOK commented 1 month ago

@kjk AFAIK there were no annotations (showing? deleted?) in the first file; opening the annotations dialog was the problem. I cannot be 100% certain this is the file that caused Edge such problems, but here is the copy from a week ago.

There are no incremental additions, so this should be the basic unedited file. HOWEVER, had it already been edited at that time?

%PXC-Ver:10.4.1.389-Date:20240927100123-SHA:088793F9:948A58FED5430ED6725F51F636CB868090ED34A68E72F1200707B38F01DD6C9E
/Info 1 0 R /Length 24110 /Root 2 0 R /Size 61954 /Type /XRef

https://filetransfer.io/data-package/L710TOd1#link
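
For anyone who wants to check these numbers on their own files, here is a minimal sketch (not how MuPDF or SumatraPDF read the file) that pulls apart the dictionary entries quoted above. It assumes a flat run of "/Key value" pairs with no nested dictionaries or arrays; the names XREF_DICT, ENTRY and parse_flat_dict are made up for illustration. In PDF, /Size is one greater than the highest object number used in the file, which is why 61954 points to a very large object count.

```python
import re

# Dictionary fragment as quoted in the comment above (from the cross-reference stream).
XREF_DICT = "/Info 1 0 R /Length 24110 /Root 2 0 R /Size 61954 /Type /XRef"

# A key is a PDF name (/Key). A value here is one of:
#   an indirect reference ("n g R"), a plain integer, or another name.
ENTRY = re.compile(r"/(\w+)\s+(\d+\s+\d+\s+R|\d+|/\w+)")


def parse_flat_dict(text):
    """Parse a flat run of '/Key value' pairs.

    Hypothetical helper: nested dictionaries, arrays and strings are not handled.
    """
    return {key: value for key, value in ENTRY.findall(text)}


entries = parse_flat_dict(XREF_DICT)
size = int(entries["Size"])

print(entries)
print(f"{size} object numbers declared (0..{size - 1})")
```

This only reads text that has already been extracted from the file. It says nothing about why the annotations dialog is slow, only how many objects a reader may have to consider.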