mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Huge PDF renders slow #2541

Closed jviereck closed 4 years ago

jviereck commented 11 years ago

The PDF at

http://www.vrr.de/imperia/md/content/fahrten/stadtlinienplaene/wuppertal_2012.pdf

is 5.4 MB in size and contains only one page. It's a very detailed map with a lot of shapes. No wonder this is rendered slowly using PDF.JS but it might be worth sharing for some performance benchmarking.

xavier114fch commented 11 years ago

I do think transport maps are quite useful in performance benchmarking as they are usually quite complex. I have come across these maps. Their sizes are small (774 KB and 409 KB respectively), but PDF.JS renders them very slowly. On slow machines they just show the rivers while the spinner keeps spinning.

http://carto.metro.free.fr/documents/CartoMetroParis.v3.6.pdf
http://carto.metro.free.fr/documents/CartoMetroParis.v3.6.simple.pdf

gigaherz commented 11 years ago

Those (@xavier114fch's) PDFs don't load in here. I get:

[11:49:59.679] PDF 7c61e6647cf7a4f89f384648efb5de45 [1.5 inkscape, cairo and other tools / cairo 1.10.2 (http://cairographics.org)] (PDF.js: 0.7.55)

[11:49:59.774] Warning: TODO: graphic state operator SMask

[11:50:01.324] Warning: TODO: TilingType: 1

[11:50:01.374] TypeError: commonObjs is null @ resource://pdf.js/build/pdf.js:2414

And my laptop, while not very fast, has an i7 cpu.

Sailfish commented 11 years ago

REF: http://www.floydsseafood.com/images/floyds_menu_2012.pdf

Here's another test case where it loads fairly quickly but takes several minutes to render, and then much of the completed rendering is unreadable. It loads/renders quite fast in Chrome and displays very well.

p01 commented 10 years ago

I suspect #4817 already helped a bit, but I'll look into these PDFs.

xavier114fch commented 10 years ago

The performance did improve for the initial rendering in automatic zoom. If I change to 100%, it appears blurred and I always get this prompt:

A script on this page may be busy, or it may have stopped responding. You can stop the script now, open the script in the debugger, or let the script continue.

Script: resource://pdf.js/build/pdf.js:5916

p01 commented 10 years ago

The blurred version you see is because, upon changing the zoom level, the viewer in PDF.js scales the page if it was already rendered, before doing a proper render at the new zoom level. The "busy" dialog you see is because your browser (Firefox?) is hard at work rendering the page at the new zoom level.

Anyways, I'll check what's going on and what takes so long here.
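A minimal sketch of the interim scaling described above, for readers who want to see why the page looks blurred until the proper render finishes. It is not PDF.js code: the function names `interimCssScale` and `describeInterimView` and their parameters are hypothetical, and the only assumption is that the already-rendered bitmap is stretched by the ratio of the new zoom to the old one.

```javascript
// Hypothetical helper (not part of PDF.js): the factor by which an
// already-rendered bitmap must be stretched to fill the page box at
// the newly requested zoom level.
function interimCssScale(renderedScale, requestedScale) {
  if (!(renderedScale > 0) || !(requestedScale > 0)) {
    throw new RangeError("scales must be positive numbers");
  }
  return requestedScale / renderedScale;
}

// Hypothetical helper: describes what is shown while the proper render
// at `requestedScale` is still pending. `pageWidth` and `pageHeight`
// are the page size at scale 1.
function describeInterimView(pageWidth, pageHeight, renderedScale, requestedScale) {
  const cssScale = interimCssScale(renderedScale, requestedScale);
  return {
    cssScale,
    displayWidth: pageWidth * requestedScale,
    displayHeight: pageHeight * requestedScale,
    // A factor above 1 means each rendered pixel covers more than one
    // screen pixel, which is what makes the interim view look blurred.
    upscaled: cssScale > 1,
  };
}
```

For example, a page rendered at 50% and then switched to 100% is stretched by a factor of 2 until the new render replaces it.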

fkaelberer commented 10 years ago

@p01 I had a patch #4829 ready (which I just didn't commit yet) that avoids some of the isCmd() calls in parser.getObj(). This saves maybe 20% of the time spent in parser.getObj(), or ~2% overall on this document. I hope this doesn't interfere with what you did so far.

p01 commented 10 years ago

Nope. Thanks for the heads up.

fkaelberer commented 10 years ago

I noticed that the Firefox UI reacts really slowly if many div elements are shown, which is particularly the case in this document (the text layer has single-letter divs for curved street names). When hovering over bookmark icons, selecting text, etc., the Firefox UI reacts at an estimated 1.5 frames per second on a fast PC.

Does anybody know if the FF team is aware of that? I couldn't find a corresponding entry at bugzilla. Google Chrome's UI also slows down noticeably, but much less.

nnethercote commented 10 years ago

On my fast desktop Linux machine, the Wuppertal PDF loads in about 25 seconds in both evince and pdf.js. In evince everything is displayed all at once at the end. In pdf.js the map elements gradually appear and the whole map looks finished after about 12 seconds, and then Firefox freezes for about another 12 seconds while apparently doing nothing.

Snuffleupagus commented 10 years ago

and then Firefox freezes for about another 12 seconds while apparently doing nothing.

That sounds like it could be caused by the generation of the textLayer. If you disable it, does that prevent the browser from freezing? You can try using this link: http://www.vrr.de/imperia/md/content/fahrten/stadtlinienplaene/wuppertal_2012.pdf#textLayer=off
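As a rough illustration of how a `#textLayer=off` fragment like the one in the link above can be read, here is a minimal sketch. It is not the viewer's actual implementation: `parseHashParams` and `isTextLayerDisabled` are hypothetical names, and treating keys case-insensitively is an assumption made for this example.

```javascript
// Hypothetical helper (not PDF.js code): splits the fragment of a URL
// such as "file.pdf#page=2&textLayer=off" into key/value pairs.
function parseHashParams(url) {
  const params = new Map();
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) {
    return params; // no fragment, no parameters
  }
  for (const part of url.slice(hashIndex + 1).split("&")) {
    if (!part) {
      continue; // skip empty segments such as a trailing "&"
    }
    const eq = part.indexOf("=");
    const rawKey = eq === -1 ? part : part.slice(0, eq);
    const rawValue = eq === -1 ? "" : part.slice(eq + 1);
    // Assumption for this sketch: keys are compared case-insensitively.
    params.set(decodeURIComponent(rawKey).toLowerCase(), decodeURIComponent(rawValue));
  }
  return params;
}

// Hypothetical helper: true when the fragment asks for the text layer
// to be switched off.
function isTextLayerDisabled(url) {
  return parseHashParams(url).get("textlayer") === "off";
}
```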

nnethercote commented 10 years ago

That sounds like it could be caused by the generation of the textLayer. If you disable it, does that prevent the browser from freezing?

It does prevent the freezing.

RasmusDK commented 8 years ago

Is there any news regarding this issue? I can see that it still takes a long time when I try with documents that contain complex drawings. Compared with Adobe Reader, it takes almost 3 times longer in pdf.js to open the same file. I would also welcome advice on settings etc. that can help make it faster.

RasmusDK commented 8 years ago

Has this issue just been dropped? Or is there any news? In my latest test I see Chrome's native PDF viewer opening a complex document 3-4 times faster than pdf.js. It doesn't draw anything until the end; I don't know if that makes a difference to the performance (Adobe draws the same way as pdf.js, just faster).

As mentioned before, any advice on how to make this type of document load faster is also very welcome.

timvandermeij commented 8 years ago

Performance improvements landed in Firefox a while ago for the canvas drawing operations to make these kinds of PDFs render faster, but I think there is also work to do about this on the PDF.js side. We need to find out where the delay is coming from exactly to see if we can optimize that part of the codebase.

Snuffleupagus commented 8 years ago

We need to find out where the delay is coming from exactly to see if we can optimize that part of the codebase.

With maps like these, there's an absolutely huge amount of rendering operations needed to draw them, which is probably the main reason for the slowness (given that canvas rendering is sequential). Perhaps canvas tiling (see issue #6419) could help somewhat here, provided that it enabled us to draw different sections of a page in parallel.
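To make the tiling idea a little more concrete, here is a minimal sketch of how a page could be divided into tiles that might then be drawn independently. It does not reflect the design discussed in issue #6419: the function name `computeTiles`, the fixed square tile size, and the row-major order are all assumptions made for illustration.

```javascript
// Hypothetical helper (not PDF.js code): splits a page of
// `width` x `height` pixels into tiles of at most `tileSize` pixels
// per side, in row-major order. Edge tiles are clipped to the page.
function computeTiles(width, height, tileSize) {
  if (!Number.isInteger(tileSize) || tileSize <= 0) {
    throw new RangeError("tileSize must be a positive integer");
  }
  const tiles = [];
  for (let y = 0; y < height; y += tileSize) {
    for (let x = 0; x < width; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, width - x),
        height: Math.min(tileSize, height - y),
      });
    }
  }
  return tiles;
}
```

Each tile would still have to run through every drawing operation that overlaps it, so tiling only pays off if the tiles can actually be drawn in parallel or if operations outside a tile can be skipped cheaply.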

brendandahl commented 6 years ago

The PDF is no longer there. Anyone have it and want to upload?

Snuffleupagus commented 6 years ago

The PDF is no longer there. Anyone have it and want to upload?

wuppertal_2012.pdf

RasmusDK commented 6 years ago

The PDF is no longer there. Anyone have it and want to upload?

These two are still there. I am not sure how they behave in PDF.js currently, though.
http://carto.metro.free.fr/documents/CartoMetroParis.v3.6.pdf
http://carto.metro.free.fr/documents/CartoMetroParis.v3.6.simple.pdf

SamyCookie commented 5 years ago

The log

[11:50:01.324] Warning: TODO: TilingType: 1

has disappeared, so maybe this ticket is no longer related to a tiling type bug?

caffeinum commented 5 years ago

I have a similar issue: our PDFs are lecture slides for kids with vector animals. They are exported from Apple Keynote.

When you switch slides, new pages render sequentially in layers: first the background, then the foreground. That makes it "jump" on slow computers, where characters appear after the background is drawn.

Currently we overcome this by rasterizing: we export the slides into a bunch of PNGs, which are then grouped into one big PDF. This works much better, but vector images look nicer.

THausherr commented 5 years ago

This one (copied from https://www.vrr.de/fileadmin/user_upload/pdf/service/downloads/tarifinformationen/verbundraum_2019.pdf ) is surprisingly slow in PDF.js:

verbundraum_2019.pdf

timvandermeij commented 5 years ago

@THausherr That PDF file renders in about 2 seconds for me, which I don't really find slow. Do you have details about your configuration (OS, browser, PDF.js version)?

Snuffleupagus commented 5 years ago

One PDF file per issue please, otherwise tracking becomes a mess :-)

THausherr commented 5 years ago

FF 68.0.1, Windows 10, PDF.js latest version ( https://mozilla.github.io/pdf.js/web/viewer.html ). I chose this file because it is also from Wuppertal transportation. It took about 30 seconds.

timvandermeij commented 4 years ago

Closing since the rendering of these files is now acceptable and has improved a lot since this issue was opened thanks to multiple optimizations. For example the Wuppertal one renders in around three seconds now. If there are specific problems with a file, please open a new issue.