ariya / phantomjs

Scriptable Headless Browser
http://phantomjs.org
BSD 3-Clause "New" or "Revised" License

PhantomJS 2: PDF rendering too large, page.zoomFactor doesn't work #12685

Closed thomasbachem closed 8 years ago

thomasbachem commented 9 years ago

I compiled PhantomJS 2 HEAD on OS X 10.9.5 (MacBook Pro Retina) via brew install phantomjs --HEAD.

When rendering a PDF via rasterize.js, the page contents are rendered much larger than with PhantomJS 1.9, and using the zoom argument doesn't change anything at all.

Experimenting with paperSize, page contents that used to fit exactly into 210mm (A4) now need 303mm, so everything is rendered at roughly 144% of its previous size.
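For reference, the arithmetic behind that figure can be sketched as follows. The function name is made up for illustration; the two widths are the ones measured above.

```javascript
// Ratio between the paper width the content now needs and the width
// it used to fit into.
function renderScale(neededMm, expectedMm) {
    return neededMm / expectedMm;
}

var scale = renderScale(303, 210);   // how much larger 2.0 renders the content
var correction = 1 / scale;          // zoom that would be needed to undo it

console.log('scale: ' + scale.toFixed(2) + ', correction: ' + correction.toFixed(2));
```

The correction value is only what the measurement implies; it says nothing about where the scaling comes from.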

elgarfo commented 9 years ago

+1 experiencing exactly the same problem. compiled from source (eddb0db1d253fd0c546060a4555554c8ee08c13c) on debian 7.6

any idea how to work around this?

edit: used css to work around the problem for now. downsides: rendering gets a little ugly and it's still a pixel too wide i'd say

html {
    zoom: 0.68; /*workaround for phantomJS2 rendering pages too large*/
}

mgartner commented 9 years ago

I'm seeing this issue too. It is especially frustrating that I'm trying to use 2.0 to get webfont support, but normal HTML content doesn't fit into PDFs like it did so nicely in 1.9.x.

I'd really appreciate it if anyone could point me in the right direction to what might be causing this in the PhantomJS code base. I'd gladly work on submitting a PR to fix this.

thomasbachem commented 9 years ago

@ariya I fear this bug will be overlooked quite easily given the number of open tickets. Perhaps you can mark this ticket properly as a regression?

9point6 commented 9 years ago

+1 still broken in 2.0 branch

bompus commented 9 years ago

Still broken for me also. The elgarfo patch works, but it's hacky :)

ariya commented 9 years ago

@thomasbachem Good point, labelling it now. Thanks!

thomasbachem commented 9 years ago

@ariya Thanks, I just linked two duplicate tickets.

@polarathene mentions some observations in #12936 that may be of help, though I couldn't verify those myself:

I've just done some thorough testing of 2.0.0 PDF conversions and noticed that the paperSize 'A4' definition that was 992px wide (120dpi) in 1.9.8 is now 595px wide (72dpi) in 2.0.0. Previously in 1.9.8, unit values other than % were scaled down, so the PDF version required you to zoom to 125% to view units at their original size; that scaling also caused the whitespace issue on the right present in your picture. In 2.0.0 units don't appear to be scaled down, they match the browser 100%. paperSize however is now scaled up from its given px size by 1/3rd, which matches it to 96dpi/96px per 1 inch, which is a CSS inch. Thus a 72dpi 595px A4 paperSize is resized to a 794px 96dpi A4 when rendered. I've also learnt from OSX (and likely Linux) users that 1.9.8 provided a 100% match for their PDF renders, but with 2.0.0 they're having to downscale their units from 96dpi to 72dpi, e.g. a 794px wide element needs to be changed to 595px for a correct 100% zoom A4 PDF. For comparison, on 1.9.8 Windows users had to change from 96dpi to 120dpi: a 794px wide element needed to be 992px wide. Also worth noting is that in both versions % units remain precise: a 90% element is always 90% of the PDF at 100% zoom; for whatever reason they were not scaled like other units.
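The pixel widths quoted above (595, 794 and 992) all follow from the A4 width of 210mm at the three DPI values. A minimal sketch of that conversion, with helper names that are mine rather than anything from PhantomJS:

```javascript
var MM_PER_INCH = 25.4;

// Convert a length in millimetres to whole pixels at a given DPI.
function mmToPx(mm, dpi) {
    return Math.round(mm / MM_PER_INCH * dpi);
}

// A4 is 210mm x 297mm; print its pixel size at each DPI mentioned above.
[72, 96, 120].forEach(function (dpi) {
    console.log(dpi + 'dpi: ' + mmToPx(210, dpi) + 'px x ' + mmToPx(297, dpi) + 'px');
});
```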

mkrn commented 9 years ago

+1 Version 2.0 fixes Function.prototype.bind but breaks paperSize.

Feendish commented 9 years ago

Also experiencing this issue. I'd love to switch to v2 to fix the page-break-avoid:inside issue but this is a problem now.

elgarfo's tweak seems to work but as noted it causes rendering issues.

Has anyone a better work around? Are there adapted pixel settings for A4&Letter? I'm on Unix.

polarathene commented 9 years ago

@Feendish I wrote a lot on the mentioned issue above, but if you read the workaround I gave for mac it should work for unix.

jimclarkuk commented 9 years ago

@Feendish Following on from the analysis by @polarathene we opted for a transform which seemed to result in fewer rendering issues than using zoom.

.page {
    transform-origin: 0 0;
    -webkit-transform-origin: 0 0;
    transform: scale(0.75);
    -webkit-transform: scale(0.75);
}

Feendish commented 9 years ago

@polarathene thanks. You gave a comprehensive break down. I just couldn't follow it exactly.

I tried using the dimensions_width formula you suggested but it was still off.

In the end, given your note that the DPI is 72 on Mac/Unix, I used a DPI to pixel calculator http://www.hdri.at/dpirechner/dpirechner_en.htm and just hard coded the paperSize to 595x842 for A4 portrait. 595px = 8.26772 inches x 72dpi, where 8.26772in = 210mm.

@jimclarkuk I tried your solution too with no luck. Page breaking was messed up with overlapping elements.

thomasbachem commented 9 years ago

@Feendish What exactly do you mean? Setting

page.paperSize = { width: '595px', height: '842px', margin: '0px' };

Doesn't change or fix anything for me under Mac OS X 10.9. The page is still too small compared to 1.9.8.

Feendish commented 9 years ago

@thomasbachem I'm building the HTML from scratch to test it. Haven't tried running it on established HTML source yet.

I used Bootstrap 3 to make a sample long Invoice.

JS code -> http://pastie.org/10011943
HTML source -> http://s000.tinyupload.com/index.php?file_id=97390552363329839541
Bootstrap override.css -> http://pastie.org/10011951

It now generates a clean A4 sized portrait PDF of 44 pages.

I'm on Linux (Centos 6.4).

polarathene commented 9 years ago

@thomasbachem It's been a while but from memory that's the dimensions for A4 at 72dpi(Mac/Linux): http://www.a4papersize.org/a4-paper-size-in-pixels.php

Setting A4 as your papersize would have the same effect. I don't have a mac and haven't tested on linux, what are the px dimensions of an A4 pdf document for you at 100%? On windows they're A4 at 96dpi, I'm guessing you get 72dpi(595x842)? It's been a while but I think you need to upscale your viewport from 72dpi px to 96dpi. On 1.9.8 I used a similar technique that @jimclarkuk provided, though mine scaled up(120dpi to 96dpi). The windows workaround on 1.9.8 wasn't perfect however, if you can get away with it, you should be able to adjust the paperSize to fit your viewports(probably not the same px width/height) and then alter the zoom on the pdf viewer to see the document as intended.

Another alternative could be to run a windows VM or use a web service like Azure to run Phantom on a windows instance.

Feendish commented 9 years ago

Just to clarify @polarathene on Linux setting "A4" as paperSize doesn't work. The content is sliced off on the right hand side.

I have to explicitly set the pixel width&height to get it to work on Linux with v2.0.0

polarathene commented 9 years ago

@Feendish what viewport size are you using with your papersize?

Feendish commented 9 years ago

I'm not setting a viewport. I always assumed it was one or the other based on Phantom examples.

Should I be setting one? In 1.9 branch it worked without viewport.

polarathene commented 9 years ago

I'd be interested to know if your results are different after setting the viewport. If you get a full A4 pdf filled with the website, also try setting your viewport to half just to confirm that you're getting half once it's rendered to pdf format.

My understanding is that there is a default viewport size, I can't recall what that is though. Plenty of responsive sites will adjust based on the viewport you provide, as well as fixed width sites. Both I imagine can be affected by having a poorly chosen viewport size?

thomasbachem commented 9 years ago

@polarathene @Feendish I tried to play around with setting different viewport sizes, and it changed nothing with PhantomJS 2.

polarathene commented 9 years ago

@thomasbachem it would depend on the site you're rendering. If you use it on a responsive site that has a mobile layout at a small viewport, changing your viewport to a small size should trigger it just like resizing your chrome window would. I think rendering will still scroll the viewport if needed to fill in the papersize? I'll be using phantom again soon, perhaps can set up an example project for the issue with workaround :)

Honestly though, the issue is with phantom: with 2.0.0, osx/linux got the problems windows had with 1.9.8, while on 2.0.0 windows works like osx/linux used to. Whoever worked on that part of phantom should be able to provide a fix, even if it's a different build which breaks windows; I'd imagine that'd be the quick fix.

polarathene commented 9 years ago

If phantomjs itself is the cause, I can only see tinkering with the values here: https://github.com/ariya/phantomjs/blob/2.0/src/webpage.cpp#L1061 that seem like they'd be relevant, but it might actually be handled by QT which was updated with 2.0.0. I see plenty of references for 96dpi, perhaps dpi handling has changed between the QT versions used in 1.9.8 and 2.0.0....which'd mean phantomjs won't ever fix this issue until QT does? I have no QT experience, if someone from the phantomjs team could chime in, is it a QT bug or has phantomjs done something differently with pdf rendering via QT since?

Feendish commented 9 years ago

Setting a viewport of

page.viewportSize = { width: 595, height: 400 };

has no effect on the rendered page.

polarathene commented 9 years ago

@Feendish try 100x100; if you're still getting no difference then tweaking the viewport won't help much. Again, I did say it completely depends on the website design itself. For the website I was working with, js scripts were generating highchart graphs based on viewport width, and they rendered incorrectly without setting the viewport properly.

jrf0110 commented 9 years ago

+1 this sucks really bad :(

vitallium commented 9 years ago

Can someone provide a working example? Then I can take a look at it.

thomasbachem commented 9 years ago

@Vitallium That'd be great, thanks in advance! There is a test case in #12936.

You can also just compare the PDF output of any web page with rasterize.js in 1.9.8 and 2.0.0.

floundies commented 9 years ago

just throwing my hat into the ring... also have this issue. switching back to 1.9.8 for the time being.

dhwaneetbhatt commented 9 years ago

@floundies

paperSize: {
  width: (width * (72/96)) + 'px',
  height: (height * (72/96)) + 'px'
}

This seems to solve issues of broken PDF rendering on 2.0, the reason being they have made changes to the DPI in some way, detailed discussion here: https://github.com/ariya/phantomjs/issues/12936.

Let me know if that works for you.
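Spelled out as a complete helper, the scaling above might look like the sketch below. It assumes the input dimensions are CSS pixels at 96 per inch; `toPaperSize` is a name made up here, and results in this thread are mixed on whether the approach helps on every platform.

```javascript
// Scale page dimensions given in CSS pixels (96 per inch) down to the
// 72-per-inch values suggested above for paperSize in PhantomJS 2.0.
function toPaperSize(widthPx, heightPx) {
    var factor = 72 / 96;
    return {
        width: (widthPx * factor) + 'px',
        height: (heightPx * factor) + 'px'
    };
}

// A4 at 96dpi is about 794 x 1123 px.
var paperSize = toPaperSize(794, 1123);
console.log(JSON.stringify(paperSize));

// In a PhantomJS script this would then be assigned with:
//   page.paperSize = paperSize;
```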

thomasbachem commented 9 years ago

@dhwaneetbhatt Doesn't work for me. Why reduce page dimensions when the rendering is too large already?

polarathene commented 9 years ago

@thomasbachem You are using OSX/Linux? From what I understand you have 72dpi being used by the OS, and the browser displays at 96dpi. So when you provide px values those get scaled up to 96dpi and then rendered to PDF, @dhwaneetbhatt is reducing those values to 72dpi so that Phantom will return them to the size you wanted at 96dpi as you see in the browser. Alternatively provide the value in inches and multiply by 72, should have a similar effect.

So just to clarify, Phantom appears to upscale your px (and other units besides %) values to 96dpi, take the picture, then save it to your PDF, where the image ends up too large for the originally intended size. When you provide the size at 72dpi it should be corrected and look right. I use Windows so I cannot verify this; on 1.9.8 I previously had to change my values from 96dpi to 120dpi because the rendering was too small. This wasn't perfect though, and needed some extra code to scale CSS or text I think, which still wasn't 100% correct for some websites.

thomasbachem commented 9 years ago

@polarathene Yes, I'm using OS X 10.9. I get what you're saying, and it sounds reasonable, but it doesn't work on my environment.

As noted above and also observed by others here, the upscale factor is 1.44, so you need to e.g. set the CSS zoom property to 0.69 (= 1 / 1.44) to fix it. Dividing 72 and 96 dpi (= 0.75) doesn't lead to this number however.

Furthermore @dhwaneetbhatt proposed to reduce the page dimensions, but the content is rendered too large already, so if we were going to work around this by manipulating page size (which sucks as I want an A4 PDF) we'd need to increase page size.

polarathene commented 9 years ago

A4 at 72dpi should roughly be 595px by 842px, try using that as your paperSize? I'm assuming that when you provide paperSize as A4 it's using a different value? I see the zoomFactor solution as a bit odd at 0.7; I'm fairly certain it's due to DPI. Whatever caused the problem for OSX/Linux users with 2.0.0 fixed the same issue Windows users had in 1.9.8, so I'm pretty sure that either PhantomJS or QtWebKit has changed the way it handles DPI.

Again to clarify, you are reducing the values so that Phantom renders at the correct values. With Windows we had to increase the page dimensions for content that was rendered far too small and this was a DPI issue with units. zoomFactor only scaled a portion of the website correctly (text I think) when I tried that prior to rescaling the units before passing on to Phantom. It might have changed or be different for OSX/Linux users however.

Worth noting, depending on the website you're rendering, you may also want to adjust your viewportSize; I can't recall if I matched them or if one of these had not been affected by the rescaled units. What you need to keep in mind is that Phantom is adjusting your units prior to rendering, and that an A4 PDF has a DPI of 96, at least on Windows, where a width of 8.27in comes to 794px at 100%.

Just to verify, does increasing the paper size work for you as you are stating it does? Or does reducing it have the desired effect?

lasterra commented 9 years ago

My solution

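var zoom = page.zoomFactor;
// Apply the zoom as a CSS property inside the page itself.
// Note: this looks up an element with id="body", which the page must define.
page.evaluate(function(zoom) {
    document.getElementById('body').style.zoom = zoom;
}, zoom);
page.render(output);

yorickpeterse commented 9 years ago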

Also experiencing this problem when using Phantom 2.0 on Linux, using 96 DPI. I'm using the following paper settings:

page.paperSize = {
    format: 'A4',
    orientation: 'portrait',
    margin: '1.5cm'
};

Rendering a report without any zoom settings (as mentioned by @elgarfo) results in my document being rendered as following:

[screenshot: broken]

The resulting PDF spans multiple pages in the above screenshot. If I apply the following CSS:

body {
    zoom: 0.53; /* using 0.55 or higher results in the elements not lining up correctly */
}

I instead get the following PDF:

[screenshot: correct]

This particular PDF only spans a single page (as intended, clipped off in the screenshot). The downside is that some text is a bit fuzzy and the spacing between certain elements is a little bit different.
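The same zoom could also be injected from the render script instead of the stylesheet, along the lines of lasterra's snippet earlier in the thread. A minimal sketch, assuming `page` is a PhantomJS WebPage object that has finished loading; `applyCssZoom` and the output file name are hypothetical.

```javascript
// Set CSS zoom on the document body from the render script.
// page.evaluate runs the callback inside the page and passes any extra
// arguments through to it, so the zoom value has to be handed over explicitly.
function applyCssZoom(page, zoom) {
    return page.evaluate(function (z) {
        document.body.style.zoom = z;
        return document.body.style.zoom;
    }, zoom);
}

// Usage in a PhantomJS script, after the page has loaded:
//   applyCssZoom(page, 0.53);
//   page.render('report.pdf');
```

This keeps the workaround out of the site's own CSS, so the factor can be dropped once a fixed build is available.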

polarathene commented 9 years ago

@YorickPeterse Can you read my previous message and try setting the paperSize dimensions yourself instead of 'A4'? If that doesn't work can you also try adjusting the viewportSize dimensions as well? It's been a while since I've used Phantom, would be helpful to others if you can confirm my assumptions of the 2.0.0 linux/osx issues being similar to the windows issues in 1.9.8.

yorickpeterse commented 9 years ago

@polarathene Using these settings I get a PDF that does fit on a single page, although said page (and its content) are much bigger than a usual A4:

page.viewportSize = {
    width: 1030,
    height: 1500
};

page.paperSize = {
    width: page.viewportSize.width,
    height: page.viewportSize.height,
    margin: '1.5cm'
};

These particular dimensions were based on the dimensions of the PDF, I set them in such a way that the width/height are enough to fit the document into a single page. Printing the resulting PDF works perfectly fine (probably because the printer downscales the PDF), so for now this should do the trick.

andreild7 commented 9 years ago

@YorickPeterse I'm using the same environment as yourself (Unix).

My workaround for v2 was, as @polarathene says, to explicitly set paperSize:

page.paperSize = { width: '595px', height: '842px', margin: '0px' };

The reason for those specific numbers for A4 is that DPI for Unix is 72. A4 portrait in inches is 8.26772 x 11.69291, which translates to 595px by 842px.

Setting a viewPort has no effect.

It looks to me that the code isn't accounting for the fact that Linux is 72DPI. It assumes it's 96, which throws off all the calculations.

yorickpeterse commented 9 years ago

@itgslabs

page.paperSize = { width: '595px', height: '842px', margin: '0px' };

These particular settings still result in content clipping out of the page:

[screenshot: clip]

polarathene commented 9 years ago

If I recall correctly, setting the viewport size can have an effect depending on how your site works. For me, I had a highcharts graph that set its size based on the viewport size given, not the paper size. I'd have to look into my old source code, but I am pretty sure for 1.9.8 on Windows one was set to the A4 at 96dpi px size, and the other was set to 120dpi (paper size I think).

For linux/osx you'd be using dimensions for 72dpi as @itgslabs confirms. When rendered the pdf at 100% should match in px the dpi it's being displayed at for A4? I cannot confirm on other OS but for Windows pdfs display at 96dpi thus a width of 794px.

I do remember that I also had to apply a CSS transform in addition to this, these seem familiar: http://stackoverflow.com/a/10559205/2639089 https://filippo.io/taking-retina-screenshots-with-phantomjs/

You'd just want to do the opposite of scaling up. An alternative also might be SlimerJS.

davidwindell commented 9 years ago

Doh! I just wasted a few hours getting 2.0 built and ready on Ubuntu 14.04 to discover A4 PDFs are doing this (cutting off the side of the page).

Does anyone have a working solution? Setting page dimensions didn't make a difference.

zackw commented 9 years ago

Do we understand to what extent this is PhantomJS's vs. WebKit's fault? Pinning that down seems like it needs to be the first step.

bbrdaric commented 9 years ago

Is it even realistic to expect the same results from PhantomJS PDF export, as they are in Chrome/Chromium print preview / Save as PDF? I mean...that would be awesome!

To get this, I think we need to match PDF export settings. As far as I can tell from Chrome export settings, that would be

Format: A4
Orientation: portrait
DPI: 600
Margin left: 0.4"
Margin top: 0.4"
Margin right: 0.4"
Margin right: 0.39"
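Expressed as a PhantomJS paperSize object, those settings might look roughly like the sketch below. Two assumptions are baked in: the second "Margin right" value is treated as the bottom margin, and the 600 DPI setting is left out because it has no counterpart in this object. `chromeLikePaperSize` is a made-up name.

```javascript
// paperSize object roughly mirroring the Chrome settings listed above.
// Assumption: the 0.39" value is the bottom margin.
// The 600 DPI setting is not represented here.
function chromeLikePaperSize() {
    return {
        format: 'A4',
        orientation: 'portrait',
        margin: {
            top: '0.4in',
            left: '0.4in',
            right: '0.4in',
            bottom: '0.39in'
        }
    };
}

console.log(JSON.stringify(chromeLikePaperSize()));

// In a PhantomJS script: page.paperSize = chromeLikePaperSize();
```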

I also tried to export some wide table and failed (table got cut off, similar to @davidwindell), but I need to investigate a bit more.

MarttiR commented 9 years ago

This not very elegant solution worked for me: pulled 1st1@de90d0712f20dd7a68eb5ac5b302e535ced5a7f4 onto release 2.0 and replaced the hardcoded 72dpi values with hardcoded 96dpi values in the function stringToPointSize().
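To show why swapping that constant changes the output size, here is a rough JavaScript illustration of a string-to-points conversion. It is not the PhantomJS source (the real function is C++), and the names and unit table are my own; it only shows how the DPI assumed for "px" decides how many PDF points a pixel value becomes.

```javascript
// Illustration only, not the PhantomJS implementation.
// Converts a length string to PDF points (1 point = 1/72 inch).
// pxDpi is the number of pixels assumed per inch for "px" values.
function toPoints(value, pxDpi) {
    var match = /^([\d.]+)(mm|cm|in|px)$/.exec(value);
    if (!match) {
        throw new Error('Unsupported length: ' + value);
    }
    var amount = parseFloat(match[1]);
    var pointsPerUnit = {
        mm: 72 / 25.4,
        cm: 72 / 2.54,
        'in': 72,
        px: 72 / pxDpi
    };
    return amount * pointsPerUnit[match[2]];
}

console.log(toPoints('595px', 72));  // pixels treated as 72 per inch
console.log(toPoints('595px', 96));  // same value, pixels treated as 96 per inch
console.log(toPoints('794px', 96));  // A4 width in CSS pixels
```

With 72 pixels per inch a 595px paper is a full A4 width in points; with 96 per inch the same 595px shrinks to three quarters of that, and 794px is needed instead.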

davidwindell commented 9 years ago

@MarttiR broken link?

toabi commented 9 years ago

I guess @MarttiR means this: https://github.com/1st1/phantomjs/commit/de90d0712f20dd7a68eb5ac5b302e535ced5a7f4

mgartner commented 9 years ago

Does anyone have any idea on where this bug might live or how to start tracking it down? It's tough without any experience with this codebase, so does anyone with experience have ideas on where to start searching?

polarathene commented 9 years ago

@mgartner It's possibly not in the PhantomJS codebase but a change in Qt's WebKit. Or look at @MarttiR 's fix. From when I worked with the issue on Windows in 1.9.8 (now fixed but an issue in Linux/OSX), I noticed that the input values at some point are altered (DPI thing), and while that also transforms units other than px such as mm, cm, etc., it did not affect %: 100% remained 100% of the final document size, as did 50% stay at halfway. I assume those unit transformations are consistent with Linux/OSX in 2.0.0, so I'd look for code that is handling these unit conversions.

Alternatively if @MarttiR 's solution doesn't cause any problems, define constants for each supported platform DPI and perhaps a custom DPI variable, if you can detect the platforms DPI default to that, and have a cli flag enforce using different platform DPI's or a custom DPI value. Would there be any issues with that solution?
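As a rough sketch of that idea: the platform names, default values and function below are all hypothetical, with the numbers taken from what people report in this thread rather than from the PhantomJS source.

```javascript
// Hypothetical per-platform defaults, based on values reported in this thread.
var PLATFORM_DPI = { windows: 96, osx: 72, linux: 72 };
var FALLBACK_DPI = 72;

// Pick the DPI used for PDF unit conversion: an explicit override (for
// example from a command line flag) wins, then the platform default,
// then a fixed fallback.
function resolveDpi(platform, overrideDpi) {
    if (typeof overrideDpi === 'number' && overrideDpi > 0) {
        return overrideDpi;
    }
    if (Object.prototype.hasOwnProperty.call(PLATFORM_DPI, platform)) {
        return PLATFORM_DPI[platform];
    }
    return FALLBACK_DPI;
}

console.log(resolveDpi('linux'));       // platform default
console.log(resolveDpi('linux', 96));   // explicit override
```

Keeping the override separate from detection would let users on an affected platform force a value without a rebuild.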

mgartner commented 9 years ago

The company I work for desperately wants to upgrade to 2.0 to get web font support, but this bug is currently preventing us from doing so.

As we don't have in depth knowledge of this repository or WebKit, we were considering putting up some money on bountysource.com or a similar website to try to get this fixed as soon as possible. Has anyone had success with this approach, and would someone be willing to take a serious stab at it if we put some money behind it? Any idea what a fair price would be?

polarathene commented 9 years ago

@Vitallium When was this fixed in dev? I had a look at recent commits from this year and didn't notice any related to the issue. Any idea how far off a release containing the fix will be?