domenic / svg2png

Converts SVGs to PNGs, using PhantomJS
Other
585 stars 134 forks source link

Switch from PhantomJS to headless Chrome #96

Open neocotic opened 7 years ago

neocotic commented 7 years ago

I think we all know that PhantomJS is past it. It's out-of-date and has so many bugs for rendering SVG images that they are simply not fixing. It's basically not fit for purpose anymore, at least not for svg2png. I have reimplemented svg2png using headless Chrome instead using the Chrome team's puppeteer. This library provides a great API that we can leverage to manipulate the DOM via DevTools integration and it downloads the latest Chromium build on installation, just like phantomjs-prebuilt does.

The primary caveat is that I don't see a way of supporting svg2png.sync. I realise that this could be a deal-breaker, but perhaps the trade-off is worth the compromise. A major version bump would be needed, of course, as this is entirely a breaking change.

I've not finished everything yet but I thought I'd open the PR early to get initial feedback on review. The last thing I need to fix could be a deal-breaker in itself if I can't get it working; I'm trying to get the document to be aware of the file's location so that relative HREF's are resolved correctly. Honestly, I'm a bit stumped at how this was working originally with PhantomJS. I'll try to find some time do this soon.

Also, I updated some documentation based on these changes, but I'm happy to revert these changes if you feel you'd rather do such things. No problem with me.

Finally, I've just noticed that my IDE has taken some liberties with import/require ordering. Do you want me to revert those changes or are you happy with them? I personally find them better, but I have it configured this way for my projects, and this project is yours, so I'm happy to revert them if you want me to.

neocotic commented 7 years ago

I've taken a look at your comments and believe I have addressed them. I've reverted unwanted/accidental changes, changed to the async/await syntax (bumping the Node.js dependency to >=8), and simplified the conversion process by using puppeteer's screenshot API.

However, I'm still facing issues with external files. I'm going to try and play around with using temporary files to better emulate what it's like when opened in the browser. I believe the issue I'm facing may be because the location of the default page is about:blank and not actually using the file:// protocol, but I'm guessing here.

neocotic commented 7 years ago

OK. I think I've fixed the issue with external files. A temporary HTML file is created and the SVG contents are written to it (along with some boilerplate) and I simply tell puppeteer to navigate to that page. Output now includes external files. However, I don't see any existing tests for remote files (e.g. https://assets-cdn.github.com/images/modules/logos_page/GitHub-Mark.png). This might be worth testing further, but I don't know what peoples expectations are here. Personally, I would expect it to work (so I really hope it does).

I don't fancy adding a unit test that depends on a remote file though. That's just asking for trouble. Thoughts?

I'll try to find some time soon to test this further, but I'm hoping you can continue your review.

I've also left the examples within the readme as-is. I'll let you rewrite them if you feel that it's needed (e.g. in async/await syntax).

I'm also not very keen on adding the sync approach that you described. Seems very hacky and I'm not sure what benefit it has. Surely those wanting such a hack could do it themselves and we can keep our code lean and clean?

As for the fonts; I believe they're different because Chromium is handling fonts correctly and PhantomJS was not, but I'm guessing here.

neocotic commented 7 years ago

Just did some testing and remote file references work as well. I'm happy with the code and functionality. Let me know if you find any issues or want anything else changed.

domenic commented 7 years ago

I do want to get the .sync functionality back; it should be this library's job to encapsulate such ugliness, and not make users do it.

I'm OK with the temporary file approach if it's truly the only way, but I would have thought Puppetteer would have had APIs for loading some content directly from a string, not just from a file. If that's possible that would be nice... if not, maybe we should open an issue on Puppetteer (while still merging this).

The <svg> in HTML page approach seems quite nice. I still think there's some potential simplification in the sizing code we could do by relying on the browser's native sizing (e.g. I think setDimensions should no longer be necessary at all?) but I can try to look into that on my own.

Any idea why 18/19.png changed? Not a problem (this stuff is weird), just curious if you happened to have any ideas.

My hope is to merge this and apply any of my own tweaks over the weekend and do a release then. \o/

neocotic commented 7 years ago

Would you mind if I omitted the sync implementation from this PR or would you like me to take a stab at it at least?

Puppeteer does support setting HTML content of the page, however, it was failing to load external files when done this way. There is a request interceptor API that I was playing around with when I was trying the data URL approach but that wasn't working for some reason (was not being triggered for resources loaded within the SVG). That said, I can give it another shot now that we're using the screenshot approach as it could possible allow us to override requested URLs to be relative to the url option.

I might let you play around with the dimensions, if you don't mind. The viewport sizing and clip settings were needed to make it work properly though.

All the file sizes have changed. Some increased and others decreased. Not sure why but no doubt simply because we're using a separate tool to create the PNG buffer.

I'll look to fix the variable naming issue and take a quick stab at getting it working without the temp file. If you want me to include anything else in this PR (e.g. sync) please let me know.

This was my first time using async functions, so I hope they're OK. Loved using them, but I'm not sure about my error handling.

neocotic commented 7 years ago

I renamed the variable you mentioned in your review.

I also took a quick look at using the request interceptor API to dynamically resolve requests with relative URLs against the URL but my listener is still not being triggered for requests originating from within the SVG. When I load external-file.svg in my browser, I do see a request for kitten.png in the Network tab under DevTools so I would have expected the puppeteer API to report this and allow interception. Might simply be a bug on their side. If so, I recommend we go with the temp file approach initially and then raise an issue against puppeteer to get their request interceptor to correctly trigger the "request" event for internal SVG requests. If/when fixed, we can try again at removing the temp file workaround to improve performance.

However, I was playing around with this in a linux VM (running on c9.io) instead of my main linux laptop and I noticed that some of the tests were failing. I believe that this is because of some font variations on the operating systems. I was surprised that one of these Arial and more surprised that another was Segoe UI. The VM doesn't appear to have Arial installed correctly (the characters don't look very Arial to me but are not the default fonts) but does have Segoe UI installed. I think that this is why some of the fonts in the success-tests I generated yesterday looked off. I'm guessing my linux laptop doesn't have Segoe UI installed so it was falling back to the default font.

It might be good if the tests simply use serif/sans-serf unless the font is bundled (like the Roboto test) so that they are truly cross-platform and avoid such issues. After this has been merged, you might want to take some time to generate the success-tests images on your machine since you'll know best what the aim is.

I think/hope that's my work here done :smile:

neocotic commented 7 years ago

I've just fixed the expected output for tests using the Segoe UI font. Let me know if there's anything else you want me to do.

neocotic commented 7 years ago

Any update on this? I'm not sure if there's any action expected on my part.

domenic commented 7 years ago

Well, I don't plan to merge this until I have time to work on a sync version and investigate simplifications to the sizing logic, which could be a week or two.

I'm surprised about all this talk of an interceptor API. I would have anticipated Puppetteer would have had a way to just set the contents from a string and set the URL at the same time, thus making URL resolution just work.

pkra commented 7 years ago

I would have anticipated Puppetteer would have had a way to just set the contents from a string and set the URL at the same time, thus making URL resolution just work.

FWIW, I did get around to testing data URLs with pupeteer's goto method and they work.

neocotic commented 7 years ago

Personally, I feel like the PhantomJS approach was hacky by allowing a dummy URL to be used without navigation. Chrome's approach seems to be more inline with how browsers actually work.

I don't see how data URLs would work with embedded files.

niftylettuce commented 6 years ago

would be nice to have this, currently other solutions such as sharp have cross-platform issues

neocotic commented 6 years ago

I had an immediate requirement, so I gave up and made my own library that does exactly this: https://github.com/NotNinja/convert-svg.

pkra commented 6 years ago

Chromy is also quite convenient (just opening the SVG directly and screenshotting it), cf. https://github.com/OnetapInc/chromy/blob/master/examples/screenshot.js