Open ghost opened 11 years ago
PDF embedding is not supported at the moment, so the output above is the expected result.
We could use Poppler to render PDF files to cairo surfaces, use those as images, then render them back to PDF. But that’s just silly: "real" PDF embedding, as TeX does it, just copies low-level PDF objects. Unfortunately I don’t know if this is possible with cairo.
Ideally, any selectable (i.e. copy-pastable) text in embedded PDFs should remain selectable in the output PDF.
From https://www.cairographics.org/cookbook/renderpdf/ it actually doesn't sound like such a bad idea to use poppler. It says:
When using a vector backend, the vectors and text in the PDF file are preserved in the output as vectors. There is no unnecessary rasterization.
In a discussion on IRC it was just noted that poppler is GPL-licensed, so cannot be used by weasyprint.
This sounds like it could be FUD?
WeasyPrint is licensed under the modified BSD license (aka 3-clause BSD license), which is GPL-compatible.
It's entirely possible I'm missing something important, but please could you explain why you think Poppler's license means that Poppler "cannot be used by WeasyPrint"? Thanks!
This sounds like it could be FUD?
It may be.
Actually, it's really explicit in Poppler's README:
Please note that xpdf, and thus poppler, is licensed under the GPL, not the LGPL. Consequently, any application using poppler must also be licensed under the GPL.
Interesting. Strictly speaking, this depends upon how the GPL-ed package would be "used". GPL packages and 3-clause BSD packages can be shipped together (e.g. in a distro like Debian) and can specify each other as dependencies in either direction (e.g. via a package manager like Apt).
I see these potential ways forward:
This sounds like it could be FUD?
Let’s not accuse each other of ill intentions when there’s probably only a "logical shortcut" in a terse message.
https://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean says:
[To say a license is “compatible with the GPL”] means that the other license and the GNU GPL are compatible; you can combine code released under the other license with code released under the GNU GPL in one larger program.
All GNU GPL versions permit such combinations privately; they also permit distribution of such combinations provided the combination is released under the same GNU GPL version. The other license is compatible with the GPL if it permits this too.
Emphasis is mine. This means that Poppler (GPL-licensed) cannot be used in WeasyPrint without effectively changing the license of WeasyPrint to GPL.
Call Poppler from WeasyPrint in a way that doesn't violate the GPL, if that is possible.
As far as I know, whether that would be compliant with the GPL is very open to legal interpretation. So I’d rather not attempt it.
Let’s not accuse each other of ill intentions when there’s probably only a "logical shortcut" in a terse message.
Agreed. I did not intend to impute ill intentions, and apologies if my reply came across that way.
Call Poppler from WeasyPrint in a way that doesn't violate the GPL, if that is possible.
As far as I know, whether that would be compliant with the GPL is very open to legal interpretation. So I’d rather not attempt it.
I understand your caution, but it really might be viable: https://www.gnu.org/licenses/gpl-faq.html#MereAggregation
Anyhow, I'm not an expert on this. The FSF Licensing & Compliance Team exists to help answer precisely this sort of question, so I'd suggest emailing them with your concerns, to see if they can suggest a low-friction, compliant solution. After all, while copyleft is important to the FSF, facilitating the creation, refinement and dissemination of free software (like WeasyPrint) is even more important to them, IIUC :)
(A case in point is that even the GNU project includes some code that is under non-copyleft free software licenses: see Appendix C.)
This sounds like a very grey area:
Where's the line between two separate programs, and one program with two parts? This is a legal question, which ultimately judges will decide. […] But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.
IMO a cairo surface containing vectors sounds like a "complex internal data structure".
I'd be interested to see if the FSF would confirm that. Alternatively, what about:
- Re-license future versions of WeasyPrint under the GPL (or better yet, the AGPLv3), and use Poppler.
It looks like WeasyPrint has 29 contributors. Of these, several seem to have made only trivial (i.e. non-copyrightable) commits. So the number of remaining contributors from whom you would need consent to re-license WeasyPrint isn't huge, and might be worth contacting if you're amenable to this option?
I am not interested in changing WeasyPrint’s license.
+1.
The poppler devs took the time to explicitly write that we should only use the lib in GPL software. Legal or not, I don't care; I think that we can at least respect what they wrote.
Now, let's go back to our problem.
Another (far from ideal) solution is to convert the PDF to one (or more) SVG image(s). This can be done before calling WeasyPrint, with Inkscape for example. You can then use <img> and even benefit from the size negotiation algorithm. This solution doesn't add extra rasterisation, but the text is not selectable (this may be fixed by Kozea/CairoSVG#80 if anyone is interested).
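For anyone who wants to try this workaround, here is a minimal sketch of the WeasyPrint side. It assumes the PDF pages have already been converted to SVG files by some external tool (the conversion step itself is not shown). The helper names `build_embed_html` and `write_pdf` and the file names are hypothetical, made up for illustration; only `weasyprint.HTML(...).write_pdf(...)` is WeasyPrint's own API.

```python
from html import escape
from pathlib import Path


def build_embed_html(svg_pages, width="100%"):
    """Return an HTML document with one <img> per pre-converted SVG page.

    `svg_pages` is a list of paths to SVG files produced beforehand
    (hypothetical names such as "page-1.svg"); nothing is converted here.
    """
    imgs = "\n".join(
        f'<img src="{escape(Path(p).as_posix(), quote=True)}" '
        f'style="width: {escape(width, quote=True)}" alt="PDF page {i}">'
        for i, p in enumerate(svg_pages, start=1)
    )
    return f"<!DOCTYPE html>\n<html><body>\n{imgs}\n</body></html>"


def write_pdf(html, output_path, base_url="."):
    """Render the HTML to PDF with WeasyPrint.

    WeasyPrint is imported lazily so the HTML helper above can be used
    (and tested) without it; relative image paths resolve against base_url.
    """
    from weasyprint import HTML

    HTML(string=html, base_url=base_url).write_pdf(output_path)
```

Something like `write_pdf(build_embed_html(["page-1.svg", "page-2.svg"]), "out.pdf")` would then produce the output, with the SVG pages kept as vectors. As noted above, the text in those pages would still not be selectable.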
Another solution (to this and many other things) would be to ditch cairo and write PDF files directly ourselves. I’ve long wanted WeasyPrint to do that but it would be a lot of work, especially around fonts.
You might find https://github.com/simoncozens/libtexpdf interesting.
Oh, you don’t want GPL. Sorry, never mind.
For background, see https://github.com/Kozea/WeasyPrint/issues/51
With latest version of cairocffi, attempting to embed PDF produces the following Terminal output under OS X 10.6.8:
The resulting output PDF contains the other parts of the input HTML content, but does not contain the embedded PDF.