parallax / jsPDF

Client-side JavaScript PDF generation for everyone.
https://parall.ax/products/jspdf
MIT License

UTF-16-Support/Emoji Support #2072

Open glauberramos opened 6 years ago

glauberramos commented 6 years ago

Not sure if I'm doing something wrong, but I can't display emojis inside the PDF.

Are you using the latest version of jsPDF? Version 1.4.1

Steps to reproduce pdf.text('😍 😎 😘 😐', 10 ,10) https://jsbin.com/sohijak/2/edit?html,js,output

What I saw

screen shot 2018-11-12 at 21 56 53

What I expected 😍 😎 😘 😐

Uzlopak commented 6 years ago

According to the PDF reference (PDF 32000-1:2008, 7.9.2), PDF viewers display text using an ASCII/ANSI code page and not Unicode. Unicode characters use two bytes per character instead of one byte per character as in ASCII/ANSI. PDF viewers therefore interpret Unicode-encoded text as ASCII/ANSI-encoded text, which results in "strange" characters being displayed.

To display Unicode-encoded text, you have to add your own font to the PDF. jsPDF supports TTF fonts, which have to be converted to base64-encoded strings. Check beforehand that your desired font supports the characters you want to use. There is no use, for example, in using a font that lacks Arabic characters and hoping that the generated PDF will display Arabic characters.

In the jsPDF project we have a font converter, which you can find in the folder "fontconverter". This simple tool lets you "convert" a TTF to the format jsPDF needs. It also adds the operations necessary to register the font with the jsPDF runtime. Include the resulting JavaScript file in your project and you are ready to use the font in your PDF.

Then set the font as usual with setFont, and jsPDF will render your text correctly.
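For illustration, registering a font by hand might look roughly like the sketch below. It only uses the jsPDF calls addFileToVFS, addFont, setFont and text; the helper name drawWithCustomFont and all of its parameters are made up for this example, and the base64 string is something you would have to supply yourself. It only helps for characters the font actually contains.

```javascript
// Hypothetical helper (not part of jsPDF): registers a base64-encoded TTF
// on a jsPDF document and draws one line of text with it.
// fontFileName, fontBase64 and fontName are placeholders chosen by the caller.
function drawWithCustomFont(doc, fontFileName, fontBase64, fontName, text) {
  // Store the font data in jsPDF's virtual file system under a file name.
  doc.addFileToVFS(fontFileName, fontBase64);
  // Register that file as a font with a name and a style.
  doc.addFont(fontFileName, fontName, "normal");
  // Select the font for all text drawn afterwards.
  doc.setFont(fontName);
  // Draw the text at x = 10, y = 10 (document units).
  doc.text(text, 10, 10);
  return doc;
}
```

With a real document this would be called as drawWithCustomFont(new jsPDF(), "MyFont.ttf", myFontBase64, "MyFont", "some text"), where MyFont.ttf and myFontBase64 are assumptions standing in for your own font.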

glauberramos commented 6 years ago

I tried to use OpenSans font with the fontconverter and it is not working. Do you have a working example that I can use @arasabbasi ? I just need any font actually that supports emoji.

Uzlopak commented 6 years ago

I checked it. The emojis are UTF-16 and not UTF-8, so we don't support these symbols. I don't know how to solve this.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 90 days with no activity. It will be closed soon. Please comment/reopen if this issue is still relevant.

sorensenjs commented 3 years ago

This bug was originally written about emoji in the Supplementary Multilingual Plane, which do pose unique challenges because these symbols require two UTF-16 values (a surrogate pair) to represent them (https://www.compart.com/en/unicode/plane/U+10000).

However, jsPDF also fails to render emoji in the BMP: simple cases like ☺, which are represented with a single UTF-16 word, no different from the Arabic text, which is handled specially. https://jsbin.com/boxugid/edit?html,js,output

In the case of U+263A, it looks like it's simply interpreting this UTF-16 value as a pair of bytes and outputting the ASCII characters U+0026 (&) and U+003A (:).
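A small sketch of that misreading, to make the symptom concrete. This is not jsPDF's actual code path, just an illustration of what happens when each 16-bit value is split into two bytes and every byte is read as a one-byte character; the function name misreadUtf16AsBytes is made up for the example.

```javascript
// Illustration only: split every UTF-16 code unit into its high and low byte
// and treat each byte as a separate one-byte character.
function misreadUtf16AsBytes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one 16-bit code unit
    out += String.fromCharCode(unit >> 8, unit & 0xff); // high byte, then low byte
  }
  return out;
}

console.log(misreadUtf16AsBytes("\u263a")); // prints "&:"
```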

Uzlopak commented 3 years ago

@sorensenjs Yeah. PDF does not natively support more than ASCII. For UTF-8 you need to add a font.

sorensenjs commented 3 years ago

Indeed, if I download the ugly rasterized Unifont from here https://unifoundry.com/pub/unifont/unifont-13.0.06/font-builds/ this works:

var doc = new jsPDF();
doc.addFont("test/reference/unifont-13.0.06.ttf", "Unifont", "normal");
doc.setFont("Unifont"); // set font
doc.setFontSize(10);
doc.text("А ну ч☺☺☺ики брики и в дамки!", 10, 10);

However, attempting to use the OP's examples with unifont_upper, which contains the necessary glyphs in the supplementary plane, doesn't produce any output.

Uzlopak commented 3 years ago

@sorensenjs

Well, I am not aware of any UTF-16 solution for PDF. The PDF Reference does not mention anything about that. And this issue is about UTF-16.

sorensenjs commented 3 years ago

Sorry, but that is incorrect. UTF-16 is an encoding. All Unicode code points may be represented as UTF-8 (1-4 byte sequences), UTF-16 (1 or 2 words), or UTF-32. Just to take one example, https://www.fileformat.info/info/unicode/char/1f60e/index.htm: this particular code point can be represented as a 4-byte UTF-8 sequence or a 2-word UTF-16 sequence.
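To make that concrete, here is a short sketch that prints both encodings of the same code point using only standard JavaScript (TextEncoder, String.fromCodePoint, charCodeAt). The helper name encodingsOf is made up for the example.

```javascript
// Illustration only: show the UTF-8 bytes and UTF-16 code units of one code point.
function encodingsOf(codePoint) {
  const ch = String.fromCodePoint(codePoint);
  // TextEncoder always produces UTF-8.
  const utf8 = Array.from(new TextEncoder().encode(ch), (b) =>
    b.toString(16).padStart(2, "0")
  );
  // JavaScript strings are sequences of UTF-16 code units.
  const utf16 = [];
  for (let i = 0; i < ch.length; i++) {
    utf16.push(ch.charCodeAt(i).toString(16).padStart(4, "0"));
  }
  return { utf8, utf16 };
}

console.log(encodingsOf(0x1f60e)); // utf8: f0 9f 98 8e, utf16: d83d de0e (a surrogate pair)
console.log(encodingsOf(0x263a)); // utf8: e2 98 ba, utf16: 263a (a single word)
```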

Saying that any particular code point "is UTF-16" is neither relevant to this bug, nor correct.

Uzlopak commented 3 years ago

Please provide a pdf with UTF-16 smiley created with adobe. Thx.

sorensenjs commented 3 years ago

I don't use commercial products, but here's an Overleaf LuaLaTeX source file that generates both monochrome and color emoji from the Supplementary Multilingual Plane based on the OP's example. https://www.overleaf.com/read/wtjzrpwbjwyv

Emoji_Demo.pdf

Still not sure how it works tbh.

sorensenjs commented 3 years ago

I believe the produced PDF contains rasterized versions of the emoji. This answer from Adobe discusses the options: https://community.adobe.com/t5/acrobat/emoji-in-adobe-pdf/m-p/10148090#M121259

Uzlopak commented 3 years ago

The emojis are stored as images.

sorensenjs commented 3 years ago

This other Adobe post https://community.adobe.com/t5/indesign/apple-color-font-pdf-problems/m-p/10280802#M128162 includes a link to a SVF Fonts Test.pdf file that includes fully vector emojis.

I can even cut and paste them from the PDF viewer, so it's all being handled correctly, even though many of these emoji are from the SMP: ↕⛎🆔🆘⚛🌄😀🗿

SVF Fonts Test.pdf

That seems like an engineering proof of concept that this is feasible.

Uzlopak commented 3 years ago

PRs are welcome

KingMarcel commented 2 years ago

Is there still no way to implement emojis?

I tried it with

const font = "data:font/ttf;base64,AAEAAAANAIAAAwBQRkZU....";
doc.addFileToVFS("unifont-15.0.01.ttf", font);
doc.addFont("unifont-15.0.01.ttf", "Unifont", "normal");
doc.setFont("Unifont"); // set font

but this doesn't work.

FarrukhKhan11 commented 2 years ago

I tried many solutions but was unable to export emojis to PDF. Kindly assist if there is any solution in jsPDF for exporting emojis in a PDF doc.

Output Capture

PuneetKohli commented 9 months ago

Is there a solution here?