parallax / jsPDF

Client-side JavaScript PDF generation for everyone.
https://parall.ax/products/jspdf
MIT License

Custom font does not get rendered properly in the PDF #2778

Open Shadowsusanoo opened 4 years ago

Shadowsusanoo commented 4 years ago

I am currently using jsPDF to convert HTML content containing text in multiple languages (Hindi, Tamil, English, etc.) to PDF. I use the Sakal Bharati font (for Indian languages), which contains the scripts for all these languages in one TTF file, and convert it to a base64 string to use it via the custom font method.

Sample code given below:

    var doc = new jsPDF();
    var pageWidth = doc.internal.pageSize.width || doc.internal.pageSize.getWidth();
    var font = "AAEAAAA................AAA=";
    doc.addFileToVFS('SakalBharati_N_Ship-normal.ttf', font);
    doc.addFont('SakalBharati_N_Ship-normal.ttf', 'SakalBharati_N_Ship', 'normal');
    doc.setFont('SakalBharati_N_Ship');
    doc.text("घर में रहिए सुरक्षित रहिए।", pageWidth / 2, 65);
    doc.save("Trial.pdf");
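The base64 string in the sample is elided. As a minimal sketch, one way to produce such a string at runtime from raw TTF bytes might look like the following. `bytesToBase64` is a hypothetical helper name (not part of jsPDF), the font URL in the usage comment is made up, and `btoa` is assumed to be available (browsers and recent Node.js versions).

```javascript
// Hypothetical helper: convert raw font bytes (ArrayBuffer or Uint8Array)
// into the base64 string that doc.addFileToVFS() expects.
function bytesToBase64(bytes) {
  const view = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  let binary = "";
  const chunk = 0x8000; // build the string in chunks to avoid huge argument lists
  for (let i = 0; i < view.length; i += chunk) {
    binary += String.fromCharCode.apply(null, view.subarray(i, i + chunk));
  }
  return btoa(binary);
}

// Usage sketch (the URL is an assumption for illustration):
// const buf = await (await fetch("/fonts/SakalBharati_N_Ship.ttf")).arrayBuffer();
// doc.addFileToVFS("SakalBharati_N_Ship-normal.ttf", bytesToBase64(buf));
```

A TrueType file starts with the bytes 00 01 00 00, which is why the encoded string in the sample begins with "AAEAAAA".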

The text is rendered in the PDF, but not correctly. For example, the first image shows how the Hindi text should look. The second image shows how it is rendered in the generated PDF.

How it should look: [image: Ideal]

Generated content: [image: Actual]

As can be seen, the two are very different from each other. If I convert this to a Word/XML file using any online tool, the text is rendered correctly. How can I get the same result in the generated PDF?

I would appreciate a fast response.

HackbrettXXX commented 4 years ago

This could be a bug in jsPDF. I think I will find the time to look at this in the next few weeks. But you are of course welcome to provide a pull request or have a look at the source code yourself.

fanyufu commented 4 years ago

@HackbrettXXX Hello, is there any progress on this issue? My project has the same problem with the Sinhala language.

HackbrettXXX commented 4 years ago

Not yet.

HackbrettXXX commented 4 years ago

@kakugiki could you maybe have a look at this? My time is unfortunately very limited ATM again.

kakugiki commented 4 years ago

@HackbrettXXX I am not familiar with the custom font part. I can try when I get a chance, but no guarantee that I can come up with any solution soon though.

HackbrettXXX commented 4 years ago

Sure, that's OK. It might actually also be related to #2749.

kakugiki commented 4 years ago

It looks like the new release fixed this issue. I can't use the font converter, though: using the converted SakalBharati_N_Ship-normal.js file fails with the error "Uncaught SyntaxError: Cannot use import statement outside a module".

SakalBharati_N_Ship.pdf

HackbrettXXX commented 4 years ago

@kakugiki the new font converter has an option for the module format. If you choose "ES", you should either import it from another JS module/file or include it with type="module" (requires some build tool or server to resolve "jspdf").

I don't think it is fixed. The "arcs" at the top are at the wrong place.

kakugiki commented 4 years ago

@HackbrettXXX my bad, I should have chosen UMD. Yeah, I was tricked by the computer. When I search for the result online directly, or copy it and then search for it, the text is somehow changed automatically and matches perfectly. [image]

gologames commented 3 years ago

Hello,

May we hope that this will be fixed?

Thanks, Alex SurveyJS Team

HackbrettXXX commented 3 years ago

@gologames I personally don't have the time to look into this. So this will only be fixed if someone from the community provides a PR.

bulanni00 commented 3 years ago

The problem still exists and has not been solved. What should I do?

HackbrettXXX commented 3 years ago

@bulanni00 A pull request is very welcome :)

ghost commented 2 years ago

@Shadowsusanoo I am using jsPDF version 1.5.3 and I am facing the same issue when using a Devanagari font. Has it been fixed yet?

lalittolani commented 1 year ago

I am using jsPDF version 2.5.1 and I am facing the same issue when using a Devanagari font. Has it been fixed yet?

CATALYST1109 commented 1 year ago

I came across this issue too for the Hindi language/Devanagari script and have created a fiddle to demonstrate it. The HTML page shows the correct way to display the text, while the PDF shows the incorrect one. Please ignore the notosans binary font variable; I'm quite new to JS and JSFiddle and couldn't find a way to add another JS file with just the font variable.

From what I could gather reading about it online, the issue seems to centre on 'ligatures', i.e. compound glyphs: for certain glyph sequences, the sequence of glyphs (characters in that script) needs to be substituted with a different glyph (and sometimes reordered), based on the rules of the script. The best-known example is the fi ligature in Latin scripts. Some scripts, such as Arabic, Hindi/Devanagari, and Bangla, have many more of these than Latin. There seem to be two important tables in a font file that could be used to fix this:

  1. GSUB -- this table encodes the rules of substitution for different scripts
  2. GPOS -- this table relates to the positioning of glyphs

The font used in the fiddle, Noto Sans, supports the Devanagari ligatures and substitutions (it works fine in Google Docs), so I think jsPDF might have to run through the glyphs once and replace and reorder them as specified in those tables. Looking through the library code, I could see that someone created a manual lookup dictionary for Arabic script by hardcoding the substitutions (src/modules/arabic.js), but that naturally only works for that one script. The universal fix would be to use the font tables mentioned above.
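The substitution step described above can be sketched roughly as follows. This is a toy illustration only: the rule table is hand-written with hypothetical glyph names, whereas a real implementation would read the lookups from the font's GSUB table, and none of this is part of jsPDF.

```javascript
// Toy ligature substitution over a sequence of glyph names.
// rules: array of { match: string[], replace: string } (hypothetical format).
function applyLigatures(glyphs, rules) {
  // Try longer sequences first so the most specific ligature wins.
  const sorted = rules
    .filter((r) => r.match.length > 0)
    .sort((a, b) => b.match.length - a.match.length);
  const out = [];
  let i = 0;
  while (i < glyphs.length) {
    const rule = sorted.find((r) => r.match.every((g, k) => glyphs[i + k] === g));
    if (rule) {
      out.push(rule.replace); // emit the single ligature glyph
      i += rule.match.length; // and skip the glyphs it replaced
    } else {
      out.push(glyphs[i]);
      i += 1;
    }
  }
  return out;
}

// Hypothetical rules: Latin "fi" and the Devanagari conjunct ka + virama + ssa.
const rules = [
  { match: ["f", "i"], replace: "f_i" },
  { match: ["ka", "virama", "ssa"], replace: "k_ssa" },
];
const shaped = applyLigatures(["pa", "ra", "ii", "ka", "virama", "ssa", "aa"], rules);
```

Real GSUB lookups are considerably richer (contextual and class-based rules, per-script features), so this only shows the basic idea of replacing a glyph sequence with one glyph.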

One very good resource I found for understanding the issue is this article; while its scope is broader than this issue, it covers the reordering of characters, their positioning, and substitutions.

Hope this helps someone who is adept at JS to fix this, if they feel so inclined. In the meantime, I'll try to work on a fix and, if it works (a very big if), an eventual PR. Cheers!

DeepjyotiDeb commented 1 year ago

Hello, I have been facing the same issue with Hindi as well. I would like to know whether you need help, or whether you were able to come up with anything. Thanks!

CATALYST1109 commented 1 year ago

I did some further digging (take everything that follows with a grain of salt). It turns out that the solution is not as simple as reading the GSUB and GPOS tables and somehow modifying the Unicode string or its binary or hex representation, which is what I was hoping for. Rather, the Unicode string remains the same but needs to be rendered correctly by a text shaping engine, which is not happening currently. That is a much lower-level operation, one that libraries usually leave to other libraries such as HarfBuzz. A shaping engine is responsible for converting the Unicode data to the correct glyphs: mapping each Unicode code point to a glyph, checking the glyph sequence, working out which rules apply (some are straightforward, some are class-based, and it gets convoluted), and applying substitutions, position changes, etc., which finally gives the correct output. Some of the substitute glyphs do not even have a Unicode code point. See this.

So far I have no idea how jsPDF renders glyphs or which shaping engine, if any, it uses. Maybe the engine does not support substitutions/ligatures, or maybe it is just an old engine; I don't know at this point. Hopefully one of the contributors or library veterans can shed some light on it.

I was able to parse TTF font files using fonttools and get the substitution rules from the GSUB table, but as of now I don't know how to tell jsPDF to start using those rules, as I don't know how it currently handles character-to-glyph conversion.

I might very well be wrong here, but it seems to me that the Arabic plugin I mentioned previously uses Unicode substitutions, which I'm not sure will be possible for Devanagari, since some diacritics don't have Unicode code points: 'rephdeva' and 'ocandranuktadeva' are some of the glyphs for which I could not find a code point in the cmap table. I'll update this thread if I get anything new.
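One concrete piece of what a shaping engine does for Devanagari is reordering: the vowel sign ि (U+093F) is stored after its consonant in the Unicode string but is drawn before it. The following is a simplified sketch of that single step, under stated assumptions: `reorderPreBaseMatra` is a hypothetical name, it works on code points rather than glyph IDs, and it only handles the basic consonants U+0915 to U+0939. Real shapers such as HarfBuzz cover many more cases.

```javascript
const VIRAMA = 0x094d;  // DEVANAGARI SIGN VIRAMA
const MATRA_I = 0x093f; // DEVANAGARI VOWEL SIGN I (drawn before its cluster)
const isConsonant = (cp) => cp >= 0x0915 && cp <= 0x0939;

// Move each U+093F in front of the consonant cluster it follows,
// where a cluster is: consonant (virama consonant)*, read backwards.
function reorderPreBaseMatra(codePoints) {
  const out = [];
  for (const cp of codePoints) {
    if (cp !== MATRA_I) {
      out.push(cp);
      continue;
    }
    let start = out.length;
    if (start > 0 && isConsonant(out[start - 1])) {
      start -= 1;
      while (start >= 2 && out[start - 1] === VIRAMA && isConsonant(out[start - 2])) {
        start -= 2;
      }
    }
    out.splice(start, 0, cp); // insert the matra before the cluster
  }
  return out;
}

// "हिंदी" in logical (stored) order: ha, i-matra, anusvara, da, ii-matra
const logical = [0x0939, 0x093f, 0x0902, 0x0926, 0x0940];
const visual = reorderPreBaseMatra(logical);
```

The Unicode string itself is unchanged by shaping; only the glyph order used for drawing differs, which is why fixing the input text cannot work around the problem.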

DeepjyotiDeb commented 1 year ago

@CATALYST1109 I was not sure how to proceed from here, so I started checking out other libraries as well. It turns out a few of them have the exact same issue:

  1. @react-pdf/renderer -- probably one of the only libraries that printed the Hindi text correctly
  2. PDFKit -- I couldn't get it to run client-side in a Vite/React project, but it is what @react-pdf/renderer uses, so maybe they have a better engine?
  3. pdfmake -- characters were printed correctly
  4. pdf-lib -- same issue that we see here in jsPDF
  5. pdfme -- characters printed correctly, but the positioning was quite messed up

Not sure if this helps, but that's what I have found so far. I will watch for updates in this thread.

GopalKdwivedi commented 1 year ago

Hi guys, I downloaded the Hindi font from Google, saved it into the assets folder, and was able to put my language into the PDF. This worked for me. TiroDevanagariHindi-Regular.ttf is saved in the assets folder.

```
// Assumes jsPDF and autoTable (from jspdf-autotable) are already loaded,
// and that `table` is the HTML table to export.
var doc = new jsPDF('l', 'mm', 'a4');
doc.addFileToVFS('NotoSansDevanagari-Regular.ttf', './assets/fonts/TiroDevanagariHindi-Regular.ttf');
doc.addFont('./assets/fonts/TiroDevanagariHindi-Regular.ttf', 'NotoSansDevanagari', 'normal');
// Set the font for Hindi text
doc.setFont('NotoSansDevanagari');

doc.text(" आइए पढ़ते हैं हमारी दिल को छू लेने वाली हिंदी ", 100, 10);
doc.setFontSize(11);
doc.setTextColor(100);

autoTable(doc, {
  html: table,
  showFoot: 'lastPage',
  styles: {
    font: 'NotoSansDevanagari',
    fontStyle: 'normal',
  },
});
doc.save("Report.pdf");
```

zdettwiler commented 1 year ago

Hi!

I believe I'm having the same issue with Hebrew. The dots and lines above, below, and inside consonants (vowels, accents, etc.) all appear shifted to the left. I've tried different fonts: it's always the same problem, to varying degrees.

I'm working with the latest jsPDF version. It works fine in the browser with the same font (here Times New Roman).

@GopalKdwivedi, how did you manage to make it work? Is it the autoTable?

Maybe this helps to find a solution...
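Shifted marks like these are what one would expect if marks are drawn at their default positions instead of being attached through the font's GPOS mark-to-base anchors, though that is an assumption about the cause and is not confirmed in this thread. A minimal sketch of the attachment arithmetic, with a hypothetical function name and made-up anchor coordinates in font units:

```javascript
// Mark-to-base attachment: place the mark so that its anchor point
// coincides with the matching anchor point on the base glyph.
// baseOrigin: where the base glyph is drawn
// baseAnchor: anchor on the base glyph, relative to the base origin
// markAnchor: anchor on the mark glyph, relative to the mark origin
function attachMark(baseOrigin, baseAnchor, markAnchor) {
  return {
    x: baseOrigin.x + baseAnchor.x - markAnchor.x,
    y: baseOrigin.y + baseAnchor.y - markAnchor.y,
  };
}

// Hypothetical values for illustration only, not taken from any real font.
const markOrigin = attachMark(
  { x: 1000, y: 0 },   // base glyph drawn at x = 1000
  { x: 300, y: -50 },  // "below" anchor on the base consonant
  { x: 120, y: -20 }   // matching anchor on the vowel point
);
// markOrigin is { x: 1180, y: -30 }
```

Without this step a renderer would have to fall back on the mark's own advance and offsets, which can leave it displaced relative to the consonant.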

DeepjyotiDeb commented 1 year ago

It works fine for simpler text, but it fails if the string contains any words with complex ligatures. An easy way to test this is to print the word "परीक्षा" and check whether it is rendered correctly.
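As a rough aid for picking test strings, the sketch below lists the code points of a word and flags Devanagari text that contains a virama (U+094D, used to form conjuncts) or the pre-base vowel sign U+093F, two features that depend on shaping. The helper names are hypothetical and this is only a heuristic, not a jsPDF API.

```javascript
// List the code points of a string as U+XXXX labels.
function toCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Heuristic: does the text contain a virama or the pre-base vowel sign I?
function hasConjunctOrPreBaseMatra(text) {
  return /[\u094D\u093F]/.test(text);
}

// The test word "परीक्षा", written with escapes to make the sequence explicit.
const word = "\u092A\u0930\u0940\u0915\u094D\u0937\u093E";
const labels = toCodePoints(word);
// labels: U+092A U+0930 U+0940 U+0915 U+094D U+0937 U+093E
const needsCare = hasConjunctOrPreBaseMatra(word); // true: ka + virama + ssa forms a conjunct
```

Text that passes this check can still render incorrectly, since other marks also rely on positioning data in the font, so a visual comparison with the browser output remains the real test.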

azmain commented 1 year ago

I am facing the same issue with Bangla Unicode joint letters. How can I solve this problem? [image]