parallax / jsPDF

Client-side JavaScript PDF generation for everyone.
https://parall.ax/products/jspdf
MIT License
29.41k stars 4.68k forks

Utf8 support #12

Closed shkuropat closed 6 years ago

shkuropat commented 12 years ago

Is it possible to use non-Latin (e.g. Cyrillic) characters?

dvdotsenko commented 12 years ago

As of June 2012, jsPDF still does not support the Unicode encoding that the PDF format allows (UCS-2 BE / UTF-16 BE). This means that almost all characters outside of ASCII are pretty much broken. There are plans to add support for UCS-2 streams to jsPDF, maybe sometime this summer (2012). It's a lot of work.
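
For readers unfamiliar with the encoding named above, here is a minimal sketch of what UTF-16 BE looks like. It is not jsPDF code, and `toUtf16BEHex` is a hypothetical helper name. PDF text strings (for example in document metadata) may be written as UTF-16 BE with a leading FE FF byte-order mark. Producing these bytes does not by itself make glyphs render on a page, since that also depends on the font.

```javascript
// Hypothetical helper, not part of jsPDF: encode a JavaScript string
// as UTF-16 BE hex digits, prefixed with the FE FF byte-order mark.
function toUtf16BEHex(text) {
  var hex = 'FEFF'; // byte-order mark for big-endian UTF-16
  for (var i = 0; i < text.length; i++) {
    // JavaScript strings are already sequences of UTF-16 code units,
    // so each unit becomes four hex digits, high byte first.
    hex += text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
  }
  return hex;
}

// Example: the Cyrillic letter "Ж" is U+0416.
console.log(toUtf16BEHex('\u0416')); // prints "FEFF0416"
```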

dvdotsenko commented 11 years ago

Nowhere at this time. It's still too much work. Unicode will not be here until code is added to support embedding (and subsetting) Unicode fonts within the PDF. That alone is a gargantuan task and a logistics issue, since it involves pulling binary blobs into the browser and it inflates the size of the PDF.

Unless someone with the skills really needs this, or some party is willing to sponsor this move, it's unlikely to happen naturally.

sguzgon commented 11 years ago

Ok. It would be good to have support for UTF-8. I'll have to find some alternative. Thanks!

bluebluesky commented 11 years ago

Dear Dvdotsenko: How about UTF-8 support now? I also failed to show Chinese with jsPDF. jsPDF is such a good tool, so it's such a pity. Could you give me some suggestions on how to solve this issue?

Thanks in advance, Jessie

zeligmanos commented 11 years ago

Any news about Unicode support? jsPDF is a really fantastic tool, but without Unicode support it's extremely limited...

Mulegoat commented 11 years ago

@dvdotsenko when you say 'almost' all characters, would that include UL list item markers like a disc? This seems like the most basic character requirement, and I cannot get it with ASCII encoding. All bullets get stripped out of the PDF when using the fromHTML method. Is there any kind of workaround to include this character?

kingdom91 commented 11 years ago

What about Cp1250? Is there any chance of supporting it?

deedarb commented 10 years ago

Unfortunately, I cannot use this in production without UTF-8 support.

PrakashInn commented 10 years ago

When could it be supported? UTF-8 support is desperately needed.

albinotonnina commented 10 years ago

I was so sure about UTF-8 support... I didn't check and I started using the library. Damn.

defel commented 10 years ago

Strange, I had UTF-8 support in my prototype, but it's gone as soon as I use it in production...

Is there any plugin or lib I forgot to include?

MrRio commented 10 years ago

There's a fair few characters that appear to work. Full Unicode support isn't in there though. It's something that we're working on.

defel commented 10 years ago

Ok, I forgot to include the file jspdf.plugin.standard_fonts_metrics.js .. everything works again as in my prototype

legikaloz commented 10 years ago

@defel Can you post a code snippet showing how you create documents with UTF-8 support?

luciash commented 10 years ago

@defel I wonder how you got UTF-8 working by including jspdf.plugin.standard_fonts_metrics.js. It does not work for me with "ěščřžýáíé" (the result is the same as without that plugin: "šYžýáíé").

defel commented 10 years ago

@luciash as @MrRio wrote, there is no such thing as UTF-8 support in jsPDF. But some basic Unicode characters work, like ü or ².

dave-watts commented 10 years ago

Ok, so no UTF-8 support, but please mention this on the jsPDF site, so we don't waste our time developing on something that is not going to work!

D3CK3R commented 10 years ago

Hey guys ... so as a result of the discussion above ... is there any chance that this will be implemented somehow and when could this happen?

Pomax commented 10 years ago

I think one can safely assume this is never going to happen, and finding an alternative library is the best course of action. The maintainers clearly simply don't care about it (which is usually an indication that the project is no longer used by those maintaining it, and that it's time to fork it if you do need to use it and need its functionality extended).

defel commented 10 years ago

@Pomax do it, fork it, implement UTF-8 support - everyone here would be happy to see this feature .. maybe you can also submit a pull-request then?

Pomax commented 10 years ago

@defel if I had time I would. I'm just posting here on behalf of someone who wanted to use this project and came away from it angry. (I'm involved in too many projects as it is to take on a fork of a project I don't personally use either. I'd be supporting it just as much as it currently is, and no one wants that.)

D3CK3R commented 10 years ago

@luciash have you tested pdfmake with kanji, for example, and a Unicode font?

luciash commented 10 years ago

@D3CK3R nope, i tested just my own language... you can test it. i declare it your homework ;-)

Drtikso commented 10 years ago

Workaround:
- canvas context drawing supports Unicode
- ctx.fillText("Hello WorldDDDDüß!",10,50); works
- use canvas2image (a very small JS library)
- draw the image to the PDF

You would be able to put it all in a simple function and voila.

It's a win if you only plan to use the PDF file for printing. As for viewing the PDF file, you lose the option of selecting the text (copy, paste, etc.).

diegocr commented 10 years ago

@bluebluby If you are fine with that approach, you don't need any third-party library such as canvas2image; just pass the canvas element to the addImage() function. Or, if you're dealing with HTML, use the addHTML plugin instead of fromHTML (check out the live demo with Chinese characters there).
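
The canvas approach described above could be sketched roughly as follows. This is an illustration under assumptions, not code from the jsPDF project: `drawTextAsImage` is a hypothetical helper name, `doc` is assumed to be a jsPDF instance, and `canvas` an HTML canvas element. For simplicity, `width` and `height` are used both as the canvas drawing size and as the size placed in the document. The `scale` parameter renders more pixels than the placed size, which may reduce fuzziness when the PDF is zoomed or printed.

```javascript
// Sketch only: `doc` is assumed to be a jsPDF instance and `canvas`
// an HTML canvas element. drawTextAsImage is a hypothetical helper,
// not a jsPDF API.
function drawTextAsImage(doc, canvas, text, x, y, width, height, scale) {
  scale = scale || 1;

  // The backing store is `scale` times larger than the placed size.
  canvas.width = Math.round(width * scale);
  canvas.height = Math.round(height * scale);

  var ctx = canvas.getContext('2d');
  ctx.scale(scale, scale); // keep drawing in unscaled units

  // Opaque background; transparent pixels may otherwise render dark.
  ctx.fillStyle = '#fff';
  ctx.fillRect(0, 0, width, height);

  ctx.fillStyle = '#000';
  ctx.font = '16px sans-serif';
  ctx.textBaseline = 'top';
  ctx.fillText(text, 0, 0); // the browser handles the Unicode text here

  // Pass the canvas element itself to addImage(), as suggested above.
  doc.addImage(canvas, 'PNG', x, y, width, height);
  return doc;
}
```

In a browser this might be called as `drawTextAsImage(pdf, document.createElement('canvas'), 'Hello üß', 10, 50, 200, 30, 2)`. The text ends up as an image, so it cannot be selected or copied in the resulting PDF.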

dave-watts commented 10 years ago

I tried to do this (modified the DOM in the browser console to add an id to text)

    <p id="chinese">十五向學,三十而立,四十而不惑,五十而知天命,六十而耳順,七十而從心欲,不踰矩.</p>

    var pdf = new jsPDF('p','pt','a4');

    pdf.addHTML(document.getElementById('chinese'), function() {
        var string = pdf.output('datauristring');
        $('.preview-pane').attr('src', string);
    });

It didn't work: for some reason the text is dark gray on a black background!?

This could be a good solution in theory, as you can use JavaScript to add and remove elements in the DOM.

D3CK3R commented 10 years ago

Thanks for the suggestion ... i thought that we already tried this. The problem was that the text was unsharp when rendered with canvas. Any ideas how to fix this?

sharkhat commented 10 years ago

I have been working with jsPDF for a few weeks now. It supports very basic 'special' characters, such as 'µ', 'º' and a few others, but characters like 'σ' and 'α' get rewritten to odd characters. When I copy them from the document to here, they show up as the correct characters, but in the PDF they appear as some odd replacement. When this replacement occurs, Firefox shows a square with 4 dots inside between each letter of the entire line, and Chrome puts a space between each letter. The only workaround I can think of is to look at the string before it is passed to jsPDF and convert that character to something it understands.

http://imgur.com/yWb5sQ9

Here is an image of what Firefox spits out. If I knew the Unicode code point for that thing, I could strip it out.
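
The pre-processing idea described above (inspect the string before it reaches jsPDF and substitute characters it cannot handle) might look something like this. It is a sketch under assumptions: `sanitizeForPdf` is a hypothetical name, the fallback table is supplied by the caller, and the 0xFF cutoff is taken from reports in this thread rather than from any documented jsPDF limit.

```javascript
// Hypothetical pre-processing step, not a jsPDF API. The 0xFF cutoff
// is an assumption based on this thread, not a documented limit.
function sanitizeForPdf(text, fallbacks, placeholder) {
  placeholder = placeholder === undefined ? '?' : placeholder;
  var out = '';
  var replaced = [];

  // Array.from iterates by code point, so surrogate pairs stay together.
  Array.from(text).forEach(function (ch) {
    if (ch.codePointAt(0) <= 0xFF) {
      out += ch; // assumed safe: passes through unchanged
    } else {
      replaced.push(ch);
      out += Object.prototype.hasOwnProperty.call(fallbacks, ch)
        ? fallbacks[ch]
        : placeholder;
    }
  });

  // `replaced` lists every character that was substituted.
  return { text: out, replaced: replaced };
}

// Example: sigma (U+03C3) is replaced, micro (U+00B5) is kept.
var result = sanitizeForPdf('\u03C3 = 2\u00B5', { '\u03C3': 'sigma' });
console.log(result.text); // prints "sigma = 2µ"
```

Returning the list of replaced characters makes it possible to log which inputs were altered, instead of writing a separate case for every character in advance.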

diegocr commented 10 years ago

To me it looks good enough:

[image]

In any case, if there's some issue while using html2canvas we can also use rasterizeHTML instead, check the addHTML source for further details and choose whatever works best for you all.

@sharkhat fwiw, see #320

sharkhat commented 10 years ago

@diegocr Thanks. I was unable to strip out the spaces. I have been able to look for certain chars that cause the error (σ, ⁺, ², etc.), but it is pretty unfeasible to make a case for every char that may be used. I am now attempting to copy the new addHTML. Is it necessary to use an iframe? I believe that is what 'preview-pane' is. I am having trouble getting output like the demo.

dave-watts commented 10 years ago

@diegocr if you render the entire HTML page to PDF using addHTML it works OK, but try it with a single element, like in my example (this is what's needed to facilitate UTF-8 support), and for some reason the text is on a dark background.

diegocr commented 10 years ago

@sharkhat This might help: https://github.com/ashtuchkin/iconv-lite

As for the iframe thing, no, it isn't needed - that's used in our demo page to show these live examples.

@dave-watts Have you tried applying some CSS to the element(s)? for example:

<div style="background:#fff;color:#000">三十而立,四十而</div>

Other than that, check the html2canvas help since it allows passing a background option iirc.

dave-watts commented 10 years ago

@diegocr just to let people know, doing the style modification solves the problem
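
The style fix confirmed above could also be applied from script before calling addHTML. This is a minimal sketch: `addElementWithOpaqueStyle` is a hypothetical helper name, `pdf` is assumed to be a jsPDF instance with the addHTML plugin loaded, and the `addHTML(element, callback)` call shape is the one used earlier in this thread.

```javascript
// Sketch only: `pdf` is assumed to be a jsPDF instance with the
// addHTML plugin loaded. The helper name is hypothetical.
function addElementWithOpaqueStyle(pdf, element, done) {
  // Explicit colors, mirroring the inline-style fix above.
  element.style.background = '#fff';
  element.style.color = '#000';

  // Same call shape as the earlier example in this thread.
  pdf.addHTML(element, function () {
    done(pdf.output('datauristring'));
  });
}
```

Note that this leaves the new colors on the element afterwards; restoring the previous values is left out to keep the sketch short.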

april commented 10 years ago

So, has anybody figured out a good, reasonably simple way to make Unicode text appear properly in a PDF? addHTML (at least for me) has been very finicky and doesn't want to work with iframes. I can render onto a canvas and then render onto the PDF, but the text looks pretty fuzzy and I have to redo all my alignments.

Right now, I have a site: http://www.twoevils.org/html/mtg/decklist/

That more-or-less just takes typed in fields and puts them into a nice PDF, with preview. Is there really no way to pass text into addText(), keeping existing coordinates, while maintaining what was typed in? Or am I just crazy for not being able to figure it out?

vcasadei commented 10 years ago

For all of you having problems with special characters, on the webpage (http://mrrio.github.io/jsPDF/#) there is a download link for the project in which this issue is solved (at least for what I tested with Portuguese and German).

bjoerne2 commented 10 years ago

I also tried to fully support Unicode so that people around the world, with all kinds of characters, could use the PDF export I'm developing. I had a closer look at pdfmake, which supports embedding of (Unicode) fonts. And here is the problem: a font with all Unicode characters, like "Arial Unicode.ttf", has a size of 23 MB! Smaller fonts don't contain all character subsets. I also realized how intelligently text processing software handles missing characters in fonts. When you use a font which doesn't contain the characters of your text, text processing software automatically uses other fonts for those characters. This intelligence also has to be developed, and I guess that's a lot of work, also for pdfmake.

Pomax commented 10 years ago

That's not actually a real problem for properly built PDF exporters/builders, because you subset the font before you add it to the PDF file. When you compile the PDF, you check which code points are actually used for each font, build a new copy of that font with a much smaller cmap capturing only the used points (doing the same for things like GPOS/GSUB if your PDF builder is excellent), and then you bundle the font with all the unused code points thrown away and the glyph outlines pruned to only those actually needed. You don't include the full font with the PDF; that would be madness =)
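
The first step described above, working out which code points each font actually uses, could be sketched as follows. This covers only the bookkeeping. The subsetting itself (rewriting cmap, pruning glyph outlines, and so on) is not shown. The input shape `{ font, text }` and the name `collectUsedCodePoints` are assumptions made for illustration.

```javascript
// Sketch of the bookkeeping step only; real subsetting is not shown.
// runs: [{ font: 'FontName', text: '...' }, ...] (hypothetical shape)
function collectUsedCodePoints(runs) {
  var used = new Map();

  runs.forEach(function (run) {
    if (!used.has(run.font)) {
      used.set(run.font, new Set());
    }
    var set = used.get(run.font);
    // Array.from iterates by code point, not by UTF-16 code unit.
    Array.from(run.text).forEach(function (ch) {
      set.add(ch.codePointAt(0));
    });
  });

  // Return sorted arrays so the result is easy to inspect.
  var result = {};
  used.forEach(function (set, font) {
    result[font] = Array.from(set).sort(function (a, b) { return a - b; });
  });
  return result;
}
```

A subsetter would then keep only the glyphs reachable from these code points, which is why a document using a handful of characters does not need to carry the whole font.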

bjoerne2 commented 10 years ago

You're right. I had in mind that a 32 MB file has to be loaded over the internet. I use jsPDF in the browser.

dirkooms commented 10 years ago

@vcasadei: are you referring to the addHTML approach or to something else?

I only need French chars in my PDF, but I've been struggling with this for hours. I tried several suggestions:

Does anyone have a solution and precise instructions for this limited use case?

vcasadei commented 10 years ago

@dirkooms I was having these problems with Portuguese, as we use almost the same special characters. I tried several things until I reached the site https://parall.ax/products/jspdf and there I downloaded the jspdf plugin and the special characters worked.

You need to fill a small form to download, but after that, everything works fine. I don't know why or what the problem with the github version is, but this other project worked for me.

dave-watts commented 10 years ago

The European special characters work fine with jspdf, however there are problems with character sets such as Japanese

dirkooms commented 10 years ago

Thanks for stating this clearly. It does indeed work; in my case it went wrong when I was writing the PDF document to the filesystem (using the PhoneGap File API). The writing was fixed by doing the following:

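    // output() returns the PDF as a binary string (one character per byte)
    var pdfOutput = pdfDoc.output();

    // copy each character code into a byte array so the bytes are written unchanged
    var pdfOutput2 = new Uint8Array(new ArrayBuffer(pdfOutput.length));
    for ( var i = 0; i < pdfOutput.length; i++) {
        pdfOutput2[i] = pdfOutput.charCodeAt(i);
    }

    writer.write(pdfOutput2.buffer);

zek commented 10 years ago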

The same goes for Turkish characters like ığşöç. It is really important to support these Turkish characters.

erdogankaya commented 9 years ago

fontmetrics how we can do this code ?? for our fonts 'Helvetica-Oblique': uncompress("{'widths'{k3p2q4mcx1w201n3r201o6o201s1q201t1q201u1q201w2l201x2l201y2l2k1w2l1w202m2n2n3r2o3r2p5t202q6o2r1n2s2l2t2l2u2r2v3u2w1w2x2l2y1w2z1w3k3r3l3r3m3r3n3r3o3r3p3r3q3r3r3r3s3r203t2l203u2l3v1w3w3u3x3u3y3u3z3r4k6p4l4m4m4m4n4s4o4s4p4m4q3x4r4y4s4s4t1w4u3m4v4m4w3r4x5n4y4s4z4y5k4m5l4y5m4s5n4m5o3x5p4s5q4m5r5y5s4m5t4m5u3x5v1w5w1w5x1w5y2z5z3r6k2l6l3r6m3r6n3m6o3r6p3r6q1w6r3r6s3r6t1q6u1q6v3m6w1q6x5n6y3r6z3r7k3r7l3r7m2l7n3m7o1w7p3r7q3m7r4s7s3m7t3m7u3m7v2l7w1u7x2l7y3u202l3rcl4mal2lam3ran3rao3rap3rar3ras2lat4tau2pav3raw3uay4taz2lbk2sbl3u'fof'6obo2lbp3rbr1wbs2lbu2obv3rbz3xck4m202k3rcm4mcn4mco4mcp4mcq6ocr4scs4mct4mcu4mcv4mcw1w2m2ncy1wcz1wdl4sdm4ydn4ydo4ydp4ydq4yds4ydt4sdu4sdv4sdw4sdz3xek3rel3rem3ren3reo3rep3req5ter3mes3ret3reu3rev3rew1wex1wey1wez1wfl3rfm3rfn3rfo3rfp3rfq3rfr3ufs3xft3rfu3rfv3rfw3rfz3m203k6o212m6o2dw2l2cq2l3t3r3u1w17s4m19m3r}'kerning'{5q{4wv}cl{4qs5kw5ow5qs17sv5tv}201t{2wu4w1k2yu}201x{2wu4wy2yu}17s{2ktclucmucnu4otcpu4lu4wycoucku}2w{7qs4qz5k1m17sy5ow5qx5rsfsu5ty7tufzu}2x{17sy5ty5oy5qs}2y{7qs4qz5k1m17sy5ow5qx5rsfsu5ty7tufzu}'fof'-6o7p{17sv5tv5ow}ck{4qs5kw5ow5qs17sv5tv}4l{4qs5kw5ow5qs17sv5tv}cm{4qs5kw5ow5qs17sv5tv}cn{4qs5kw5ow5qs17sv5tv}co{4qs5kw5ow5qs17sv5tv}cp{4qs5kw5ow5qs17sv5tv}6l{17sy5ty5ow}do{17st5tt}4z{17st5tt}7s{fst}dm{17st5tt}dn{17st5tt}5o{ckwclwcmwcnwcowcpw4lw4wv}dp{17st5tt}dq{17st5tt}7t{5ow}ds{17st5tt}5t{2ktclucmucnu4otcpu4lu4wycoucku}fu{17sv5tv5ow}6p{17sy5ty5ow5qs}ek{17sy5ty5ow}el{17sy5ty5ow}em{17sy5ty5ow}en{5ty}eo{17sy5ty5ow}ep{17sy5ty5ow}es{17sy5ty5qs}et{17sy5ty5ow5qs}eu{17sy5ty5ow5qs}ev{17sy5ty5ow5qs}6z{17sy5ty5ow5qs}fm{17sy5ty5ow5qs}fn{17sy5ty5ow5qs}fo{17sy5ty5ow5qs}fp{17sy5ty5qs}fq{17sy5ty5ow5qs}7r{5ow}fs{17sy5ty5ow5qs}ft{17sv5tv5ow}7m{5ow}fv{17sv5tv5ow}fw{17sv5tv5ow}}}"

vtecman commented 9 years ago

Hi, so has anyone else actually gotten characters outside the 255 range to work? I am trying to get the character "ē" to show up in the PDF, but it doesn't work. The decimal code for it is 275. Thanks.

JohnnieFucker commented 9 years ago

addHTML works for me with Chinese, but as @D3CK3R said, the text was unsharp.

vtecman commented 9 years ago

Thanks, but addHTML uses canvas, which is not an option for me. I have gone with pdfmake now.

justein commented 9 years ago

Still struggling with this bug. Does anyone have any good ideas?

donatsu commented 9 years ago

Any ideas for bullet points, maybe? I am importing text from an Excel file that contains bullet points, but they do not display correctly in the jsPDF output.

justein commented 9 years ago

In the end I gave up on this solution and implemented it this way: generate the table in the backend with iText, capture the graph using html2png, upload them to the web server, and wrap the graph with the table list, which produces a PDF file.