TomRoush / PdfBox-Android

The Apache PdfBox project ported to work on Android
Apache License 2.0

How can I write Japanese characters? #66

Open Phyambre opened 8 years ago

Phyambre commented 8 years ago

Hi,

I am trying to write Japanese characters, but all the fonts I have tried seem to have some problem. My code looks like this:

content.beginText();
InputStream fontInputStream = context.getAssets().open("ArialMT.ttf");
content.setFont(PDType0Font.load(doc, fontInputStream), 10);
content.newLineAtOffset(CALCULATIONS_HERE_TO_GET_THE_COORDINATES);
content.showText(TEXT_HERE);
content.endText();

Now, depending on the font, the error is different. For example, with fonts like DroidSansJapanese I get:

java.io.IOException: The TrueType font does not contain a Unicode cmap
org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmap(TrueTypeFont.java:455)
org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmap(TrueTypeFont.java:414)
org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.<init>(TrueTypeEmbedder.java:60)
org.apache.pdfbox.pdmodel.font.PDCIDFontType2Embedder.<init>(PDCIDFontType2Embedder.java:45)
org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:100)
org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:55)
rlopezga.music.kunkunshieditor.model.facade.action.ValidateFileNameAndExportKunkunshiToPDFAction.generatePDF(ValidateFileNameAndExportKunkunshiToPDFAction.java:256)
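
One way to check a font before shipping it is to look at its `cmap` table directly. The sketch below walks the table directory of an sfnt font and reports whether any `cmap` encoding record uses the Unicode platform (platform ID 0) or the Windows platform with a Unicode encoding (platform ID 3, encoding ID 1 or 10). `CmapInspector` and its methods are hypothetical names and not part of PdfBox-Android, and the exact set of subtables the library accepts may differ, so treat this as a rough diagnostic rather than a reproduction of its check.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical diagnostic helper; not part of PdfBox-Android. */
final class CmapInspector {

    /** Returns the file offset of the table with the given 4-character tag, or -1 if absent. */
    static int findTable(ByteBuffer font, String tag) {
        int numTables = font.getShort(4) & 0xFFFF;   // table count follows the 4-byte sfnt version
        for (int i = 0; i < numTables; i++) {
            int record = 12 + 16 * i;                // each table record is 16 bytes
            byte[] name = new byte[4];
            for (int j = 0; j < 4; j++) {
                name[j] = font.get(record + j);
            }
            if (tag.equals(new String(name, StandardCharsets.US_ASCII))) {
                return font.getInt(record + 8);      // skip tag and checksum to reach the offset
            }
        }
        return -1;
    }

    /** True if the cmap table lists at least one Unicode encoding record. */
    static boolean hasUnicodeCmap(byte[] fontBytes) {
        ByteBuffer font = ByteBuffer.wrap(fontBytes); // ByteBuffer is big-endian by default, like sfnt
        int cmap = findTable(font, "cmap");
        if (cmap < 0) {
            return false;
        }
        int records = font.getShort(cmap + 2) & 0xFFFF;
        for (int i = 0; i < records; i++) {
            int rec = cmap + 4 + 8 * i;              // each encoding record is 8 bytes
            int platform = font.getShort(rec) & 0xFFFF;
            int encoding = font.getShort(rec + 2) & 0xFFFF;
            boolean unicodePlatform = platform == 0;
            boolean windowsUnicode = platform == 3 && (encoding == 1 || encoding == 10);
            if (unicodePlatform || windowsUnicode) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: CmapInspector <font file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("Unicode cmap present: " + hasUnicodeCmap(data));
    }
}
```

If this reports no Unicode record for a font, the "does not contain a Unicode cmap" error is at least consistent with the font file itself rather than with how it was loaded.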

For other fonts such as HeiseiMin-W3-Acro I get:

java.io.IOException: loca is mandatory
org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:191)
org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:135)
org.apache.fontbox.ttf.TTFParser.parseEmbedded(TTFParser.java:109)
org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.buildFontFile2(TrueTypeEmbedder.java:74)
org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.<init>(TrueTypeEmbedder.java:56)
org.apache.pdfbox.pdmodel.font.PDCIDFontType2Embedder.<init>(PDCIDFontType2Embedder.java:45)
org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:100)
org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:55)
rlopezga.music.kunkunshieditor.model.facade.action.ValidateFileNameAndExportKunkunshiToPDFAction.generatePDF(ValidateFileNameAndExportKunkunshiToPDFAction.java:256)

For non-Japanese fonts such as ArialMT.ttf I get:

java.lang.IllegalArgumentException: No glyph for U+3042 in font BitstreamVeraSans-Roman
org.apache.pdfbox.pdmodel.font.PDCIDFontType2.encode(PDCIDFontType2.java:350)
org.apache.pdfbox.pdmodel.font.PDType0Font.encode(PDType0Font.java:267)
org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:249)
org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:292)
rlopezga.music.kunkunshieditor.model.facade.action.ValidateFileNameAndExportKunkunshiToPDFAction.generatePDF(ValidateFileNameAndExportKunkunshiToPDFAction.java:258)

Finally, if I try a Type 1 font built into your library, such as: content.setFont(PDType1Font.COURIER, 10);

Then I get:

java.lang.IllegalArgumentException: This font type only supports 8-bit code points
org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:281)
org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:249)
org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:292)
rlopezga.music.kunkunshieditor.model.facade.action.ValidateFileNameAndExportKunkunshiToPDFAction.generatePDF(ValidateFileNameAndExportKunkunshiToPDFAction.java:258)
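
Both the "No glyph for U+3042" and the "8-bit code points" messages come down to the code points in the text being written. The sketch below lists the code points of a string in the same `U+XXXX` form used by the exception, and checks whether every code point fits in a single byte. `CodePointCheck` is a hypothetical helper, and the single-byte test is only an approximation of what a simple (non-composite) font can encode, since the library's real check depends on the font's encoding.

```java
/** Hypothetical helper for inspecting the text passed to showText; not part of PdfBox-Android. */
final class CodePointCheck {

    /** Formats each code point as U+XXXX, separated by spaces. */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);   // step over surrogate pairs correctly
        }
        return sb.toString();
    }

    /** True if every code point is at most 0xFF, i.e. could fit a one-byte code. */
    static boolean fitsInSingleByte(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0xFF) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // U+3042 is HIRAGANA LETTER A, the character named in the trace above.
        String sample = "\u3042";
        System.out.println(describe(sample));
        System.out.println(fitsInSingleByte(sample));
    }
}
```

Any Japanese text fails the single-byte check, which is why a composite font loaded through PDType0Font is the route being attempted in the first place.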

In general, it seems that all OTF fonts give the "loca is mandatory" error.
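
A likely explanation is that many .otf files store CFF outlines, which have no `glyf` or `loca` tables, while the TTFParser frames in the trace show the font being parsed as TrueType. The sketch below classifies a font by the 4-byte sfnt version tag at the start of the file. `SfntFlavor` is a hypothetical name, and the file extension alone does not guarantee either flavor, so this is only a quick check under that assumption.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that classifies font data by its sfnt version tag. */
final class SfntFlavor {

    static String describe(byte[] fontBytes) {
        if (fontBytes.length < 4) {
            return "not a font";
        }
        int tag = ByteBuffer.wrap(fontBytes).getInt(0);   // big-endian read of the first 4 bytes
        switch (tag) {
            case 0x00010000:   // version 1.0: TrueType outlines
            case 0x74727565:   // 'true': legacy Apple TrueType
                return "TrueType outlines (glyf/loca expected)";
            case 0x4F54544F:   // 'OTTO': OpenType with CFF outlines
                return "CFF outlines (no glyf/loca)";
            case 0x74746366:   // 'ttcf': font collection
                return "font collection";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new byte[] {'O', 'T', 'T', 'O'}));
    }
}
```

A font that reports CFF outlines here would be expected to hit the "loca is mandatory" error with the loading path shown in the traces.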

I found some fonts that can do the job, such as TakaoPGothic, VL-Gothic-Regular or MS-Mincho, but all of them are quite heavy (around 5 MB), so I would like to find an alternative. In addition, I think MS-Mincho is a proprietary font, so it cannot be used without Microsoft's permission. I would like to use DroidSansJapanese, which is around 1.3 MB, but I can't understand what the problem is.

By the way, to load large fonts such as TakaoPGothic or DroidSansJapanese, I am using the trick of changing the asset's extension from ttf to mp3, as mentioned here: http://stackoverflow.com/questions/7503133/japanese-characters-looking-like-chinese-on-android. I don't think that is the problem, though, since some fonts work. So, what can I do to use a smaller font? Please help.

TomRoush commented 8 years ago

There have been problems with non-Latin characters for a while, but I'll look into it. Do you have an example string that you're having trouble writing that I can use for testing? Thanks.

Phyambre commented 8 years ago

Hi,

For the fonts I described above, any string with Japanese characters fails, so I think the problem is in the fonts' metadata or something similar. You can take this as an example: 私はたろうです。

As I said, a couple of very heavy fonts such as TakaoPGothic can do the job, but they are big and I am afraid of an OutOfMemoryError. At the moment I am writing a kind of music editor, and with your framework I can only export the music file to PDF while the file is closed. If it is open and the editor is showing many visual components, I get an OutOfMemoryError the second or third time I export the file to PDF.
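
To get a rough idea of whether memory is the limiting factor, the heap headroom can be measured just before the export. The sketch below uses only `java.lang.Runtime`; `HeapHeadroom`, `canLoad` and the safety factor are hypothetical, and the assumption that a parsed font needs some multiple of its file size is a guess, not a measured property of the library.

```java
/** Hypothetical helper for estimating free heap before loading a large font. */
final class HeapHeadroom {

    /** Bytes the heap can still grow by, based on the runtime's own counters. */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /**
     * True if fontBytes times safetyFactor is below the current headroom.
     * The safety factor is an assumed allowance for parsed tables and copies.
     */
    static boolean canLoad(long fontBytes, int safetyFactor) {
        return fontBytes * safetyFactor < availableBytes();
    }

    public static void main(String[] args) {
        long fiveMegabytes = 5L * 1024 * 1024;   // roughly the size of the larger fonts mentioned
        System.out.println("headroom: " + availableBytes() + " bytes");
        System.out.println("room for a 5 MB font (x4): " + canLoad(fiveMegabytes, 4));
    }
}
```

Logging this value before each export would show whether the headroom shrinks from one export to the next, which would point at something being retained between runs rather than at the font size alone.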

v8sagar commented 5 years ago

@TomRoush, have you solved this issue? I am facing the same problem.