allcolor / YaHP-Converter

YaHP is a Java library that allows you to convert an HTML document into a PDF document.
GNU Lesser General Public License v2.1

Japanese can't be converted completely in the pdf #45

Closed milkdeliver closed 8 years ago

milkdeliver commented 8 years ago

Hi allcolor,

Japanese text is not converted completely in the PDF. Part of it is cut off; the cut-off part is out of range.

Do you have any ideas? I would really appreciate your help.

Regards, SS

allcolor commented 8 years ago

Hi,

You have to use a Unicode font that has all the glyphs you want, and embed it.

To do that, first set the property IHtmlToPdfTransformer.FOP_TTF_FONT_PATH to point to a directory containing the desired TTF font files.

Then, in your HTML, set the "font-family" style to the font family name, for example (for Arial Unicode MS):

<span style="font-family: Arial Unicode MS">text</span>

Note that the family name you use must be the exact name reported by the JVM for the TTF file. To find the exact name, load the font in Java and print its family name:

InputStream is = new FileInputStream(new File("/path/to/ttf/font.ttf"));
Font font = Font.createFont(Font.TRUETYPE_FONT, is);
System.out.println(font.getFamily());

=> https://docs.oracle.com/javase/7/docs/api/java/awt/Font.html#getFamily()
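If the font directory holds several TTF files, the same check can be run over all of them at once. The following is only a sketch: the class name `FontFamilyLister` and the method `listFamilies` are made up for illustration and are not part of YaHP; only `java.awt.Font.createFont` and `getFamily` come from the JDK.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of YaHP): maps each .ttf file in a
// directory to the family name the JVM reports for it.
final class FontFamilyLister {
    private FontFamilyLister() {}

    static Map<String, String> listFamilies(File dir) throws IOException {
        File[] files = dir.listFiles(
                (d, name) -> name.toLowerCase(Locale.ROOT).endsWith(".ttf"));
        if (files == null) {
            throw new IOException("Not a readable directory: " + dir);
        }
        Map<String, String> families = new TreeMap<>();
        for (File file : files) {
            try {
                Font font = Font.createFont(Font.TRUETYPE_FONT, file);
                families.put(file.getName(), font.getFamily());
            } catch (FontFormatException e) {
                // Keep going: one broken file should not hide the others.
                families.put(file.getName(), "(not a valid TrueType font)");
            }
        }
        return families;
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        listFamilies(dir).forEach(
                (file, family) -> System.out.println(file + " -> " + family));
    }
}
```

Run it with the font directory as the only argument, then copy the printed family name into the "font-family" style unchanged.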

Regards, Quentin

All those moments will be lost in time, like tears in rain. (Roy Batty/Rutger Hauer)

milkdeliver commented 8 years ago

Hi,

The font has already been embedded in the code, and the Japanese text is already converted. That is not my problem.

The problem is that the paragraph is cut off and some words are lost in the PDF.

<span style="font-family: Arial Unicode MS">
   監視およびアラートインフラストラクチャのデータコレクタにより返されるデータの制限はおよびに関しては行うことができません。これらのパラメータを入力パラメータとして定義した場合、入力はデータコレクタで使用されません。
</span>

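Complete paragraph: https://cloud.githubusercontent.com/assets/3108407/18122363/90caa2dc-6f9b-11e6-95b4-8019bdff20e2.png

Cut-off paragraph: https://cloud.githubusercontent.com/assets/3108407/18122343/777de0a0-6f9b-11e6-83ab-870f2c23b963.png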

Do you have any ideas?

Regards, SS

allcolor commented 8 years ago

Then you have to either break the lines yourself using HTML br tags, use a smaller font size, or ensure there is spacing between characters so that the renderer can break a line. The renderer won't break in the middle of a "word" (a word being a sequence of characters without spacing, even though unfortunately that's not how Japanese works).

Regards, Quentin

milkdeliver commented 8 years ago

Is it possible to let the renderer break lines automatically? It's not one line but a whole article, so I don't know where to add <br/>.

Regards, SS

allcolor commented 8 years ago

Hi, unfortunately no, and I haven't worked on the converter for several years; I'm just answering support questions, if any. As a workaround, you could try to pre-parse the HTML to add carriage returns or spaces between characters: look at how many characters fit on a line, count the characters, and when you reach that limit, insert a space character. The renderer should then be able to break the line.

I'm sorry not to have a better solution.
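The pre-parsing workaround described above could look roughly like this. It is a minimal sketch, not code from YaHP: the class `CjkLineBreaker`, the method `insertBreakOpportunities`, and the `maxRun` parameter are names invented here. It assumes that everything between '<' and '>' is markup, so it does not cope with '>' inside attribute values, and it counts an entity such as `&amp;` as several characters. It also breaks long Latin words, not only Japanese text.

```java
// Hypothetical helper (not part of YaHP): inserts a space after every
// maxRun consecutive non-whitespace text characters so that a renderer
// which only breaks lines at whitespace gets a place to break.
final class CjkLineBreaker {
    private CjkLineBreaker() {}

    static String insertBreakOpportunities(String html, int maxRun) {
        if (maxRun < 1) {
            throw new IllegalArgumentException("maxRun must be >= 1");
        }
        StringBuilder out = new StringBuilder(html.length() + html.length() / maxRun);
        boolean inTag = false; // true while between '<' and '>'
        int run = 0;           // text characters since the last whitespace
        for (int i = 0; i < html.length(); ) {
            int cp = html.codePointAt(i); // code points keep surrogate pairs intact
            i += Character.charCount(cp);
            if (inTag) {
                out.appendCodePoint(cp);
                if (cp == '>') {
                    inTag = false;
                }
            } else if (cp == '<') {
                inTag = true;
                out.appendCodePoint(cp);
            } else if (Character.isWhitespace(cp)) {
                run = 0; // existing whitespace is already a break opportunity
                out.appendCodePoint(cp);
            } else {
                if (run == maxRun) {
                    out.append(' ');
                    run = 0;
                }
                out.appendCodePoint(cp);
                run++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<span style=\"font-family: Arial Unicode MS\">監視およびアラート</span>";
        System.out.println(insertBreakOpportunities(html, 4));
    }
}
```

The value of `maxRun` has to be found by trial: render a page, count how many characters fit on one line at the chosen font size, and use that number or a slightly smaller one.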

Regards, Quentin

milkdeliver commented 8 years ago

Hi Quentin,

One is better than none. Thanks a lot for your help. I'll let you know if I find another solution.

Regards, SS