Hindi characters appear incorrectly even after adding 'Arial Unicode MS' font when converting from HTML to PDF

What steps will reproduce the problem?
1. Take any HTML that contains Hindi characters. I am attaching an input HTML 
file.
2. Convert this HTML to PDF in the usual way, registering the font with the 
renderer.getFontResolver().addFont() method. I tried 'Arial Unicode MS' and 
other Hindi fonts such as Samyak Devanagari and Sarai, but the result is the 
same. I have verified that the fonts are embedded correctly in the PDF, and the 
Hindi content is visible, but the words are rendered incorrectly.

What is the expected output? What do you see instead?
I am attaching files showing the expected output and the actual output, which 
should make things clearer. The expected output PDF was generated from the HTML 
using a tool called pdfcrowd, which renders it correctly.

What version of the product are you using? On what operating system?
product version : Release 8 (R8)
OS : Ubuntu 14.04

Please provide any additional information below.

Thank you for this. It is a wonderful tool, but I am really stuck on this part. 
I have tried almost every solution I could find online and am filing an issue 
here as a last resort. The problem is that the fonts do get embedded in the 
generated PDF, but the content is not displayed correctly. If I copy the 
content from the PDF and paste it into a browser, everything is displayed 
correctly.
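
One way to make the copy-and-paste check more precise is to dump the code points of the text copied from the PDF and compare them with the code points of the same word in the source HTML. The following is a minimal sketch using only the Java standard library. The class name CodePointDump and the sample string are illustrative assumptions and are not part of Flying Saucer.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointDump {

    /** Returns one "U+XXXX NAME" entry per code point, in stored (logical) order. */
    public static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp ->
                out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the syllable "ki", written with escapes so the
        // source file encoding does not affect the result.
        String sample = "\u0915\u093F";
        describe(sample).forEach(System.out::println);
    }
}
```

If the dump for the text copied from the PDF matches the dump for the source HTML, the characters themselves are stored correctly, which would suggest that the fault lies in how the glyphs are shaped or positioned rather than in the font embedding. In Devanagari, for example, the vowel sign U+093F is stored after its consonant but is drawn before it.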

Original issue reported on code.google.com by varun.p....@gmail.com on 2 Apr 2015 at 7:14
