DavBfr / dart_pdf

Pdf creation module for dart/flutter
https://pub.dev/packages/pdf
Apache License 2.0

RTL and Script Unicode Font (ttf), and ligatures not supported #198

Closed sh0umik closed 1 year ago

sh0umik commented 4 years ago

Describe the bug: Bangla Unicode fonts get broken after the PDF is generated. The Unicode font works well on the web. But it gets broken when

I made a repository to demonstrate the problem (font included). I have tried 12 different Bangla Unicode fonts, with the same result for all. Just run the program and you will see that the String in the Text widget didn't get rendered properly.

https://github.com/sh0umik/flutter_pdf_test.git

Expected behaviour: exactly like the text in the Flutter Text widget.

Screenshots Got this (broken) image

Expected this

image

Flutter Doctor (output of flutter doctor -v):

[✓] Flutter (Channel beta, v1.12.13+hotfix.6, on Mac OS X 10.14.6 18G103, locale en-BD)
    • Flutter version 1.12.13+hotfix.6 at /Users/diablo/flutter
    • Framework revision 18cd7a3601 (3 weeks ago), 2019-12-11 06:35:39 -0800
    • Engine revision 2994f7e1e6
    • Dart version 2.7.0

[✓] Android toolchain - develop for Android devices (Android SDK version 29.0.2)
    • Android SDK at /Users/diablo/Library/Android/sdk
    • Android NDK location not configured (optional; useful for native profiling support)
    • Platform android-29, build-tools 29.0.2
    • Java binary at: /Users/diablo/Library/Application Support/JetBrains/Toolbox/apps/AndroidStudio/ch-0/191.6010548/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)
    • All Android licenses accepted.

[✓] Xcode - develop for iOS and macOS (Xcode 11.1)
    • Xcode at /Applications/Xcode.app/Contents/Developer
    • Xcode 11.1, Build version 11A1027
    • CocoaPods version 1.8.1

[✓] Chrome - develop for the web
    • Chrome at /Applications/Google Chrome.app/Contents/MacOS/Google Chrome

[✓] Android Studio (version 3.5)
    • Android Studio at /Users/diablo/Library/Application Support/JetBrains/Toolbox/apps/AndroidStudio/ch-0/191.6010548/Android Studio.app/Contents
    • Flutter plugin version 42.1.1
    • Dart plugin version 191.8593
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)

[!] IntelliJ IDEA Ultimate Edition (version 2019.3)
    • IntelliJ at /Users/diablo/Applications/JetBrains Toolbox/IntelliJ IDEA Ultimate.app
    ✗ Flutter plugin not installed; this adds Flutter specific functionality.
    ✗ Dart plugin not installed; this adds Dart specific functionality.
    • For information about installing plugins, see
      https://flutter.dev/intellij-setup/#installing-the-plugins

[✓] VS Code (version 1.41.1)
    • VS Code at /Applications/Visual Studio Code.app/Contents
    • Flutter extension version 3.7.1

[✓] Connected device (3 available)
    • ONEPLUS A6010 • 192.168.0.100:5555 • android-arm64  • Android 10 (API 29)
    • Chrome        • chrome             • web-javascript • Google Chrome 79.0.3945.88
    • Web Server    • web-server         • web-javascript • Flutter Tools

Additional context: Many things in my Flutter web app depend on this plugin, so I need to find a solution ASAP. Also, I am converting the result of pdf.save() into a Uint8List so that it can be sent through Firebase. Could this List to Uint8List conversion be the problem?
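On the conversion question: copying a List<int> into a Uint8List keeps every byte value in the 0..255 range unchanged, so the conversion by itself should not alter the document. A minimal sketch, assuming save() hands back a plain List<int> as described above; toUint8List is a hypothetical helper name, not part of the pdf package:

```dart
import 'dart:typed_data';

/// Hypothetical helper: returns [bytes] as a Uint8List.
/// Reuses the list when it already is a Uint8List, otherwise copies it.
Uint8List toUint8List(List<int> bytes) {
  return bytes is Uint8List ? bytes : Uint8List.fromList(bytes);
}
```

If the bytes survive this step intact, the broken rendering has to come from how the glyphs are chosen, not from the upload path.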

DavBfr commented 4 years ago

Sorry, I don't really see what's wrong in the rendering. I don't read Bengali and can't really spot the missing features. Or is it just the bold face?

sh0umik commented 4 years ago

image

There is! Look at the picture. The broken parts are joined characters: two letters joined together to form a single one. The joined form is broken.

Flutter's Text widget renders it correctly on web, Android, and iOS, but your PDF library's Text widget does not.

sh0umik commented 4 years ago

A similar problem is mentioned here, but for a PHP PDF generation library.

https://stackoverflow.com/questions/32421564/bangla-unicode-font-not-rendering-correctly-in-tcpdf

DavBfr commented 4 years ago

Ok, I guess it's about GSUB support for OTF fonts: https://docs.microsoft.com/en-us/typography/opentype/spec/gsub

sh0umik commented 4 years ago

I guess so.

https://docs.microsoft.com/en-us/typography/script-development/bengali

image

This is the problem: I am getting what is on the left side of the ->, but I need the output on the right side of the ->.

sh0umik commented 4 years ago

@DavBfr I tried the arabic-fonts branch. It didn't work for me :(

DavBfr commented 4 years ago

No, the arabic-fonts branch uses some specific character replacement for Arabic only. GSUB is not yet implemented.

If you're willing to implement it in ttf_parser.dart I'll be happy to help.

Try something like this:

diff --git a/pdf/lib/src/ttf_parser.dart b/pdf/lib/src/ttf_parser.dart
index 3108a48..fc7cce0 100644
--- a/pdf/lib/src/ttf_parser.dart
+++ b/pdf/lib/src/ttf_parser.dart
@@ -54,6 +54,9 @@ class TtfParser {
     _parseCMap();
     _parseIndexes();
     _parseGlyphs();
+    if (tableOffsets.containsKey(gsub_table)) {
+      _parseGsub();
+    }
   }

   static const String head_table = 'head';
@@ -64,6 +67,7 @@ class TtfParser {
   static const String maxp_table = 'maxp';
   static const String loca_table = 'loca';
   static const String glyf_table = 'glyf';
+  static const String gsub_table = 'GSUB';

   final UnmodifiableByteDataView bytes;
   final Map<String, int> tableOffsets = <String, int>{};
@@ -368,4 +372,20 @@ class TtfParser {
       components,
     );
   }
+
+  void _parseGsub() {
+    print(fontName);
+    print(tableOffsets);
+
+    final int basePosition = tableOffsets[gsub_table];
+    print('GSUB Version: ${bytes.getUint32(basePosition).toRadixString(16)}');
+    final int scriptListOffset =
+        bytes.getUint16(basePosition + 4) + basePosition;
+    final int featureListOffset =
+        bytes.getUint16(basePosition + 6) + basePosition;
+    final int lookupListOffset =
+        bytes.getUint16(basePosition + 8) + basePosition;
+    print(
+        'GSUB Offsets: $scriptListOffset $featureListOffset $lookupListOffset');
+  }
 }

And see if your font contains any useful information.

sh0umik commented 4 years ago

I am ready to implement it. Just tell me what to write. It would help if you could show me one character as an example and where to find the information for the rest, so that I can complete it. I am totally new to this.

sh0umik commented 4 years ago

@DavBfr this is what I found running the code above. What's next?

I/flutter (10254): SiyamRupali
I/flutter (10254): {EBDT: 393284, EBLC: 393460, GDEF: 399344, GPOS: 398924, GSUB: 393740, LTSH: 3756, OS/2: 504, VDMX: 4552, cmap: 399388, cvt : 25408, fpgm: 24280, gasp: 393268, glyf: 25416, hdmx: 6056, head: 380, hhea: 436, hmtx: 600, kern: 381088, loca: 377928, maxp: 472, name: 381112, post: 385900, prep: 25396}
I/flutter (10254): GSUB Version: 10000
I/flutter (10254): GSUB Offsets: 393750 393832 394122

sh0umik commented 4 years ago

Using this site called FontDrop

I can see the following...

image

Could it help implement the parser? If so, can you guide me on how to implement it?

DavBfr commented 4 years ago

This site is really useful, thanks!

So the first step is to parse this GSUB table to find all the lookups.

Then I think when you have this:

{
  "ligGlyph": 237,
  "components": [
    102,
    86
  ]
}

If we want to draw the glyphs 102 and 86 next to each other, we replace them with 237.
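The replacement step described above can be sketched as a pass over a list of glyph indices. This is only an illustration under stated assumptions: the Ligature class and applyLigatures function are hypothetical names that do not exist in ttf_parser.dart, and real shaping also has to respect script, language, and lookup order.

```dart
/// One ligature rule: the glyph sequence [components] is drawn as [ligGlyph].
class Ligature {
  const Ligature(this.ligGlyph, this.components);
  final int ligGlyph;
  final List<int> components;
}

/// Returns a copy of [glyphs] in which every run matching a rule is replaced
/// by that rule's ligature glyph. Rules are tried in list order, so longer
/// sequences should come first when rules share a prefix.
List<int> applyLigatures(List<int> glyphs, List<Ligature> rules) {
  final result = <int>[];
  var i = 0;
  while (i < glyphs.length) {
    Ligature? match;
    for (final rule in rules) {
      if (_matchesAt(glyphs, i, rule.components)) {
        match = rule;
        break;
      }
    }
    if (match == null) {
      // No rule starts here: keep the glyph as it is.
      result.add(glyphs[i]);
      i++;
    } else {
      // Emit the ligature and skip all of its components.
      result.add(match.ligGlyph);
      i += match.components.length;
    }
  }
  return result;
}

/// True when [components] occurs in [glyphs] starting at index [start].
bool _matchesAt(List<int> glyphs, int start, List<int> components) {
  if (components.isEmpty || start + components.length > glyphs.length) {
    return false;
  }
  for (var k = 0; k < components.length; k++) {
    if (glyphs[start + k] != components[k]) return false;
  }
  return true;
}
```

With the single rule from the JSON above, a run containing 102 followed by 86 would come out with 237 in their place, and every other glyph would pass through unchanged.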

DavBfr commented 4 years ago

I think the table to read for your issue is the Ligature Substitution Subtable:

https://docs.microsoft.com/en-us/typography/opentype/spec/gsub#lookuptype-4-ligature-substitution-subtable

Only one format, that should not be too difficult.
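As a rough sketch of what reading that subtable could look like, the code below follows the layout given in the linked specification (substFormat, coverageOffset, ligatureSetCount, then LigatureSet and Ligature records). It works on a plain ByteData; LigatureRule, readCoverage, and readLigatureSubst are hypothetical names, and it assumes a well-formed font in which the coverage table has one glyph per LigatureSet.

```dart
import 'dart:typed_data';

/// A parsed ligature: [components] (first glyph included) maps to [ligGlyph].
class LigatureRule {
  const LigatureRule(this.ligGlyph, this.components);
  final int ligGlyph;
  final List<int> components;
}

/// Reads the glyph ids of a Coverage table, in coverage-index order.
List<int> readCoverage(ByteData bytes, int offset) {
  final format = bytes.getUint16(offset);
  final count = bytes.getUint16(offset + 2);
  final glyphs = <int>[];
  if (format == 1) {
    // Format 1: a plain array of glyph ids.
    for (var i = 0; i < count; i++) {
      glyphs.add(bytes.getUint16(offset + 4 + 2 * i));
    }
  } else if (format == 2) {
    // Format 2: records of (startGlyphID, endGlyphID, startCoverageIndex).
    for (var i = 0; i < count; i++) {
      final start = bytes.getUint16(offset + 4 + 6 * i);
      final end = bytes.getUint16(offset + 6 + 6 * i);
      for (var g = start; g <= end; g++) {
        glyphs.add(g);
      }
    }
  } else {
    throw FormatException('Unknown coverage format $format');
  }
  return glyphs;
}

/// Parses a LookupType 4, format 1 subtable that starts at [base].
List<LigatureRule> readLigatureSubst(ByteData bytes, int base) {
  final format = bytes.getUint16(base);
  if (format != 1) throw FormatException('Unexpected substFormat $format');
  final firstGlyphs = readCoverage(bytes, base + bytes.getUint16(base + 2));
  final setCount = bytes.getUint16(base + 4);
  final rules = <LigatureRule>[];
  for (var s = 0; s < setCount; s++) {
    // LigatureSet offsets are relative to the start of the subtable.
    final setBase = base + bytes.getUint16(base + 6 + 2 * s);
    final ligCount = bytes.getUint16(setBase);
    for (var l = 0; l < ligCount; l++) {
      // Ligature offsets are relative to the start of their LigatureSet.
      final lig = setBase + bytes.getUint16(setBase + 2 + 2 * l);
      final ligGlyph = bytes.getUint16(lig);
      final componentCount = bytes.getUint16(lig + 2);
      // The first component comes from the coverage table; the record
      // itself stores only the second and later components.
      final components = <int>[firstGlyphs[s]];
      for (var c = 0; c < componentCount - 1; c++) {
        components.add(bytes.getUint16(lig + 4 + 2 * c));
      }
      rules.add(LigatureRule(ligGlyph, components));
    }
  }
  return rules;
}
```

ByteData.getUint16 reads big-endian by default, which matches the byte order OpenType uses, so no Endian argument is needed.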

sh0umik commented 4 years ago

Thank you for your answer. I am starting to understand how this works, but I am totally new to this and have no idea about the variables in the parser. Can you write a function or a block of code that parses just this, as an example?

{
  "ligGlyph": 237,
  "components": [
    102,
    86
  ]
}

Maybe after that I can follow that code and implement the rest?

DavBfr commented 4 years ago

If you look at _parseGlyphs() in the same file, it will look the same: read the binary file content using bytes.getInt16(offset), getInt32, and friends. Just follow the MS documentation to know what to read.

In the function _parseGsub(), your start offset is basePosition, which gives you access to the offsets of the next tables: scriptList, featureList, lookupList.

The first to parse is the featureList described here: https://docs.microsoft.com/en-us/typography/opentype/spec/chapter2#flTbl

and get all the features with the right tag, maybe https://docs.microsoft.com/en-us/typography/opentype/spec/features_ae#blws

Then the lookupList will give the right glyphs to replace.
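The featureList step above might look roughly like the following. It is a sketch based on the FeatureList layout in the linked chapter (a count, then records of a 4-byte tag plus a 16-bit offset), written against a plain ByteData; readFeatureList is a hypothetical name and nothing here comes from the package itself.

```dart
import 'dart:typed_data';

/// Returns, for each feature tag, the lookup indices it references.
/// [featureListOffset] is the absolute offset of the FeatureList table.
/// A tag may occur in several records, so indices are merged per tag.
Map<String, List<int>> readFeatureList(ByteData bytes, int featureListOffset) {
  final features = <String, List<int>>{};
  final featureCount = bytes.getUint16(featureListOffset);
  for (var i = 0; i < featureCount; i++) {
    // Each FeatureRecord is a 4-byte tag followed by a 16-bit offset.
    final record = featureListOffset + 2 + 6 * i;
    final tag = String.fromCharCodes(
      List<int>.generate(4, (k) => bytes.getUint8(record + k)),
    );
    // Feature table offsets are relative to the start of the FeatureList.
    final feature = featureListOffset + bytes.getUint16(record + 4);
    // feature + 0 holds featureParamsOffset, which is not needed here.
    final lookupCount = bytes.getUint16(feature + 2);
    final indices = features.putIfAbsent(tag, () => <int>[]);
    for (var k = 0; k < lookupCount; k++) {
      indices.add(bytes.getUint16(feature + 4 + 2 * k));
    }
  }
  return features;
}
```

The returned indices point into the lookupList, so a caller interested in one tag (for example blws) would look up only those entries there.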

sh0umik commented 4 years ago

Is there any chance you can add this feature soon? I badly need this :( Since I have no idea about the variables used in the parsing and how it works, I don't think I will be able to help much.

Eagerly waiting for your answer on this. Please, can you find some spare time to add this feature? Or set a milestone?

sh0umik commented 4 years ago

@DavBfr which functions should I use to find and replace the glyphs?

if we want to draw the glyphs 102 and 86 next to eachother, we replace with 237

I could not find any substitution function in ttf_parser. Is it in the charToGlyphIndexMap map?

Peet-A commented 4 years ago

How can I help you guys?

DavBfr commented 4 years ago

I think all the needed info is on this ticket. I don't have time to work on this now. Unless some of you are willing to pay for the feature. Then I could take a look at crowdfunding. This also includes Arabic and other languages.

DavBfr commented 4 years ago

Please use this link: https://www.bountysource.com/issues/86211023-bangla-unicode-font-ttf-gets-broken-during-pdf-generation

ashutosh1211 commented 4 years ago

@DavBfr Hi, I'm able to prepare the featureList. What should I do next? Should I prepare the lookup table?

DavBfr commented 4 years ago

Yes, the lookup table is next. I have some code already for that. Let me push it to the arabic-fonts branch.

ashutosh1211 commented 4 years ago

Sure, :), I'm stuck at subtable parsing in lookup table.
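For the subtable step, one possible shape is to first resolve each lookup's type and the absolute offsets of its subtables, and only then parse the subtables by type. The sketch below assumes the LookupList layout from the OpenType common table formats and a plain ByteData; LookupInfo and readLookupList are hypothetical names, not code from either branch mentioned here.

```dart
import 'dart:typed_data';

/// A lookup's type and the absolute offsets of its subtables.
class LookupInfo {
  const LookupInfo(this.type, this.subtableOffsets);
  final int type;
  final List<int> subtableOffsets;
}

/// Reads every Lookup table referenced by the LookupList at
/// [lookupListOffset] (an absolute offset into [bytes]).
List<LookupInfo> readLookupList(ByteData bytes, int lookupListOffset) {
  final lookups = <LookupInfo>[];
  final lookupCount = bytes.getUint16(lookupListOffset);
  for (var i = 0; i < lookupCount; i++) {
    // Lookup offsets are relative to the start of the LookupList.
    final lookup =
        lookupListOffset + bytes.getUint16(lookupListOffset + 2 + 2 * i);
    final type = bytes.getUint16(lookup);
    // lookup + 2 holds lookupFlag, which is not needed to find subtables.
    final subtableCount = bytes.getUint16(lookup + 4);
    final offsets = <int>[];
    for (var s = 0; s < subtableCount; s++) {
      // Subtable offsets are relative to the start of this Lookup table.
      offsets.add(lookup + bytes.getUint16(lookup + 6 + 2 * s));
    }
    lookups.add(LookupInfo(type, offsets));
  }
  return lookups;
}
```

Lookups of type 4 hold the ligature subtables discussed earlier in this thread. In GSUB, type 7 is an extension wrapper that points at a subtable of another type, so a complete parser would need to unwrap it as well.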

shofizone commented 4 years ago

I am also facing the same problem while generating a PDF. Bangla Unicode characters are broken.

pluzmedia commented 3 years ago

Did you find any solution?

pratikmmohite commented 3 years ago

Hi @pluzmedia, JFYI: I achieved the PDF for complex script layout with the following steps:

  1. Created an HTML template layout using mustache5 standards
  2. Generated HTML filled with data using the Mustache package
  3. Used the following code from the printing package, which directly opens the print dialog on supported platforms:

await Printing.layoutPdf(
    onLayout: (PdfPageFormat format) async => await Printing.convertHtml(
          format: format,
          html: '<html><body><p>Hello!</p></body></html>', // pass generated html here
        ));

pluzmedia commented 3 years ago

Thank you, I am doing the same.

Ya-seeen commented 3 years ago

Same problem here. Any update on this?

DavBfr commented 3 years ago

@Ya-seeen no, I don't think anyone is working on it. You can look at the Arabic shaper class and implement the same, that would be much appreciated!

Ya-seeen commented 3 years ago

Could you please provide me a link to this?

DavBfr commented 3 years ago

Yes, it's this file: https://github.com/DavBfr/dart_pdf/blob/master/pdf/lib/src/pdf/arabic.dart You will have some replacements on Unicode codepoints to do, but we'll have to find a way to enable it, as this PdfArabic class is used only for right-to-left text direction.

Another more generic way is to implement proper font ligature parsing in https://github.com/DavBfr/dart_pdf/blob/master/pdf/lib/src/pdf/ttf_parser.dart. Ashutosh started this here: https://github.com/ashutosh1211/dart_pdf/commits/master

Ya-seeen commented 3 years ago

Thanks a lot. I will try this.

ganeshchenniah commented 3 years ago

I have the same issue with a Kannada font. I took BalooTamma2-Regular.ttf from Google. Some characters are not rendering properly. Have you found any solution for this?

Joseph-Nathan commented 3 years ago

I have the same issue with an Arabic font. Any news?

Joseph-Nathan commented 3 years ago

I tried to write Arabic and got text that is reversed and split.

Expected this: جوزيف

Got this: Screenshot_87

When I reversed it, I got this: Screenshot_88

Any help, please!

DavBfr commented 3 years ago

@Joseph-Nathan Use direction: pw.TextDirection.rtl

Joseph-Nathan commented 3 years ago

@DavBfr I used textDirection: pw.TextDirection.rtl and got some different characters: Screenshot_89

DavBfr commented 3 years ago

I can't help more, I don't know how to read Arabic, sorry.

GuvanchBayryyyev commented 3 years ago

@DavBfr for this issue, which solution above is working? Thanks

DavBfr commented 3 years ago

@GuvanchBayryyyev no RTL writing works except Arabic. It's not implemented.

GuvanchBayryyyev commented 3 years ago

I think my issue is not about right-to-left; it is about a Thai font.

Do I need extra configuration to solve the problem?

ganeshchenniah commented 3 years ago

Did you find any solution for this?

ganeshchenniah commented 3 years ago

The Printing approach converts the whole page. What if only a few sentences need to be converted and the rest should remain the same?

GuvanchBayryyyev commented 3 years ago

@ganeshchenniah I have not found any solution yet. What do you mean by whole-page conversion with Printing?

GuvanchBayryyyev commented 3 years ago

@PratikMMohite Can you provide example source code, please? I am also trying to implement it the way you did.

pratikmmohite commented 3 years ago

@GuvanchBayryyyev Check out: https://gist.github.com/PratikMMohite/de8fe11f55423cf5c6c312a7dd7275fd

GuvanchBayryyyev commented 3 years ago

@PratikMMohite Thanks a lot!

However, I am getting this error: Unimplemented handling of missing static target

when I run this code, which uses the mustache package:

var template = Template(
    source,
    lenient: true,
    name: 'source.html',
    htmlEscapeValues: true,
 );

Joseph-Nathan commented 3 years ago

Can anyone help me? Where should I edit to make that work?

vshanthamoorthi commented 3 years ago

The issue is seen with Tamil fonts also. I tried multiple Tamil fonts from Google Fonts and all of them behave the same way.

ashutosh1211 commented 3 years ago

Hello @vshanthamoorthi, I tried but did not get time to complete it. It is at the stage of substituting glyphs. I have prepared the necessary tables.

More in this branch: https://github.com/ashutosh1211/dart_pdf

vinayreddyking commented 3 years ago

I have the same issue with a Telugu font. Is there a way to get it to render correctly? Has anyone here found an alternative?