diegomura / react-pdf

📄 Create PDF files using React
https://react-pdf.org
MIT License

Font Fallback/Multilanguage support #933

Closed oowowaee closed 7 months ago

oowowaee commented 4 years ago

While looking through GitHub issues over the last few days, I saw a thread that linked to https://github.com/diegomura/react-pdf/blob/1c0bcba15e1cddda6ae9adbba4bedf680e98c9fd/src/layout/fontSubstitution/index.js#L11 in the context of font fallbacks.

I have a use case where some of the output could be in one of many languages, and this causes an issue because the registered font has no glyphs for those characters. On the web we don't have this issue, because fonts fall back. How are you supposed to specify font fallbacks, and more specifically, how can we support text in a language that isn't known at style time? (I.e., I can't just do styles.korean and set a Korean font-family, as the text could be in Arabic, Chinese, etc.)

chimanSeyedhashemi commented 4 years ago

Hello, I have a problem with a Persian font in react-pdf: it shows the last line as the first line. How can I solve this problem? b39af48d-873d-4433-89a7-b5f5a51bae78.pdf

oowowaee commented 4 years ago

@chimanSeyedhashemi React-PDF doesn't support RTL languages. My use case luckily is only names in Arabic.

canturan commented 3 years ago

I also find multilanguage support pretty important. For example, if you have some Chinese users, they write mostly in English with some Chinese words in between. It is impossible to find a single font that covers both and to render a PDF document that supports both languages.

oowowaee commented 3 years ago

@canturan When I was researching this, there seemed to be some threads where people suggested Chinese fonts that work (and we did end up with Chinese support in our PDF), and I thought the default is Helvetica(?). So if you only need Chinese and English, that might be doable. Did that not work for you?

In the end I was able to get multilanguage support with a really hacky workaround where I overrode the font-fallback logic, but I never got it working with loading the fonts on the fly. It seems like being able to fall back to non-English fonts, or to multiple fonts, is something a lot of people need!

SashaShostyr commented 3 years ago

@diegomura Is there any news about this feature or plans to implement it?:bow:

diegomura commented 3 years ago

Not yet. It's definitely something I want to support, but I haven't been able to allocate time for it yet. Happy to guide anyone who wants to take it on.

GianBe commented 3 years ago

@diegomura I will gladly help with the implementation of this feature. Correct me if I'm wrong, but as an overly simplified summary: first we need to allow multiple font families (an array) to be set on the components, and then the fallback logic checks which of them has the glyph to render. Since no single font supports all languages, the last fallback can be "Helvetica".
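As a rough illustration of the summary above, here is a minimal sketch of per-character fallback over an ordered font stack. It is not react-pdf's implementation: `FontCoverage`, `pickFamily`, and `splitIntoRuns` are hypothetical names, and each font's glyph coverage is assumed to be known up front as a list of code point ranges (a real implementation would read it from the font file).

```typescript
// Hypothetical shape: a font family plus the inclusive code point ranges it covers.
interface FontCoverage {
  family: string;
  ranges: Array<[number, number]>;
}

// A contiguous piece of text that can be drawn with one family.
interface FontRun {
  family: string;
  text: string;
}

// Last fallback suggested in the comment above.
const LAST_RESORT = "Helvetica";

// Returns the first family in the stack that covers the code point.
function pickFamily(stack: FontCoverage[], codePoint: number): string {
  const match = stack.find((font) =>
    font.ranges.some(([lo, hi]) => codePoint >= lo && codePoint <= hi),
  );
  return match ? match.family : LAST_RESORT;
}

// Walks the text by code point and groups neighbours that resolve to the same family.
function splitIntoRuns(text: string, stack: FontCoverage[]): FontRun[] {
  const runs: FontRun[] = [];
  for (const ch of Array.from(text)) {
    const family = pickFamily(stack, ch.codePointAt(0)!);
    const last = runs[runs.length - 1];
    if (last && last.family === family) {
      last.text += ch;
    } else {
      runs.push({ family, text: ch });
    }
  }
  return runs;
}

// Example stack with deliberately tiny, illustrative coverage ranges.
const exampleStack: FontCoverage[] = [
  { family: "Noto Sans", ranges: [[0x20, 0x7e]] },
  { family: "Noto Sans SC", ranges: [[0x4e00, 0x9fff]] },
];

// "Hi " resolves to Noto Sans and "你好" to Noto Sans SC, giving two runs.
const exampleRuns = splitIntoRuns("Hi 你好", exampleStack);
```

Each run could then be rendered as its own text element with the resolved family, which would cover the mixed English and Chinese case mentioned earlier in the thread.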

MassoudSharifi commented 3 years ago

@oowowaee is there any plan to support RTL languages? because we are waiting for that feature

diegomura commented 3 years ago

There is, and I've been working on this and have made some progress. I can't give a timeline, though, since I've been very busy lately. Sorry.

jsantanders commented 2 years ago

Hello, I can help with this one, if support is still required to complete this feature.

Simon-44 commented 2 years ago

Hello, just adding my two cents. I also have this problem of users writing things in their own language within the app (mix of English and Chinese mainly), and these characters not being displayed.

What I've done is detect the language of the text to print, based on its Unicode characters, and select the style that uses the right font, but it's far from perfect, for two main reasons:

jeetiss commented 2 years ago

Hi folks,

here is the PR that solves your problem

The only problem is that it was written for the first version, but you can try to migrate the code to v2. Have a look.

oowowaee commented 2 years ago

@jeetiss Thanks, will check it out later. I remember looking at that PR when I opened the issue, and it was not working for me, so I had to write my own solution.

sadam-hussien commented 2 years ago

Hello, I have a problem with a Persian font in react-pdf: it shows the last line as the first line. How can I solve this problem? b39af48d-873d-4433-89a7-b5f5a51bae78.pdf

Did you find a solution to this problem?

pklepa commented 2 years ago

Any news on this? Seems like the PR mentioned by @jeetiss was never merged in

ghost commented 2 years ago

@diegomura I think this is a fairly important issue, as no single font supports every set of languages.

SachinRupani commented 1 year ago

Any update on support of multiple font files?

kldoon commented 11 months ago

Still no updates on this? It is really needed

kanapka94 commented 9 months ago

Yes, I also need multilingual support

santialbo commented 8 months ago

I want to share with you how we "solved" this issue:

We use all the different Noto Sans fonts and some Unicode ranges to detect the script being used. Then we apply the corresponding fontFamily:

import { Text } from "@react-pdf/renderer";
import { maxBy } from "remeda";

const RANGES = {
  SC: /[\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u3005\u3007\u3021-\u3029\u3038-\u303B\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFA6D\uFA70-\uFAD9]|\uD81B[\uDFE2\uDFE3\uDFF0\uDFF1]|[\uD840-\uD868\uD86A-\uD86C\uD86F-\uD872\uD874-\uD879\uD880-\uD883\uD885-\uD887][\uDC00-\uDFFF]|\uD869[\uDC00-\uDEDF\uDF00-\uDFFF]|\uD86D[\uDC00-\uDF39\uDF40-\uDFFF]|\uD86E[\uDC00-\uDC1D\uDC20-\uDFFF]|\uD873[\uDC00-\uDEA1\uDEB0-\uDFFF]|\uD87A[\uDC00-\uDFE0]|\uD87E[\uDC00-\uDE1D]|\uD884[\uDC00-\uDF4A\uDF50-\uDFFF]|\uD888[\uDC00-\uDFAF]/g,
  Latin:
    /[A-Za-z\u00AA\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u02E0-\u02E4\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u212A\u212B\u2132\u214E\u2160-\u2188\u2C60-\u2C7F\uA722-\uA787\uA78B-\uA7CA\uA7D0\uA7D1\uA7D3\uA7D5-\uA7D9\uA7F2-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB64\uAB66-\uAB69\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A]|\uD801[\uDF80-\uDF85\uDF87-\uDFB0\uDFB2-\uDFBA]|\uD837[\uDF00-\uDF1E\uDF25-\uDF2A]/g,
  Cyrillic:
    /[\u0400-\u0484\u0487-\u052F\u1C80-\u1C88\u1D2B\u1D78\u2DE0-\u2DFF\uA640-\uA69F\uFE2E\uFE2F]|\uD838[\uDC30-\uDC6D\uDC8F]/g,
  Arabic:
    /[\u0600-\u0604\u0606-\u060B\u060D-\u061A\u061C-\u061E\u0620-\u063F\u0641-\u064A\u0656-\u066F\u0671-\u06DC\u06DE-\u06FF\u0750-\u077F\u0870-\u088E\u0890\u0891\u0898-\u08E1\u08E3-\u08FF\uFB50-\uFBC2\uFBD3-\uFD3D\uFD40-\uFD8F\uFD92-\uFDC7\uFDCF\uFDF0-\uFDFF\uFE70-\uFE74\uFE76-\uFEFC]|\uD803[\uDE60-\uDE7E\uDEFD-\uDEFF]|\uD83B[\uDE00-\uDE03\uDE05-\uDE1F\uDE21\uDE22\uDE24\uDE27\uDE29-\uDE32\uDE34-\uDE37\uDE39\uDE3B\uDE42\uDE47\uDE49\uDE4B\uDE4D-\uDE4F\uDE51\uDE52\uDE54\uDE57\uDE59\uDE5B\uDE5D\uDE5F\uDE61\uDE62\uDE64\uDE67-\uDE6A\uDE6C-\uDE72\uDE74-\uDE77\uDE79-\uDE7C\uDE7E\uDE80-\uDE89\uDE8B-\uDE9B\uDEA1-\uDEA3\uDEA5-\uDEA9\uDEAB-\uDEBB\uDEF0\uDEF1]/g,
  Bengali:
    /[\u0980-\u0983\u0985-\u098C\u098F\u0990\u0993-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09BC-\u09C4\u09C7\u09C8\u09CB-\u09CE\u09D7\u09DC\u09DD\u09DF-\u09E3\u09E6-\u09FE]/g,
  Devanagari: /[\u0900-\u0950\u0955-\u0963\u0966-\u097F\uA8E0-\uA8FF]|\uD806[\uDF00-\uDF09]/g,
  JP: /[\u3041-\u3096\u309D-\u309F]|\uD82C[\uDC01-\uDD1F\uDD32\uDD50-\uDD52]|\uD83C\uDE00|[\u30A1-\u30FA\u30FD-\u30FF\u31F0-\u31FF\u32D0-\u32FE\u3300-\u3357\uFF66-\uFF6F\uFF71-\uFF9D]|\uD82B[\uDFF0-\uDFF3\uDFF5-\uDFFB\uDFFD\uDFFE]|\uD82C[\uDC00\uDD20-\uDD22\uDD55\uDD64-\uDD67]|[\u3400-\u4DB5\u4E00-\u9FAF]/g,
  KR: /[\u1100-\u11FF\u302E\u302F\u3131-\u318E\u3200-\u321E\u3260-\u327E\uA960-\uA97C\uAC00-\uD7A3\uD7B0-\uD7C6\uD7CB-\uD7FB\uFFA0-\uFFBE\uFFC2-\uFFC7\uFFCA-\uFFCF\uFFD2-\uFFD7\uFFDA-\uFFDC]/g,
  Tamil:
    /[\u0B82\u0B83\u0B85-\u0B8A\u0B8E-\u0B90\u0B92-\u0B95\u0B99\u0B9A\u0B9C\u0B9E\u0B9F\u0BA3\u0BA4\u0BA8-\u0BAA\u0BAE-\u0BB9\u0BBE-\u0BC2\u0BC6-\u0BC8\u0BCA-\u0BCD\u0BD0\u0BD7\u0BE6-\u0BFA]|\uD807[\uDFC0-\uDFF1\uDFFF]/g,
  Kannada:
    /[\u0C80-\u0C8C\u0C8E-\u0C90\u0C92-\u0CA8\u0CAA-\u0CB3\u0CB5-\u0CB9\u0CBC-\u0CC4\u0CC6-\u0CC8\u0CCA-\u0CCD\u0CD5\u0CD6\u0CDD\u0CDE\u0CE0-\u0CE3\u0CE6-\u0CEF\u0CF1-\u0CF3]/g,
  Myanmar: /[\u1000-\u109F\uA9E0-\uA9FE\uAA60-\uAA7F]/g,
  Ethiopic:
    /[\u1200-\u1248\u124A-\u124D\u1250-\u1256\u1258\u125A-\u125D\u1260-\u1288\u128A-\u128D\u1290-\u12B0\u12B2-\u12B5\u12B8-\u12BE\u12C0\u12C2-\u12C5\u12C8-\u12D6\u12D8-\u1310\u1312-\u1315\u1318-\u135A\u135D-\u137C\u1380-\u1399\u2D80-\u2D96\u2DA0-\u2DA6\u2DA8-\u2DAE\u2DB0-\u2DB6\u2DB8-\u2DBE\u2DC0-\u2DC6\u2DC8-\u2DCE\u2DD0-\u2DD6\u2DD8-\u2DDE\uAB01-\uAB06\uAB09-\uAB0E\uAB11-\uAB16\uAB20-\uAB26\uAB28-\uAB2E]|\uD839[\uDFE0-\uDFE6\uDFE8-\uDFEB\uDFED\uDFEE\uDFF0-\uDFFE]/g,
  Thai: /[\u0E01-\u0E3A\u0E40-\u0E5B]/g,
  Greek:
    /[\u0370-\u0373\u0375-\u0377\u037A-\u037D\u037F\u0384\u0386\u0388-\u038A\u038C\u038E-\u03A1\u03A3-\u03E1\u03F0-\u03FF\u1D26-\u1D2A\u1D5D-\u1D61\u1D66-\u1D6A\u1DBF\u1F00-\u1F15\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59\u1F5B\u1F5D\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FC4\u1FC6-\u1FD3\u1FD6-\u1FDB\u1FDD-\u1FEF\u1FF2-\u1FF4\u1FF6-\u1FFE\u2126\uAB65]|\uD800[\uDD40-\uDD8E\uDDA0]|\uD834[\uDE00-\uDE45]/g,
  Khmer: /[\u1780-\u17DD\u17E0-\u17E9\u17F0-\u17F9\u19E0-\u19FF]/g,
  Armenian: /[\u0531-\u0556\u0559-\u058A\u058D-\u058F\uFB13-\uFB17]/g,
  Hebrew:
    /[\u0591-\u05C7\u05D0-\u05EA\u05EF-\u05F4\uFB1D-\uFB36\uFB38-\uFB3C\uFB3E\uFB40\uFB41\uFB43\uFB44\uFB46-\uFB4F]/g,
  Georgian:
    /[\u10A0-\u10C5\u10C7\u10CD\u10D0-\u10FA\u10FC-\u10FF\u1C90-\u1CBA\u1CBD-\u1CBF\u2D00-\u2D25\u2D27\u2D2D]/g,
  Lao: /[\u0E81\u0E82\u0E84\u0E86-\u0E8A\u0E8C-\u0EA3\u0EA5\u0EA7-\u0EBD\u0EC0-\u0EC4\u0EC6\u0EC8-\u0ECE\u0ED0-\u0ED9\u0EDC-\u0EDF]/g,
  "Canadian Aboriginal": /[\u1400-\u167F\u18B0-\u18F5]|\uD806[\uDEB0-\uDEBF]/g,
  Sinhala:
    /[\u0D81-\u0D83\u0D85-\u0D96\u0D9A-\u0DB1\u0DB3-\u0DBB\u0DBD\u0DC0-\u0DC6\u0DCA\u0DCF-\u0DD4\u0DD6\u0DD8-\u0DDF\u0DE6-\u0DEF\u0DF2-\u0DF4]|\uD804[\uDDE1-\uDDF4]/g,
  "Ol Chiki": /[\u1C50-\u1C7F]/g,
  Tibetan:
    /[\u0F00-\u0F47\u0F49-\u0F6C\u0F71-\u0F97\u0F99-\u0FBC\u0FBE-\u0FCC\u0FCE-\u0FD4\u0FD9\u0FDA]/g,
  Tifinagh: /[\u2D30-\u2D67\u2D6F\u2D70\u2D7F]/g,
  Yi: /[\uA000-\uA48C\uA490-\uA4C6]/g,
  Syriac: /[\u0700-\u070D\u070F-\u074A\u074D-\u074F\u0860-\u086A]/g,
  Thaana: /[\u0780-\u07B1]/g,
  Vai: /[\uA500-\uA62B]/g,
  Cherokee: /[\u13A0-\u13F5\u13F8-\u13FD\uAB70-\uABBF]/g,
  "Tai Tham": /[\u1A20-\u1A5E\u1A60-\u1A7C\u1A7F-\u1A89\u1A90-\u1A99\u1AA0-\u1AAD]/g,
  "Tai Viet": /[\uAA80-\uAAC2\uAADB-\uAADF]/g,
  Javanese: /[\uA980-\uA9CD\uA9D0-\uA9D9\uA9DE\uA9DF]/g,
  Telugu:
    /[\u0C00-\u0C0C\u0C0E-\u0C10\u0C12-\u0C28\u0C2A-\u0C39\u0C3C-\u0C44\u0C46-\u0C48\u0C4A-\u0C4D\u0C55\u0C56\u0C58-\u0C5A\u0C5D\u0C60-\u0C63\u0C66-\u0C6F\u0C77-\u0C7F]/g,
  Gujarati:
    /[\u0A81-\u0A83\u0A85-\u0A8D\u0A8F-\u0A91\u0A93-\u0AA8\u0AAA-\u0AB0\u0AB2\u0AB3\u0AB5-\u0AB9\u0ABC-\u0AC5\u0AC7-\u0AC9\u0ACB-\u0ACD\u0AD0\u0AE0-\u0AE3\u0AE6-\u0AF1\u0AF9-\u0AFF]/g,
  Malayalam:
    /[\u0D00-\u0D0C\u0D0E-\u0D10\u0D12-\u0D44\u0D46-\u0D48\u0D4A-\u0D4F\u0D54-\u0D63\u0D66-\u0D7F]/g,
  Gurmukhi:
    /[\u0A01-\u0A03\u0A05-\u0A0A\u0A0F\u0A10\u0A13-\u0A28\u0A2A-\u0A30\u0A32\u0A33\u0A35\u0A36\u0A38\u0A39\u0A3C\u0A3E-\u0A42\u0A47\u0A48\u0A4B-\u0A4D\u0A51\u0A59-\u0A5C\u0A5E\u0A66-\u0A76]/g,
};

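// Renders `children` with the Noto Sans family of the script that has the
// most matching characters in the text.
export function WithInternationalFontFamily({ children }: { children: string }) {
  // Pick the [script, regex] entry with the highest number of matches.
  const [script] = maxBy(Object.entries(RANGES), ([_, test]) => children.match(test)?.length ?? 0)!;
  // Latin, Cyrillic, and Greek are all covered by the base "Noto Sans" family.
  const fontFamily = ["Latin", "Cyrillic", "Greek"].includes(script)
    ? "Noto Sans"
    : `Noto Sans ${script}`;
  return <Text style={{ fontFamily }}>{children}</Text>;
}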
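
One edge case in the approach above: a string with no characters in any range (digits or punctuation only) scores 0 for every script, so the chosen family depends on how ties are resolved. The sketch below shows one way to make that case explicit by falling back to a default script. It is a standalone variation, not part of the original snippet: `SCRIPT_RANGES` is a heavily simplified, BMP-only subset of the ranges above, and `dominantScript` and `fontFamilyFor` are hypothetical helper names.

```typescript
// Simplified, illustrative subset of the script ranges (BMP only).
const SCRIPT_RANGES: Record<string, RegExp> = {
  Latin: /[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]/g,
  Cyrillic: /[\u0400-\u052F]/g,
  Arabic: /[\u0600-\u06FF]/g,
  SC: /[\u4E00-\u9FFF]/g,
  KR: /[\uAC00-\uD7A3]/g,
};

// Returns the script with the most matching characters, or `fallback`
// when no range matches at all.
function dominantScript(text: string, fallback: string = "Latin"): string {
  let best = fallback;
  let bestCount = 0;
  for (const [script, pattern] of Object.entries(SCRIPT_RANGES)) {
    const count = text.match(pattern)?.length ?? 0;
    // Strictly greater: on a tie, the earlier entry (or the fallback) is kept.
    if (count > bestCount) {
      best = script;
      bestCount = count;
    }
  }
  return best;
}

// Maps the detected script to a Noto Sans family name, as in the snippet above.
function fontFamilyFor(text: string): string {
  const script = dominantScript(text);
  return ["Latin", "Cyrillic", "Greek"].includes(script)
    ? "Noto Sans"
    : `Noto Sans ${script}`;
}
```

With this variation, `fontFamilyFor("12345")` resolves to "Noto Sans" rather than to whichever script happens to be listed first. Like the original, it still assigns a single family to the whole string, so mixed-script text is rendered in the font of its dominant script only.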

bdamor5 commented 2 months ago

@santialbo just to expand on your point, created a demo app that does the same - Demo

oowowaee commented 2 months ago

@diegomura Can you confirm this is actually resolved? It seems like the closing MR is primarily concerned with adding RTL support which is not the same as this issue. I only skimmed, but nothing that would allow multiple font-fallbacks jumped out at me.