oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

react-pdf font inclusion is broken #8645

Open ttmx opened 5 months ago

ttmx commented 5 months ago

What version of Bun is running?

1.0.25+a8ff7be64

What platform is your computer?

Linux 6.7.0-arch3-1 x86_64 unknown

What steps can reproduce the bug?

Attempt to include custom fonts in a PDF file using react-pdf.

This example is copied from the related issue in react-pdf's repo.

import {
  Font,
  Page,
  Text,
  Document,
  pdf,
  StyleSheet,
  View
} from '@react-pdf/renderer';

import bodoniModa from '@fontsource/bodoni-moda/files/bodoni-moda-latin-400-normal.woff';
import lato from '@fontsource/lato/files/lato-latin-400-normal.woff';
import sourceSansPro from '@fontsource/source-sans-pro/files/source-sans-pro-cyrillic-400-normal.woff';

const OUT_DIR = `${import.meta.dir}/../out`;

Font.register({ family: 'Bodoni Moda', src: bodoniModa });
Font.register({ family: 'Lato', src: lato });
Font.register({ family: 'Source Sans Pro', src: sourceSansPro });

const styles = StyleSheet.create({
  section: {
    margin: '16px'
  },
  bodoniModa: {
    fontFamily: 'Bodoni Moda'
  },
  lato: {
    fontFamily: 'Lato'
  },
  sourceSansPro: {
    fontFamily: 'Source Sans Pro'
  },
  helvetica: {
    fontFamily: 'Helvetica'
  }
});

const text =
  'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent sed leo ornare, finibus nisi sit amet, egestas enim. Proin posuere augue ut turpis sodales, eget eleifend lectus consectetur.';

const Doc = () => (
  <Document>
    <Page size="A4">
      <View style={styles.section}>
        <View style={styles.helvetica}>
          <Text>Helvetica</Text>
          <Text>{text}</Text>
        </View>
      </View>
      <View style={styles.section}>
        <View style={styles.bodoniModa}>
          <Text>Bodoni Moda</Text>
          <Text>{text}</Text>
        </View>
      </View>
      <View style={styles.section}>
        <View style={styles.lato}>
          <Text>Lato</Text>
          <Text>{text}</Text>
        </View>
      </View>

      <View style={styles.section}>
        <View style={styles.sourceSansPro}>
          <Text>Source Sans Pro</Text>
          <Text>{text}</Text>
        </View>
      </View>
      <View style={styles.section}>
        <View style={styles.helvetica}>
          <Text>Helvetica</Text>
          <Text>{text}</Text>
        </View>
      </View>
    </Page>
  </Document>
);

const generateDoc = async () => {
  return pdf(<Doc />).toBlob();
};

const blob = await generateDoc();

await Bun.write(`${OUT_DIR}/${new Date().toISOString()}.pdf`, blob);

The issue occurs whether you use Bun.write or the "render" function from the react-pdf lib, so the problem seems to lie elsewhere.

Here is a comparison of the "pdffonts" output for the file produced by Bun and the file produced by Node.

❯ pdffonts example.pdf
Syntax Error (1161): Dictionary key must be a name object
Syntax Error (1163): Dictionary key must be a name object
Syntax Error (1165): Dictionary key must be a name object
Syntax Error (1167): Dictionary key must be a name object
Syntax Error (1169): Dictionary key must be a name object
Syntax Error (1171): Dictionary key must be a name object
Syntax Error (1173): Dictionary key must be a name object
Syntax Error (1175): Dictionary key must be a name object
Syntax Error (1177): Dictionary key must be a name object
Syntax Error (1187): Dictionary key must be a name object
Syntax Error (927): Dictionary key must be a name object
Syntax Error (929): Dictionary key must be a name object
Syntax Error (931): Dictionary key must be a name object
Syntax Error (933): Dictionary key must be a name object
Syntax Error (935): Dictionary key must be a name object
Syntax Error (937): Dictionary key must be a name object
Syntax Error (939): Dictionary key must be a name object
Syntax Error (941): Dictionary key must be a name object
Syntax Error (943): Dictionary key must be a name object
Syntax Error (958): Dictionary key must be a name object
Syntax Error (643): Dictionary key must be a name object
Syntax Error (645): Dictionary key must be a name object
Syntax Error (647): Dictionary key must be a name object
Syntax Error (649): Dictionary key must be a name object
Syntax Error (651): Dictionary key must be a name object
Syntax Error (653): Dictionary key must be a name object
Syntax Error (655): Dictionary key must be a name object
Syntax Error (657): Dictionary key must be a name object
Syntax Error (659): Dictionary key must be a name object
Syntax Error (666): Dictionary key must be a name object
Syntax Error (927): Dictionary key must be a name object
Syntax Error (929): Dictionary key must be a name object
Syntax Error (931): Dictionary key must be a name object
Syntax Error (933): Dictionary key must be a name object
Syntax Error (935): Dictionary key must be a name object
Syntax Error (937): Dictionary key must be a name object
Syntax Error (939): Dictionary key must be a name object
Syntax Error (941): Dictionary key must be a name object
Syntax Error (943): Dictionary key must be a name object
Syntax Error (958): Dictionary key must be a name object
Syntax Error (643): Dictionary key must be a name object
Syntax Error (645): Dictionary key must be a name object
Syntax Error (647): Dictionary key must be a name object
Syntax Error (649): Dictionary key must be a name object
Syntax Error (651): Dictionary key must be a name object
Syntax Error (653): Dictionary key must be a name object
Syntax Error (655): Dictionary key must be a name object
Syntax Error (657): Dictionary key must be a name object
Syntax Error (659): Dictionary key must be a name object
Syntax Error (666): Dictionary key must be a name object
Syntax Error (2175): Dictionary key must be a name object
Syntax Error (2177): Dictionary key must be a name object
Syntax Error (2179): Dictionary key must be a name object
Syntax Error (2181): Dictionary key must be a name object
Syntax Error (2183): Dictionary key must be a name object
Syntax Error (2185): Dictionary key must be a name object
Syntax Error (2187): Dictionary key must be a name object
Syntax Error (2189): Dictionary key must be a name object
Syntax Error (2191): Dictionary key must be a name object
Syntax Error (2193): Dictionary key must be a name object
Syntax Error (2195): Dictionary key must be a name object
Syntax Error (2197): Dictionary key must be a name object
Syntax Error (2207): Dictionary key must be a name object
Syntax Error (1601): Dictionary key must be a name object
Syntax Error (1603): Dictionary key must be a name object
Syntax Error (1605): Dictionary key must be a name object
Syntax Error (1607): Dictionary key must be a name object
Syntax Error (1609): Dictionary key must be a name object
Syntax Error (1611): Dictionary key must be a name object
Syntax Error (1613): Dictionary key must be a name object
Syntax Error (1615): Dictionary key must be a name object
Syntax Error (1617): Dictionary key must be a name object
Syntax Error (1619): Dictionary key must be a name object
Syntax Error (1621): Dictionary key must be a name object
Syntax Error (1623): Dictionary key must be a name object
Syntax Error (1638): Dictionary key must be a name object
Syntax Error (1310): Dictionary key must be a name object
Syntax Error (1312): Dictionary key must be a name object
Syntax Error (1314): Dictionary key must be a name object
Syntax Error (1316): Dictionary key must be a name object
Syntax Error (1318): Dictionary key must be a name object
Syntax Error (1320): Dictionary key must be a name object
Syntax Error (1322): Dictionary key must be a name object
Syntax Error (1324): Dictionary key must be a name object
Syntax Error (1326): Dictionary key must be a name object
Syntax Error (1328): Dictionary key must be a name object
Syntax Error (1330): Dictionary key must be a name object
Syntax Error (1332): Dictionary key must be a name object
Syntax Error (1339): Dictionary key must be a name object
Syntax Error (1601): Dictionary key must be a name object
Syntax Error (1603): Dictionary key must be a name object
Syntax Error (1605): Dictionary key must be a name object
Syntax Error (1607): Dictionary key must be a name object
Syntax Error (1609): Dictionary key must be a name object
Syntax Error (1611): Dictionary key must be a name object
Syntax Error (1613): Dictionary key must be a name object
Syntax Error (1615): Dictionary key must be a name object
Syntax Error (1617): Dictionary key must be a name object
Syntax Error (1619): Dictionary key must be a name object
Syntax Error (1621): Dictionary key must be a name object
Syntax Error (1623): Dictionary key must be a name object
Syntax Error (1638): Dictionary key must be a name object
Syntax Error (1310): Dictionary key must be a name object
Syntax Error (1312): Dictionary key must be a name object
Syntax Error (1314): Dictionary key must be a name object
Syntax Error (1316): Dictionary key must be a name object
Syntax Error (1318): Dictionary key must be a name object
Syntax Error (1320): Dictionary key must be a name object
Syntax Error (1322): Dictionary key must be a name object
Syntax Error (1324): Dictionary key must be a name object
Syntax Error (1326): Dictionary key must be a name object
Syntax Error (1328): Dictionary key must be a name object
Syntax Error (1330): Dictionary key must be a name object
Syntax Error (1332): Dictionary key must be a name object
Syntax Error (1339): Dictionary key must be a name object
Syntax Error (1161): Dictionary key must be a name object
Syntax Error (1163): Dictionary key must be a name object
Syntax Error (1165): Dictionary key must be a name object
Syntax Error (1167): Dictionary key must be a name object
Syntax Error (1169): Dictionary key must be a name object
Syntax Error (1171): Dictionary key must be a name object
Syntax Error (1173): Dictionary key must be a name object
Syntax Error (1175): Dictionary key must be a name object
Syntax Error (1177): Dictionary key must be a name object
Syntax Error (1187): Dictionary key must be a name object
Syntax Error (2175): Dictionary key must be a name object
Syntax Error (2177): Dictionary key must be a name object
Syntax Error (2179): Dictionary key must be a name object
Syntax Error (2181): Dictionary key must be a name object
Syntax Error (2183): Dictionary key must be a name object
Syntax Error (2185): Dictionary key must be a name object
Syntax Error (2187): Dictionary key must be a name object
Syntax Error (2189): Dictionary key must be a name object
Syntax Error (2191): Dictionary key must be a name object
Syntax Error (2193): Dictionary key must be a name object
Syntax Error (2195): Dictionary key must be a name object
Syntax Error (2197): Dictionary key must be a name object
Syntax Error (2207): Dictionary key must be a name object
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
NOPJPT+                              CID TrueType      Identity-H       yes no  yes     11  0
VIRUHA+                              CID TrueType      Identity-H       yes no  yes     13  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no      14  0

❯ pdffonts build/example.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
LZTAVW+Inter-Bold                    CID TrueType      Identity-H       yes yes yes     11  0
RXLBLQ+Inter-Regular                 CID TrueType      Identity-H       yes yes yes     13  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no      14  0

example.pdf is the Bun version, while build/example.pdf is the Node version.

What is the expected behavior?

Fonts should be embedded properly, as they are when running under Node.

What do you see instead?

Fonts are not embedded properly, so PDF viewers fall back to a different font. If your PDF reader does not do fallbacks, the text simply does not appear.

Additional information

Related issue on react-pdf repo.

https://github.com/diegomura/react-pdf/issues/2429

ttmx commented 5 months ago

This also seems to trigger other errors.

Making a larger PDF results in an out-of-bounds memory access.

RuntimeError: Out of bounds memory access (evaluating 'd.apply(null, p)')
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])

I believe this may be some sort of off-by-one error whose effect grows with larger PDFs?

af commented 4 months ago

@ttmx Thanks for the context about the font embedding. I was able to use that info to track down where the bug occurs in the react-pdf codebase.

The problematic line is https://github.com/diegomura/react-pdf/blob/master/packages/pdfkit/src/font/embedded.js#L177, where this.font.postscriptName is a Buffer. Logging that value while running my react-pdf script in Bun gives:

Buffer(44) [ 0, 69, 0, 66, 0, 71, 0, 97, 0, 114, 0, 97, 0, 109, 0, 111, 0, 110, 0, 100, 0, 83, 0, 67, 0, 48, 0, 56, 0, 45, 0, 82, 0, 101, 0, 103, 0, 117, 0, 108, 0, 97, 0, 114 ]

So it appears that the buffer holds a 16-bit encoded string. When that buffer is implicitly converted to a string via const name = tag + '+' + this.font.postscriptName;, Bun assumes it's UTF-8, and as a result the string contains null bytes that break the font embedding. You can verify this by opening the resulting PDF in a text editor and searching for BaseFont: the null bytes show up in the corresponding value.
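The text-editor check can also be scripted. The following is a minimal sketch, not part of react-pdf or Bun: findBaseFonts is a hypothetical helper, and it assumes the font dictionaries are stored uncompressed in the file so that /BaseFont entries are visible in the raw bytes.

```javascript
// Hypothetical helper: list every /BaseFont name found in raw PDF bytes
// and flag names that contain NUL bytes.
function findBaseFonts(pdfBytes) {
  // latin1 maps each byte to exactly one character, so nothing is lost.
  const text = Buffer.from(pdfBytes).toString('latin1');
  const pattern = /\/BaseFont\s*\/([^\s\/<>\[\]()]*)/g;
  const results = [];
  for (const match of text.matchAll(pattern)) {
    results.push({ name: match[1], hasNul: match[1].includes('\0') });
  }
  return results;
}

// Synthetic fragments standing in for a broken and a healthy file.
const broken = Buffer.from('/Type /Font\n/BaseFont /ABCDEF+\0E\0B\n', 'latin1');
const healthy = Buffer.from('/Type /Font\n/BaseFont /LZTAVW+Inter-Bold\n', 'latin1');

console.log(findBaseFonts(broken)[0].hasNul);  // true
console.log(findBaseFonts(healthy)[0].hasNul); // false
```

To run it against a real file, pass the bytes read from disk, for example the result of fs.readFileSync('example.pdf').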

I'm not sure whether Node automatically infers the text encoding of a buffer in a way that Bun does not, which might explain the difference in behavior. I tried explicitly calling toString(encoding) on the buffer instead, with a few different encodings, but that didn't seem to help.
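For reference, the logged bytes follow the pattern of UTF-16 big-endian text (a zero byte before each ASCII code). Buffer's toString accepts 'utf16le' but has no big-endian counterpart, which may be why trying different encodings did not help. Below is a minimal sketch of decoding the bytes by hand; decodeUtf16BE is a hypothetical helper, not something react-pdf or fontkit provides, and it assumes the name really is UTF-16BE.

```javascript
// Bytes copied from the logged Buffer above.
const raw = Buffer.from([
  0, 69, 0, 66, 0, 71, 0, 97, 0, 114, 0, 97, 0, 109, 0, 111, 0, 110, 0, 100,
  0, 83, 0, 67, 0, 48, 0, 56, 0, 45, 0, 82, 0, 101, 0, 103, 0, 117, 0, 108,
  0, 97, 0, 114,
]);

// Decoding as UTF-8 keeps every 0x00 byte as a NUL character.
const asUtf8 = raw.toString('utf8');

// Hypothetical helper: read one big-endian 16-bit code unit at a time.
function decodeUtf16BE(buf) {
  let out = '';
  for (let i = 0; i + 1 < buf.length; i += 2) {
    out += String.fromCharCode(buf.readUInt16BE(i));
  }
  return out;
}

// Alternative: swap each byte pair on a copy, then decode as little-endian.
const viaSwap = Buffer.from(raw).swap16().toString('utf16le');

console.log(asUtf8.length);      // 44, half of them NUL characters
console.log(decodeUtf16BE(raw)); // EBGaramondSC08-Regular
console.log(viaSwap);            // EBGaramondSC08-Regular
```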

A really hacky fix that works is to convert to string and strip out the null bytes:

var name = tag + '+' + String(this.font.postscriptName).replace(/\0/g, '')
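To illustrate why this workaround is only a stopgap, here is a small sketch under the same UTF-16 big-endian assumption. The tag value and the stripNulls helper are hypothetical and used only for the demonstration. Stripping NUL bytes recovers ASCII names, but a character outside ASCII is already damaged by the UTF-8 decoding step.

```javascript
const tag = 'ABCDEF'; // hypothetical subset tag, for illustration only

// Hypothetical helper mirroring the workaround: coerce to string, drop NULs.
function stripNulls(postscriptName) {
  return String(postscriptName).replace(/\0/g, '');
}

// "EB" as UTF-16BE bytes: the workaround recovers the ASCII text.
const asciiName = Buffer.from([0, 69, 0, 66]);
console.log(tag + '+' + stripNulls(asciiName)); // ABCDEF+EB

// "é" (U+00E9) as UTF-16BE bytes: 0xE9 alone is not valid UTF-8, so the
// decoder substitutes U+FFFD and the original character is lost.
const accentedName = Buffer.from([0x00, 0xe9]);
console.log(stripNulls(accentedName) === '\uFFFD'); // true
```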

However, I'm hoping someone more familiar with Bun's buffer handling will be able to diagnose and fix the underlying issue. I think using StringDecoder might help, but as react-pdf needs to support in-browser use cases, I'm not sure it's appropriate.

EDIT: I dug a little deeper, and the root problem is exactly as described here: https://github.com/oven-sh/bun/issues/8252. So this should be fixed when https://github.com/oven-sh/bun/issues/6084 is resolved.