Open ttmx opened 5 months ago
Seems this also shows some other errors.
Making a larger pdf results in an out of bounds memory access.
RuntimeError: Out of bounds memory access (evaluating 'd.apply(null, p)')
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
    at WASM ([wasm code])
I believe this may be some sort of off-by-one error that gets worse with larger PDFs?
@ttmx Thanks for the context about the font embedding. I was able to use that info to track down where the bug occurs in the react-pdf codebase.
The problematic line is https://github.com/diegomura/react-pdf/blob/master/packages/pdfkit/src/font/embedded.js#L177, where this.font.postscriptName is a Buffer. When I log that value while running my react-pdf script in Bun, it looks like this:
Buffer(44) [ 0, 69, 0, 66, 0, 71, 0, 97, 0, 114, 0, 97, 0, 109, 0, 111, 0, 110, 0, 100, 0, 83, 0, 67, 0, 48, 0, 56, 0, 45, 0, 82, 0, 101, 0, 103, 0, 117, 0, 108, 0, 97, 0, 114 ]
So the buffer appears to hold a 16-bit encoded string, but when it is implicitly converted to a string via const name = tag + '+' + this.font.postscriptName, Bun assumes it's UTF-8, and as a result there are null bytes in the string that mess up the font embedding. You can verify this by opening the resulting PDF in a text editor and searching for BaseFont; you'll see the null bytes in the corresponding value.
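The BaseFont check described above can also be scripted. The following is only a rough sketch: findBaseFontNames is a hypothetical helper, the two test fragments are synthetic rather than taken from a real file, and the scan assumes the font dictionaries are stored uncompressed (which the text-editor check suggests they are).

```javascript
// Hypothetical helper: list the /BaseFont names found in raw PDF bytes
// and flag any name that contains a NUL byte.
function findBaseFontNames(pdfBytes) {
  // latin1 maps every byte to exactly one character, so NUL bytes survive as '\0'.
  const text = Buffer.from(pdfBytes).toString('latin1');
  // A PDF name starts with '/' and runs until whitespace or a delimiter.
  const pattern = /\/BaseFont\s*\/([^\s\/\[\]<>()]+)/g;
  const results = [];
  for (const match of text.matchAll(pattern)) {
    results.push({ name: match[1], hasNullBytes: match[1].includes('\0') });
  }
  return results;
}

// Synthetic fragments for illustration only (not real react-pdf output).
const broken = Buffer.from('/BaseFont /ABCDEF+\0E\0B\0G /Subtype /CIDFontType2', 'latin1');
const clean = Buffer.from('/BaseFont /ABCDEF+EBG /Subtype /CIDFontType2', 'latin1');

console.log(findBaseFontNames(broken)[0].hasNullBytes); // true
console.log(findBaseFontNames(clean)[0].hasNullBytes); // false
```

Against a real file, the bytes would come from something like fs.readFileSync('example.pdf') instead of the synthetic buffers.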
I'm not sure whether Node infers the text encoding of a buffer automatically in a way that Bun does not, which might explain the divergence in behavior. I tried explicitly calling toString(encoding) on the buffer instead, with a few different encodings, but that didn't seem to help.
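For illustration, here is a small sketch using the bytes from the logged Buffer above. It assumes those bytes are big-endian UTF-16 (high byte first), which is a guess based on the byte pattern. Node's Buffer offers 'utf16le' but no big-endian counterpart, so that may be why trying a few encodings with toString did not help; swapping each byte pair first is one way around it.

```javascript
// Bytes copied from the logged Buffer(44) above.
const bytes = [
  0, 69, 0, 66, 0, 71, 0, 97, 0, 114, 0, 97, 0, 109, 0, 111, 0, 110, 0, 100,
  0, 83, 0, 67, 0, 48, 0, 56, 0, 45, 0, 82, 0, 101, 0, 103, 0, 117, 0, 108,
  0, 97, 0, 114,
];
const raw = Buffer.from(bytes);

// Decoding as UTF-8 keeps every 0x00 byte as a U+0000 character.
const asUtf8 = raw.toString('utf8');

// Assumption: the data is UTF-16BE. 'utf16le' expects the low byte first,
// so swap each byte pair on a copy (swap16 mutates in place) before decoding.
const asUtf16 = Buffer.from(raw).swap16().toString('utf16le');

console.log(asUtf8.length); // 44
console.log(asUtf8.includes('\0')); // true
console.log(asUtf16); // EBGaramondSC08-Regular
```

This only shows how the logged bytes can be read; it does not show how react-pdf or Bun handle the value internally.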
A really hacky fix that works is to convert to string and strip out the null bytes:
var name = tag + '+' + String(this.font.postscriptName).replace(/\0/g, '')
However, I'm hoping someone more familiar with Bun's buffer handling will be able to diagnose and fix the underlying issue. I think using StringDecoder might help, but as react-pdf needs to support in-browser use cases, I'm not sure it's appropriate.
EDIT: I dug a little deeper, and the root problem is exactly as described here: https://github.com/oven-sh/bun/issues/8252. So this should be fixed when https://github.com/oven-sh/bun/issues/6084 is resolved.
What version of Bun is running?
1.0.25+a8ff7be64
What platform is your computer?
Linux 6.7.0-arch3-1 x86_64 unknown
What steps can reproduce the bug?
Attempt to include custom fonts in a PDF file using react-pdf.
This example is copy-pasted from the issue in react-pdf's repo.
The issue occurs whether you use Bun.write or the "render" function from the react-pdf lib, so the cause seems to lie somewhere else.
Here is a comparison of running "pdffonts" against the Bun output and the Node output.
example.pdf is the Bun version, while build/example.pdf is the Node version.
What is the expected behavior?
Fonts should be embedded properly, as they are under Node.
What do you see instead?
Fonts are not embedded properly, so when viewing the PDF they fall back to something else, and if your PDF reader does not do fallbacks, the text simply does not appear.
Additional information
Related issue on react-pdf repo.
https://github.com/diegomura/react-pdf/issues/2429