Open samsieber opened 2 hours ago
I have further isolated the issue. It appears to happen when we copy data over. Here's a new loop I've been using:
// Add text line-by-line, handling newlines
for line in text.lines() {
    log::info!("Creating text line: '{}'", line);
    let mut text_object = PdfPageTextObject::new(&document, line, font, font_size)?;
    log::info!("Reading before attaching: '{}'", text_object.text());
    text_object.set_text(line).unwrap();
    log::info!("Reading overwritten before attaching: '{}'", text_object.text());
    text_object.set_fill_color(PdfColor::new(0, 0, 0, 255))?; // Set text color to black

    // Position the text on the page
    text_object.translate(PdfPoints::new(0.0), y_offset)?;
    y_offset -= PdfPoints::new(font_size.value * 1.4); // Adjust `y_offset` with `PdfPoints`

    // Add the text object to the page
    let mut to = page.objects_mut().add_text_object(text_object)?;
    log::info!("Reading after attaching: '{}'", to.as_text_object().unwrap().text());
    to.as_text_object_mut().unwrap().set_text(line).unwrap();
    log::info!("Reading overwritten after attaching: '{}'", to.as_text_object().unwrap().text());
}
And here's the output for that:
Creating text line: 'new test'
Reading before attaching: ''
Reading overwritten before attaching: ''
Reading after attaching: 'new testÿÿK'
Reading overwritten after attaching: 'new test'
Notably, I cannot fix the text after noticing that it was added incorrectly.
Details:
pdfium-render = { version = "0.8.26", default-features = false, features = ["thread_safe", "image_025", "image", "pdfium_6666"]}
I am using pdfium-render both to render normal PDFs and to render PDFs that I generate in order to produce text overlays in an image-related application. When I do both, I sometimes get extra data at the end of the text I generate.
So, I might call
PdfPageTextObject::new(&document, "new test", font, font_size)
but if I later call .text() on said PdfPageTextObject, I could get back "new testÿÿK". My theory at this point is that something is going astray with memory management, but I'm not sure. My reproduction repository is a simplification of code from my own application. My application targets wasm and native; I've only tried to reproduce the issue on wasm because that is the only target where I've ever seen it crop up.
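To make the memory theory concrete, here is a minimal, purely hypothetical sketch of one way stale memory can show up as trailing characters. It is not pdfium-render's or Pdfium's code, and nothing in this report confirms that this is the actual cause. It assumes text is exchanged as UTF-16LE bytes in a reused buffer; the helper names (encode_utf16le, reused_buffer, decode_utf16le) are made up for illustration.

```rust
/// Encode `text` as UTF-16LE bytes, without a terminator.
fn encode_utf16le(text: &str) -> Vec<u8> {
    text.encode_utf16().flat_map(|unit| unit.to_le_bytes()).collect()
}

/// Simulate a reused buffer (hypothetical): start from `stale` bytes left over
/// from an earlier use, then overwrite the beginning with the new text.
fn reused_buffer(stale: &[u8], text: &str) -> Vec<u8> {
    let encoded = encode_utf16le(text);
    let mut buffer = stale.to_vec();
    if buffer.len() < encoded.len() {
        buffer.resize(encoded.len(), 0);
    }
    buffer[..encoded.len()].copy_from_slice(&encoded);
    buffer
}

/// Decode the first `byte_len` bytes of `buffer` as UTF-16LE.
/// If `byte_len` is larger than the text that was actually written,
/// whatever was in the buffer before is decoded as well.
fn decode_utf16le(buffer: &[u8], byte_len: usize) -> String {
    let end = byte_len.min(buffer.len());
    let units: Vec<u16> = buffer[..end]
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16_lossy(&units)
}
```

With a stale buffer that is longer than the new text, decoding exactly the written length returns the expected string, while decoding the whole buffer returns the expected string followed by leftover characters, which is the same shape as the "new testÿÿK" symptom.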
Here's the core of the code: it creates the text objects and then, after adding them all, checks them all. The text read back sometimes, but not always, differs from what was set.
The first loop is where I set the text, and the second loop checks the text; sometimes the two loops print different sets of strings, but the difference is always that there's more text than what I expect.
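Since the difference is always extra text at the end, the check in the second loop can be narrowed to that specific pattern. The following is a small sketch using only the standard library; the ReadBack type and classify_read_back function are hypothetical names, not part of pdfium-render.

```rust
/// Outcome of comparing the text that was set with the text read back.
#[derive(Debug, PartialEq)]
enum ReadBack<'a> {
    /// The text read back is exactly what was set.
    Exact,
    /// The expected text is intact but followed by extra characters.
    TrailingGarbage(&'a str),
    /// The text read back does not start with the expected text at all.
    Mismatch,
}

/// Compare `actual` (what was read back) against `expected` (what was set).
fn classify_read_back<'a>(expected: &str, actual: &'a str) -> ReadBack<'a> {
    match actual.strip_prefix(expected) {
        Some("") => ReadBack::Exact,
        Some(extra) => ReadBack::TrailingGarbage(extra),
        None => ReadBack::Mismatch,
    }
}
```

Logging the TrailingGarbage payload for every object would show whether the extra characters are the same each time or vary between runs.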
See the reproduction repository for more details; it's a modified version of the wasm example from this repository. It loads a normal PDF, renders it, and then tries to build a PDF in memory with text. If I reverse that order, the issue goes away.