Closed by frnsys 1 year ago
I played around with this some more and I can avoid the segfault by changing the config to:
let conf = PdfRenderConfig::new()
    .render_form_data(false) // Added this
    .render_annotations(false)
    .scale_page_by_factor(3.);
Comparing the output of pdfinfo for each file:
Pdf that segfaults:
Custom Metadata: yes
Metadata Stream: yes
Tagged: no
UserProperties: no
Suspects: no
Form: AcroForm
JavaScript: no
Pages: 10
Encrypted: no
Page size: 595.276 x 793.701 pts
Page rot: 0
File size: 3060469 bytes
Optimized: no
PDF version: 1.7
Pdf that doesn't segfault:
Custom Metadata: no
Metadata Stream: no
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 10
Encrypted: no
Page size: 595.276 x 793.701 pts
Page rot: 0
File size: 3031783 bytes
Optimized: no
PDF version: 1.7
So I guess it has something to do with AcroForm?
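The comparison of the two pdfinfo listings can also be done mechanically. The following is a minimal sketch, assuming the output is plain "Key: value" lines as shown above; the helper names parse_pdfinfo and diff_pdfinfo are hypothetical and are not part of pdfinfo or pdfium-render. The sample input is an abbreviated subset of the fields quoted in this thread.

```rust
use std::collections::BTreeMap;

/// Parse pdfinfo-style "Key: value" lines into a map sorted by key.
/// Lines without a colon are skipped.
fn parse_pdfinfo(output: &str) -> BTreeMap<String, String> {
    output
        .lines()
        .filter_map(|line| line.split_once(':'))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Return (key, value_in_a, value_in_b) for every key whose value differs.
/// A key missing from one side is reported with an empty string on that side.
fn diff_pdfinfo(a: &str, b: &str) -> Vec<(String, String, String)> {
    let a = parse_pdfinfo(a);
    let b = parse_pdfinfo(b);
    let mut diffs = Vec::new();
    for (key, value_a) in &a {
        let value_b = b.get(key).cloned().unwrap_or_default();
        if *value_a != value_b {
            diffs.push((key.clone(), value_a.clone(), value_b));
        }
    }
    for (key, value_b) in &b {
        if !a.contains_key(key) {
            diffs.push((key.clone(), String::new(), value_b.clone()));
        }
    }
    diffs
}

fn main() {
    // Abbreviated subsets of the two listings from the thread.
    let segfaults = "Custom Metadata: yes\nMetadata Stream: yes\nForm: AcroForm\nPages: 10";
    let works = "Custom Metadata: no\nMetadata Stream: no\nForm: none\nPages: 10";
    for (key, a, b) in diff_pdfinfo(segfaults, works) {
        println!("{key}: {a} -> {b}");
    }
}
```

Applied to the full listings, this would flag Custom Metadata, Metadata Stream, Form, and File size as the only differing fields, which is what points the suspicion at the AcroForm.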
Hi @frnsys, thank you for reporting the issue.
I cannot reproduce the problem on Arch Linux with Pdfium build 118.0.5989.0 sourced from https://github.com/bblanchon/pdfium-binaries/releases/tag/chromium%2F5989. Tell me about your operating system and runtime environment.
I used the following test code, which I believe is almost identical to what you originally provided. All I did was add a main() and some logging statements.
use pdfium_render::prelude::*;

fn main() -> Result<(), PdfiumError> {
    extract_annotations_images("./test.pdf")?;
    extract_annotations_images("./test_2.pdf")?;
    extract_annotations_images("./test_copy.pdf")?;
    Ok(())
}

fn extract_annotations_images(path: &str) -> Result<(), PdfiumError> {
    let pdfium = Pdfium::new(
        Pdfium::bind_to_library(Pdfium::pdfium_platform_library_name_at_path("../pdfium/"))
            .or_else(|_| Pdfium::bind_to_system_library())?,
    );
    let mut document = pdfium.load_pdf_from_file(path, None)?;
    for (page_num, mut page) in document.pages_mut().iter().enumerate() {
        println!("{}, page {}", path, page_num);
        for i in 0..page.annotations().len() {
            let annotation = page.annotations().get(i).unwrap();
            if let PdfPageAnnotationType::Square = annotation.annotation_type() {
                println!("  Annotation {}", i);
                let bounds = annotation.bounds().unwrap();
                let conf = PdfRenderConfig::new()
                    .render_annotations(false)
                    .scale_page_by_factor(3.);
                let orig_crop = page.boundaries().crop().unwrap().bounds;
                page.boundaries_mut().set_crop(bounds).unwrap();
                {
                    // If I comment these two lines out, no segfault.
                    let bmap = page.render_with_config(&conf).unwrap();
                    bmap.as_image()
                        .save_with_format(
                            format!("./foo-{}-{}.png", page_num, i),
                            image::ImageFormat::Png,
                        )
                        .unwrap();
                }
                page.boundaries_mut().set_crop(orig_crop).unwrap();
            }
        }
    }
    Ok(())
    // Segfault here
}
(PS: I think, from memory, you can set a render clip as part of your render config, so you don't have to apply a crop boundary to each page. Using the render config may be more convenient.)
Thanks for the tip!
As for the other details, I'm using Ubuntu 22.04, kernel 5.15.0-83-generic. What other details would be relevant?
Well, I'm grasping at straws here, but what version of rustc are you using to compile?
Not sure off the top of my head why Arch and Ubuntu would behave differently. Your Ubuntu machine isn't virtualised in any way, is it?
Can I absolutely, 100% confirm that you can still reproduce the problem using my sample code above?
Bizarre, this is my rust info:
stable-x86_64-unknown-linux-gnu (default)
rustc 1.72.0 (5680fa18f 2023-08-23)
I still get the segfault with your code, but at least I have something that works on my end now. I'm guessing it's just some quirk with my setup, so I'm happy to close this if you want.
Thanks for your help!
Cannot reproduce the problem on a fresh virtual install of Ubuntu 22.04 with rust 1.72.0. Admittedly a fresh install of Ubuntu includes kernel version 6.2.0-32 rather than 5.15.0-83, but I wouldn't have thought it should make a difference anyway.
Unless you have any additional suggestions for how I can reproduce the problem, I think I'm going to close this.
Strange, I'll keep investigating. But thank you for looking into it.
Hi, thanks for this library. I'm encountering a problem where I get a segfault when the PDF document or pages are dropped:
I've tried with both the master branch and version 8.10.0. pdfium is version 118.0.5989.0.
As I was preparing this example I noticed something strange. The original copy of the PDF I have (test.pdf in the attachments) segfaults with the bitmap lines, but a copy I made using pdftk test.pdf cat 0-10 output test_copy.pdf doesn't segfault. If I just create a direct copy with pdftk test.pdf cat 0-10 output test_copy_2.pdf that copy does segfault.
Any idea what could be going wrong?
pdfs.zip