bblanchon / pdfium-binaries

📰 Binary distribution of PDFium

crash when also linked against debian C++ libraries #124

Closed jcupitt closed 7 months ago

jcupitt commented 8 months ago

Hello, first, thank you for this nice thing!

I've hit an issue when linking against both pdfium-linux-x64.tgz and libheif, a C++ library for loading HEIC and AVIF images as packaged by Debian. I imagine this issue will occur with any C++ library in the deb repo.

https://github.com/brandoncc/heroku-buildpack-vips/issues/41#issuecomment-1722462354

What seems to be happening is that the allocators in the pdfium binary and Debian's libheif are getting mixed up, I suppose because the C++ compilers do not have a compatible ABI.

I think I could fix this by building my own libheif with the same compiler and compiler version that was used to build pdfium, though it's not immediately obvious what this is.

Maybe the pdfium download table could list the compiler that each binary is compatible with, e.g. clang-15 or gcc-12?

jcupitt commented 8 months ago

I could put together a reproducer for this crash, if that would be helpful.

bblanchon commented 8 months ago

Hi John,

Thank you very much for reporting this issue. I'm surprised that two shared libraries could conflict this way, but I'm not an expert. However, I doubt this is an ABI issue, so I'd prefer to get confirmation before adding any information to the README. Hopefully, someone more knowledgeable will shed some light on this issue.

Best regards, Benoit

jcupitt commented 8 months ago

Hi @bblanchon,

I made a standalone reproducer:

https://github.com/jcupitt/docker-builds/tree/pdfium-crash/libvips-heroku22

If you run:

$ docker build -t libvips-heroku22 .

The build will end with a crash:

 => ERROR [22/22] RUN cd x   && echo 'require "vips"; Vips::Image.new_fro  3.6s
------
 > [22/22] RUN cd x   && echo 'require "vips"; Vips::Image.new_from_file("../demo.heic")' |      DISABLE_SPRING=1 bundle exec bin/rails c:
2.754 Loading development environment (Rails 6.1.4.1)
2.754 Switch to inspect mode.
2.755 require "vips"; Vips::Image.new_from_file("../demo.heic")
2.759 /var/lib/gems/3.0.0/gems/ruby-vips-2.1.4/lib/vips/operation.rb:225: [BUG] Segmentation fault at 0x000000000000001d
2.759 ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
2.759
2.759 -- Control frame information -----------------------------------------------
2.759 c:0052 p:---- s:0324 e:000323 CFUNC  :vips_cache_operation_build
2.759 c:0051 p:0012 s:0319 e:000318 METHOD /var/lib/gems/3.0.0/gems/ruby-vips-2.1.4/lib/vips/operation.rb:225

If I run in gdb, I see:

Thread 1 "ruby" hit Breakpoint 2, vips_foreign_load_heif_build (object=0x556f1aa49120) at ../libvips/foreign/heifload.c:248
248 {
(gdb) n
249     VipsForeignLoadHeif *heif = (VipsForeignLoadHeif *) object;
(gdb)
255     if( heif->source &&
(gdb)
256         vips_source_rewind( heif->source ) )
(gdb)
255     if( heif->source &&
(gdb)
259     if( !heif->ctx ) {
(gdb)
262         heif->ctx = heif_context_alloc();
(gdb)
265             heif->unlimited ? USHRT_MAX : 0x4000 );
(gdb)
264         heif_context_set_maximum_image_size_limit( heif->ctx,
(gdb)
268             heif->reader, heif, NULL );
(gdb)
267         error = heif_context_read_from_reader( heif->ctx,
(gdb)

Thread 1 "ruby" received signal SIGSEGV, Segmentation fault.
0x00007f79c35592b7 in allocator_shim::internal::PartitionFree(allocator_shim::AllocatorDispatch const*, void*, void*) () from /usr/local/lib/libpdfium.so

So heif_context_read_from_reader(), part of the libheif API, is unexpectedly calling allocator_shim::internal::PartitionFree() in libpdfium.so and crashing.
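One way to investigate a trace like this is to check whether libpdfium.so itself exports allocator entry points such as malloc and free, since the dynamic linker could then bind another library's calls to them. The following is only a minimal sketch, not something from this thread: it assumes you have already captured the text output of `nm -D --defined-only /usr/local/lib/libpdfium.so`, and the helper name `exported_allocator_symbols` and the sample listing are hypothetical.

```python
ALLOCATOR_NAMES = {"malloc", "calloc", "realloc", "free"}


def exported_allocator_symbols(nm_output: str) -> list[str]:
    """Return allocator functions that the nm listing shows as defined.

    Expects text in the default `nm -D --defined-only` layout:
    "<address> <type> <name>", one symbol per line.
    """
    found = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        _address, sym_type, name = parts
        # Drop a symbol-version suffix such as "@@GLIBC_2.2.5" if present.
        name = name.split("@", 1)[0]
        # "T" marks a global function, "W" a weak one.
        if sym_type in ("T", "W") and name in ALLOCATOR_NAMES:
            found.add(name)
    return sorted(found)


# Hypothetical listing for illustration only, not real libpdfium.so output.
sample = """\
0000000000a1b2c0 T FPDF_InitLibrary
0000000000a1b3d0 T FPDF_DestroyLibrary
0000000000c3d4e0 T malloc
0000000000c3d5f0 T free
"""
print(exported_allocator_symbols(sample))
```

A non-empty result would suggest the library can take over allocation for the whole process, which would be consistent with the gdb trace above. An empty result would point elsewhere.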

jcupitt commented 8 months ago

Curiously, it all works when you run ruby directly. You only get the crash when running in the rails console, which I don't understand.

The background here is that libvips is the default image processing library for Ruby on Rails, and it'd be great to be able to use PDFium to render PDF files. Unfortunately it seems you can't use both PDFium and libheif in the same binary inside rails, so, for now at least, rails has to stick with poppler.

jcupitt commented 8 months ago

A friend of mine thinks it might be a mixup between libc++ and libstdc++. Apparently, the pdfium build scripts should perhaps set use_custom_libcxx for glibc as well as musl. He might post here later today.

kleisauke commented 7 months ago

I think the issue is that the resulting library is linked against two different implementations of the C++ standard library, namely libc++ and libstdc++, which would lead to an ODR violation and possibly mysterious behavior.

By default, Chromium (and consequently PDFium) statically links against a custom in-tree libc++. This approach allows them to update the standard library independently, without being constrained by the minimum C++ standard library version available on various operating systems. https://github.com/chromium/chromium/blob/119.0.6019.3/build/config/c%2B%2B/c%2B%2B.gni#L10-L16

Setting use_custom_libcxx = false in the PDFium build scripts will probably fix this, but this could cause compatibility issues. For example, compiling the binaries on Ubuntu 22.04 (providing libstdc++.so.6.0.30 / GLIBCXX_3.4.30) would make these pre-built binaries incompatible with Ubuntu 20.04 (providing libstdc++.so.6.0.28 / GLIBCXX_3.4.28).
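The compatibility concern comes down to comparing version tags. Here is a minimal sketch, assuming the newest tag the binary requires and the newest tag the system provides are already known. The function names `parse_glibcxx` and `can_load` are hypothetical, and extracting the tags from the actual libraries is left out.

```python
def parse_glibcxx(tag: str) -> tuple[int, ...]:
    """Turn a tag such as "GLIBCXX_3.4.30" into (3, 4, 30) for comparison."""
    prefix = "GLIBCXX_"
    if not tag.startswith(prefix):
        raise ValueError(f"not a GLIBCXX version tag: {tag!r}")
    return tuple(int(part) for part in tag[len(prefix):].split("."))


def can_load(required: str, provided: str) -> bool:
    """True if the system's libstdc++ is at least as new as the binary needs."""
    return parse_glibcxx(provided) >= parse_glibcxx(required)


# A binary needing GLIBCXX_3.4.30 on a system that provides GLIBCXX_3.4.28.
print(can_load("GLIBCXX_3.4.30", "GLIBCXX_3.4.28"))  # False
# The reverse case: an older requirement on a newer system.
print(can_load("GLIBCXX_3.4.28", "GLIBCXX_3.4.30"))  # True
```

Note that a binary only requires the newest tag among the symbols it actually uses, so building on Ubuntu 22.04 does not necessarily mean requiring GLIBCXX_3.4.30.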

(I'm currently doing a test release on commit https://github.com/kleisauke/pdfium-binaries/commit/9aa6e93fd6fd399f0eab567587b628fff21fa98f, to verify if that would resolve this issue)

kleisauke commented 7 months ago

It still crashes with use_custom_libcxx = false. :(

> which would lead to an ODR violation and possibly mysterious behavior.

Actually, this is not an ODR violation. It's fine to statically link libc++ into libpdfium.so and link libheif.so against the shared libstdc++ library, as long as you don't pass C++ objects between them.

  graph TD;
      libvips.so.42-->libheif.so.1;
      libvips.so.42-->libpdfium.so;
      libpdfium.so-->libc++;
      libheif.so.1-->libstdc++.so.6;
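The graph can also be walked in code to confirm that each C++ runtime is reached through exactly one library. This is only an illustrative sketch: the edges are copied from the graph above, while the dictionary layout and the function name `runtimes_reachable_from` are hypothetical.

```python
# Edges copied from the dependency graph above.
DEPENDS_ON = {
    "libvips.so.42": ["libheif.so.1", "libpdfium.so"],
    "libpdfium.so": ["libc++"],
    "libheif.so.1": ["libstdc++.so.6"],
}
CXX_RUNTIMES = {"libc++", "libstdc++.so.6"}


def runtimes_reachable_from(root: str) -> dict[str, list[str]]:
    """Map each C++ runtime reachable from root to one path that reaches it."""
    result: dict[str, list[str]] = {}
    stack = [[root]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node in CXX_RUNTIMES:
            result.setdefault(node, path)
            continue
        for dep in DEPENDS_ON.get(node, []):
            if dep not in path:  # guard against cycles
                stack.append(path + [dep])
    return result


for runtime, path in sorted(runtimes_reachable_from("libvips.so.42").items()):
    print(runtime, "via", " -> ".join(path))
```

Both runtimes end up in the same process, but each sits behind a single library, which is why this arrangement is fine as long as no C++ objects cross between libpdfium.so and libheif.so.1.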

> So heif_context_read_from_reader(), part of the libheif API, is unexpectedly calling allocator_shim::internal::PartitionFree() in libpdfium.so and crashing.

The PartitionAlloc memory allocator can be disabled by building with pdf_use_partition_alloc = false. https://github.com/chromium/chromium/blob/119.0.6019.3/build_overrides/pdfium.gni#L17-L19

Perhaps Ruby on Rails uses a custom memory allocator that clashes with PartitionAlloc? I'll do another test release with commit https://github.com/kleisauke/pdfium-binaries/commit/44bd34945060ae7725fcb71d560af374eeba2ffc.

kleisauke commented 7 months ago

Setting pdf_use_partition_alloc = false seems to work! 🎉

Tested with:

$ docker build -t libvips-heroku22 --build-arg="PDFIUM_VERSION=6015-avoid-partition-alloc" --build-arg="PDFIUM_URL=https://github.com/kleisauke/pdfium-binaries/releases/download/chromium" .

I'll do another test release with commit https://github.com/kleisauke/pdfium-binaries/commit/049c36fadca06d7409b1c0bd56ede3a82f01fbbe, as I think that's the underlying issue.

kleisauke commented 7 months ago

> I'll do another test release with commit https://github.com/kleisauke/pdfium-binaries/commit/049c36fadca06d7409b1c0bd56ede3a82f01fbbe, as I think that's the underlying issue.

That seems to be the culprit. Tested with:

$ docker build -t libvips-heroku22 --build-arg="PDFIUM_VERSION=6015-avoid-allocator-shim" --build-arg="PDFIUM_URL=https://github.com/kleisauke/pdfium-binaries/releases/download/chromium" .

PR #128 should fix this.