libvips / pyvips

python binding for libvips using cffi
MIT License

Invalid pointer to free on import #423

Open stumpylog opened 1 year ago

stumpylog commented 1 year ago

I'm investigating switching from convert to libvips for generating thumbnails from PDFs, using pyvips. However, when pyvips is imported into my celery worker, the worker dies with signal 6 and a single log line: free(): invalid pointer.

This comes from the import alone: it still happens after removing the code that reads the PDF and makes the thumbnail. I suspect it has something to do with the code running inside a celery worker, though I'm not quite sure how that works.

>> pip show pyvips
Name: pyvips
Version: 2.2.1
Summary: binding for the libvips image processing library, API mode
Home-page: https://github.com/libvips/pyvips
Author: John Cupitt
Author-email: jcupitt@gmail.com
License: MIT
Location: /usr/local/lib/python3.11/site-packages
Requires: cffi, pkgconfig
Required-by:

>> dpkg -l | grep vips
ii  libvips42:amd64                   8.14.1-3                       amd64        image processing system good for very large ones

jcupitt commented 1 year ago

Hi @stumpylog,

It sounds like a library version conflict, but you'd need to give more details about how you made the runtime. Can you make a Dockerfile that reproduces this issue, for example?

stumpylog commented 1 year ago

Since celery depends on a broker, like Redis, it's going to be hard to get a minimal example. I know this version was built from the sdist, against the libvips-dev provided by Bookworm, and libvips (aka libvips42) is installed. Both are the same version: 8.14.1-3

stumpylog commented 1 year ago

I should also note that entering a shell in the container, importing pyvips and generating a thumbnail there works fine. If I move the import from the top of the file into just the function that uses it, that's also fine.
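
For reference, the workaround of moving the import into the function might look like the sketch below. The function name make_pdf_thumbnail, its parameters and the 200-pixel default are hypothetical and not taken from the actual project. It assumes the libvips build can load PDFs (via poppler, as in this setup).

```python
def make_pdf_thumbnail(pdf_path, out_path, width=200):
    """Render a PDF to a thumbnail image file.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Deferred import: pyvips (and therefore libvips) is only loaded in the
    # process that actually runs this function, not in whichever process
    # first imports the module that defines it.
    import pyvips

    thumb = pyvips.Image.thumbnail(pdf_path, width)
    thumb.write_to_file(out_path)
    return out_path
```

With this layout, importing the module in a parent process does not initialise libvips there. Initialisation happens the first time a worker calls the function.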

stumpylog commented 1 year ago

Overriding the logging, I get this several times during container startup:

paperless-ngx-dev-webserver  | Loaded binary module _libvips
paperless-ngx-dev-webserver  | Module generated for libvips 8.14
paperless-ngx-dev-webserver  | Linked to libvips 8.14
paperless-ngx-dev-webserver  | Inited libvips

No logging from the worker itself, as far as I can tell.
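
To tell which process emits those messages, the log format can include the process ID. This is only a sketch: it assumes pyvips logs through the standard logging module under a logger named "pyvips", and the helper name enable_pyvips_debug_logging is made up. It has to run before pyvips is first imported for the import-time messages to show up.

```python
import logging


def enable_pyvips_debug_logging():
    """Show pyvips debug messages, each tagged with the emitting process ID."""
    handler = logging.StreamHandler()
    # %(process)d is a standard logging attribute holding the PID.
    handler.setFormatter(
        logging.Formatter("pid=%(process)d %(name)s: %(message)s")
    )
    # Assumption: pyvips uses a logger named after its package.
    logger = logging.getLogger("pyvips")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger
```

If the same PID prints "Inited libvips" before the worker processes are created, libvips was initialised in the parent and its state was inherited by the children.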

jcupitt commented 1 year ago

You can have issues if you mix fork() and threads, could that be it? You need to not do any image processing outside the request handler.

This should work:

  1. import vips
  2. fork() (I think ngx will do this when a request comes in)
  3. process some images
  4. request handled, request process exits

And this as well:

  1. ngx starts up, request comes in
  2. fork()
  3. import vips
  4. process some image
  5. request handled, request process exits

But this will fail badly:

  1. import vips
  2. do almost anything with pyvips
  3. fork() for a request

Though I'm a bit vague about ngx, perhaps it uses persistent request handlers rather than forking each time?
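
The second safe ordering above (fork first, then import and process) can be sketched with plain os.fork(). This is a minimal POSIX-only illustration, not how celery or gunicorn manage their workers. The names run_in_forked_child and process_image are hypothetical.

```python
import os


def run_in_forked_child(work):
    """Fork, run work() in the child only, and report how the child ended.

    Returns the child's exit code, or minus the signal number if the child
    was killed by a signal (for example -6 for SIGABRT).
    """
    pid = os.fork()
    if pid == 0:
        # Child: anything imported or initialised inside work() happens
        # after the fork, so no library state or threads are inherited.
        try:
            work()
        except BaseException:
            os._exit(1)
        # os._exit skips the parent's atexit handlers and buffered output.
        os._exit(0)
    # Parent: it never touches the image library itself.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)


def process_image():
    # Hypothetical job body: pyvips is first imported after the fork.
    import pyvips

    pyvips.Image.black(64, 64).avg()
```

Calling run_in_forked_child(process_image) keeps all libvips activity in the child. The failing ordering corresponds to doing pyvips work in the parent before the os.fork() call.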

jcupitt commented 1 year ago

You could also try reordering your imports and see if that makes a difference.

stumpylog commented 1 year ago

I don't think anything is mixing processes and threads. The application doesn't ever directly control those, it's up to gunicorn and celery I think.

The processing isn't done from the request at all. It's packaged there and stored in Redis for the worker to get. At startup, there's 1 worker already up as a process. It's this process that dies with SIGABRT. Unfortunately, without being a celery developer, I don't know when or if this worker has done import pyvips though.
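
One way to find out whether a given process has already imported pyvips is to look in sys.modules and log the result along with the process IDs. The helper below is a sketch with a made-up name, and where to call it (for instance at module import time and again inside the task body) is left as an assumption.

```python
import logging
import os
import sys


def report_import_state(module_name="pyvips"):
    """Log and return whether module_name is already imported in this process."""
    loaded = module_name in sys.modules
    logging.getLogger(__name__).warning(
        "pid=%d ppid=%d %s already imported: %s",
        os.getpid(),
        os.getppid(),
        module_name,
        loaded,
    )
    return loaded
```

Comparing the pid and ppid values across calls shows whether the import happened in the parent before the worker process was created or in the worker itself.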

jcupitt commented 1 year ago

I've just found a slightly similar issue in rails caused by a clash between pdfium and libheif. Could that be it? What do you see for vips --vips-config?

stumpylog commented 1 year ago

Hope it helps:

enable debug: false
enable deprecated: true
enable modules: true
enable cplusplus: true
enable RAD load/save: true
enable Analyze7 load/save: true
enable PPM load/save: true
enable GIF load: true
use fftw for FFTs: true
accelerate loops with ORC: true
ICC profile support with lcms: true
zlib: true
text rendering with pangocairo: true
font file support with fontconfig: true
EXIF metadata support with libexif: true
JPEG load/save with libjpeg: true
JXL load/save with libjxl: true (dynamic module: true)
JPEG2000 load/save with OpenJPEG: true
PNG load/save with libspng: false
PNG load/save with libpng: true
selected quantisation package: imagequant
TIFF load/save with libtiff: true
image pyramid save with libgsf: true
HEIC/AVIF load/save with libheif: true (dynamic module: true)
WebP load/save with libwebp: true
PDF load with PDFium: false
PDF load with poppler-glib: true (dynamic module: true)
SVG load with librsvg: true
EXR load with OpenEXR: true
OpenSlide load: true (dynamic module: true)
Matlab load with libmatio: true
NIfTI load/save with niftiio: false
FITS load/save with cfitsio: true
GIF save with cgif: true
selected Magick package: MagickCore (dynamic module: true)
Magick API version: magick6
Magick load: true
Magick save: true

jcupitt commented 1 year ago

Thanks! But it's not pdfium :(
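
The same check can be scripted. The sketch below assumes the vips command-line tool is on PATH and that its output uses the "name: value" layout shown above. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_vips_config(text):
    """Turn `vips --vips-config` output into a dict of option name -> value."""
    config = {}
    for line in text.splitlines():
        # Split on the first ": " only, so values such as
        # "true (dynamic module: true)" stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            config[key.strip()] = value.strip()
    return config


def read_vips_config():
    """Run the vips CLI if it is installed; return None when it is not."""
    exe = shutil.which("vips")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--vips-config"], capture_output=True, text=True, check=True
    )
    return parse_vips_config(result.stdout)


def uses_pdfium(config):
    """True if the build reports PDF loading through PDFium."""
    return config.get("PDF load with PDFium", "false").startswith("true")
```

For the configuration posted above, uses_pdfium returns False, since PDF loading goes through poppler-glib instead.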

brandoncc commented 1 year ago

I don't know if this will be helpful or not, but I was experiencing free(): invalid pointer in my Heroku build logs not long ago when I was working on this PR. In my case, I ended up needing to remove mini_racer from my ruby Gemfile. I realize this is for pyvips, but I'm hoping this might still be useful somehow. In my case, it very well may have been pdfium since I was adding that to the build as part of the PR.