pypdfium2-team / pypdfium2

Python bindings to PDFium
https://pypdfium2.readthedocs.io/

Provide a source build fallback #19

Closed mara004 closed 2 years ago

mara004 commented 2 years ago

Add a generic source build strategy for platforms where we don't have pre-built binaries:

mara004 commented 2 years ago

I've started experimenting in the sourcebuild branch. The build appears to work, but the generated binary - for some peculiar reason - is much smaller than it should be, and not even the FPDF_InitLibrary call works. I'll have to investigate this. My guess is that I need to apply more patches from pdfium-binaries.
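
A failure like this can be checked without the Python bindings by loading the shared library directly through ctypes. The following is a minimal sketch, not the project's own test code: the helper name can_init_pdfium and the library path passed to it are hypothetical, and it only assumes that the binary exports the public PDFium entry points FPDF_InitLibrary and FPDF_DestroyLibrary.

```python
import ctypes


def can_init_pdfium(lib_path):
    """Return True if the shared library loads and FPDF_InitLibrary can be called.

    Hypothetical helper for illustration; lib_path is whatever file the
    source build produced (for example a libpdfium.so in the build output).
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # File is missing or is not a loadable shared library.
        return False

    try:
        init = lib.FPDF_InitLibrary
        destroy = lib.FPDF_DestroyLibrary
    except AttributeError:
        # The library loaded, but the expected symbols are not exported.
        return False

    # Both functions take no arguments and return nothing.
    init.argtypes = []
    init.restype = None
    destroy.argtypes = []
    destroy.restype = None

    init()
    destroy()
    return True
```

Calling can_init_pdfium with the path of the freshly built binary should return False when the file cannot be loaded or lacks the expected symbols, which is roughly the symptom described above.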

adam-huganir commented 2 years ago

I look forward to seeing how it works; the ctypesgen builds and the binaries are always temperamental for me in any project I use them for.

mara004 commented 2 years ago

Yeah, I concur. Being dependent on external binaries is a big limitation, which is why I started working on this...

mara004 commented 2 years ago

@adam-huganir I've pushed a commit incorporating more patches (5319f40). The generated binary now works and our tests pass. (Some of the patches apparently can't be applied, but I guess that's an upstream issue. Update: fixed now - I just used the wrong command to apply the patches.)

mara004 commented 2 years ago

@adam-huganir I believe that the source build script should be pretty solid now. In case you haven't done so yet, could you try it on your device to confirm that PDFium builds correctly just by running the script? I'd like to make sure I haven't missed any external dependencies.

adam-huganir commented 2 years ago

Ok, so fresh clone of main, in a fresh python 3.8.12 environment:

  1. Source build works with no issues, though I have a bunch of dev dependencies in general on my computer, so this should probably be tested in a vm at some point
  2. Because sourcebuild generates a libpdfium.so library instead of one named pdfium, I had to change the load line in pypdfium2/_pypdfium.py:

     # Begin libraries
     _libs["pdfium"] = load_library("libpdfium.so") # from 'pdfium'

     So you will need to change wherever that is generated (I guess at ctypesgen call time?)

After changing that and installing Pillow everything worked as expected :+1:

mara004 commented 2 years ago

Thanks for testing!

> because sourcebuild generates libpdfium.so library instead of one named pdfium I had to change the load line in pypdfium2/_pypdfium.py

I had already thought about that when writing the script, and implemented the --destname option to allow renaming the binary, e.g. to pdfium. However, on my machine it was also possible to load libpdfium.so, which might be because I'm using a newer version of ctypesgen that I installed from its current git repository. Judging from the library loader in the generated bindings file, this naming pattern should be detected:

class PosixLibraryLoader(LibraryLoader):
    """Library loader for POSIX-like systems (including Linux)"""
    _ld_so_cache = None
    _include = re.compile(r"^\s*include\s+(?P<pattern>.*)")
    name_formats = ["lib%s.so", "%s.so", "%s"]
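
To make the naming pattern concrete, here is a small sketch of how such a format list expands a bare library name into candidate file names. It only reuses the name_formats list quoted above; the function candidate_names is a hypothetical illustration, and the real ctypesgen loader does more than this (it also searches directories, which this sketch ignores).

```python
# Sketch only: reuses the name_formats list from the generated bindings,
# not ctypesgen's actual lookup code.
name_formats = ["lib%s.so", "%s.so", "%s"]


def candidate_names(libname):
    """Expand a library name into the file names implied by name_formats."""
    return [fmt % libname for fmt in name_formats]


print(candidate_names("pdfium"))
# ['libpdfium.so', 'pdfium.so', 'pdfium']
```

Under this expansion, the plain name "pdfium" already yields libpdfium.so as its first candidate, and passing "libpdfium.so" directly also matches through the final "%s" format.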

mara004 commented 2 years ago

> After changing that and installing Pillow

Oh, that's a good catch - I totally forgot to list Pillow as an installation requirement in setup.cfg! I'll fix that soon.

mara004 commented 2 years ago

> Source build works with no issues, though I have a bunch of dev dependencies in general on my computer, so this should probably be tested in a vm at some point

I just successfully built PyPDFium2 in VirtualBox with OpenSUSE Leap 15.3. However, I did find some important external dependencies, which I've now listed in DEPS.txt.