Bladieblah / xpdf-python

Python wrapper around the pdftotext functionality of xpdf
GNU General Public License v3.0

Building Failed on Windows #3

Closed ReMiOS closed 1 year ago

ReMiOS commented 1 year ago

I am trying to build with Python 3.9 on a Windows 11 system with Visual Studio 2022

But after running setup.py build --compiler=msvc, the build fails with the following error:

    src\xpdf-4.04\goo\gfile.cc(733): error C3861: 'fseeko': identifier not found
    src\xpdf-4.04\goo\gfile.cc(745): error C3861: 'ftello': identifier not found
    error: command 'C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2

I've managed to install it on my Ubuntu server without issues, and the output from a sample PDF file looks promising.

Any help on how to build on Windows is appreciated :-)

Bladieblah commented 1 year ago

Hey, thanks for checking it out!

In the file src/xpdf-4.04/aconf.h there are 3 compiler flags,

#define HAVE_FSEEKO 1
#define HAVE_FSEEK64 0
#define HAVE_FSEEKI64 0

Could you clone the repo and try building (pip install .) with each of the other 2 options? Make sure only 1 of the flags is set to 1 at a time!
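The "only one flag at a time" rule can be applied mechanically. The following is a minimal sketch, not part of the repository: the helper name select_fseek_flag is hypothetical, and it assumes each flag appears exactly once in aconf.h as a plain "#define NAME 0|1" line, as quoted above.

```python
import re

# The three mutually exclusive flags quoted from src/xpdf-4.04/aconf.h.
FLAGS = ("HAVE_FSEEKO", "HAVE_FSEEK64", "HAVE_FSEEKI64")


def select_fseek_flag(aconf_text, chosen):
    """Return aconf.h text with `chosen` set to 1 and the other two flags set to 0.

    Hypothetical helper for illustration; it only edits text and never
    touches the file system.
    """
    if chosen not in FLAGS:
        raise ValueError(f"unknown flag: {chosen}")
    for flag in FLAGS:
        value = 1 if flag == chosen else 0
        # Match a single '#define FLAG <digits>' line; [ \t] keeps the match on one line.
        pattern = rf"^([ \t]*#[ \t]*define[ \t]+{flag})[ \t]+\d+[ \t]*$"
        aconf_text, count = re.subn(
            pattern, rf"\g<1> {value}", aconf_text, flags=re.MULTILINE
        )
        if count != 1:
            raise ValueError(f"expected exactly one definition of {flag}, found {count}")
    return aconf_text
```

To try one option, read aconf.h, pass its contents through select_fseek_flag with the flag to test, write the result back, and rerun pip install . before moving on to the next flag.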

ReMiOS commented 1 year ago

Thanks for your reply

The HAVE_FSEEKI64 compiler flag in aconf.h seems to fix the fseek error, but now the build fails on ImageInfoDev.cc ...

#define HAVE_FSEEKO 1
#define HAVE_FSEEK64 0
#define HAVE_FSEEKI64 0

    src\xpdf-4.04\goo\gfile.cc(733): error C3861: 'fseeko': identifier not found
    src\xpdf-4.04\goo\gfile.cc(745): error C3861: 'ftello': identifier not found

#define HAVE_FSEEKO 0
#define HAVE_FSEEK64 1
#define HAVE_FSEEKI64 0

    src\xpdf-4.04\goo\gfile.cc(735): error C3861: 'fseek64': identifier not found
    src\xpdf-4.04\goo\gfile.cc(747): error C3861: 'ftell64': identifier not found

#define HAVE_FSEEKO 0
#define HAVE_FSEEK64 0
#define HAVE_FSEEKI64 1

    ImageInfoDev.cc
    src\xpydf\ImageInfoDev.cc(46): error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax

ReMiOS commented 1 year ago

Update:

I don't know if this helps, but I tried to fix this issue as discussed in the links below:

https://stackoverflow.com/questions/33270731/error-c4576-in-vs2015-enterprise
https://github.com/Azure/azure-sdk-for-c/pull/1885

I changed the files below, which seems to fix the C4576 errors:

ImageInfoDev.cc line 43:

    images.push_back((ImageInfo) {    ===>    images.push_back(ImageInfo {

PdfLoader.cc line 123:

    pagesInfo.push_back((PageImageInfo){    ===>    pagesInfo.push_back(PageImageInfo {

This fixes the build of ImageInfoDev.cc and PdfLoader.cc, but now the build fails with:

    Creating library build\temp.win-amd64-3.9\Release\src\xpdf-4.04\fofi\cXpdfPython.cp39-win_amd64.lib and object build\temp.win-amd64-3.9\Release\src\xpdf-4.04\fofi\cXpdfPython.cp39-win_amd64.exp
    gfile.obj : error LNK2001: unresolved external symbol __imp_CoInitialize
    gfile.obj : error LNK2001: unresolved external symbol __imp_CoUninitialize
    gfile.obj : error LNK2001: unresolved external symbol __imp_CoCreateInstance
    gfile.obj : error LNK2001: unresolved external symbol __imp_CommandLineToArgvW
    GlobalParams.obj : error LNK2001: unresolved external symbol __imp_RegEnumValueA
    GlobalParams.obj : error LNK2001: unresolved external symbol __imp_RegOpenKeyExA
    GlobalParams.obj : error LNK2001: unresolved external symbol __imp_RegCloseKey
    build\lib.win-amd64-3.9\cXpdfPython.cp39-win_amd64.pyd : fatal error LNK1120: 7 unresolved externals
    error: command 'C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\link.exe' failed with exit code 1120

ReMiOS commented 1 year ago

Found it !

The linker needs some extra libraries

You can alter gfile.cc from the xpdf source by adding three pragma lines:

    #ifdef _WIN32
    #  undef WIN32_LEAN_AND_MEAN
    #  include
    #  include
    #  include
    #  include
    #  include
    #  pragma comment( lib, "Ole32.lib" )
    #  pragma comment( lib, "shell32.lib" )
    #  pragma comment( lib, "AdvAPI32.lib" )


But a better approach is to change setup.py to include the libraries needed. I also added the package name (xpydf) and version:

    cXpdfPython = [
        Extension(
            ....
            ....
            libraries=['Ole32', 'AdvAPI32', 'shell32'],
            ....
        )
    ]

    setup(
        name='xpydf',
        version='0.1.0',
        ext_modules=cXpdfPython
    )
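Since Ole32, AdvAPI32 and shell32 are Win32 import libraries, one option is to add them only when building on Windows. The sketch below shows that idea under stated assumptions: the helper extra_libraries is hypothetical, and the thread does not show whether the merged setup.py makes the list conditional.

```python
import sys

# Win32 import libraries that provide the symbols from the LNK2001 errors:
#   Ole32    -> CoInitialize, CoUninitialize, CoCreateInstance
#   shell32  -> CommandLineToArgvW
#   AdvAPI32 -> RegOpenKeyExA, RegEnumValueA, RegCloseKey
WINDOWS_LIBRARIES = ["Ole32", "AdvAPI32", "shell32"]


def extra_libraries(platform=sys.platform):
    """Return the extra link libraries for `platform` (hypothetical helper).

    sys.platform is "win32" on Windows, including 64-bit builds.
    """
    return list(WINDOWS_LIBRARIES) if platform == "win32" else []


# In setup.py the result would be passed to the extension, for example:
#   Extension(..., libraries=extra_libraries(), ...)
```

With this, the same setup.py hands the linker no extra libraries on Linux or macOS and only names the three Windows libraries under MSVC.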

Bladieblah commented 1 year ago

Hey there! I merged up your PRs, they looked fine and I'll test them myself as well. I don't want to modify the xpdf source in case it gets updated, so I agree that modifying the setup script is the better approach. I don't have a Windows machine to test on, so if you want to, you can make a PR.

ReMiOS commented 1 year ago

Thanks !

I've made a PR for setup.py so it will have the proper linker libraries. (I did this earlier as well, but somehow the changes were lost.)

Why did you disable the "#define MULTITHREADED" in the aconf.h ?

    -DMULTITHREADED=0
        Disables multithreading, which also disables building the GUI
        viewer (xpdf).  This does not affect the command line tools.
        Disabling multithreading should only be necessary if you're
        building with a compiler other than gcc, clang, or Microsoft
        Visual Studio.

Bladieblah commented 1 year ago

Great, I'll have a look.

Multithreading does not affect the functionality this package uses, so I turned it off just to be sure, since I don't need it for my application.

Bladieblah commented 1 year ago

I merged it up, I think that closes the issue right?

ReMiOS commented 1 year ago

Fixed, I can confirm it now builds on Windows with Microsoft Visual Studio.