ropensci / pdftools

Text Extraction, Rendering and Converting of PDF Documents
https://docs.ropensci.org/pdftools

Installation failure, undefined symbol error #112

Closed tedmoorman closed 2 years ago

tedmoorman commented 2 years ago

I'm getting an "undefined symbol" error when trying to install pdftools 3.2.0.

Here is the installation:

install.packages("pdftools")
Installing package into ‘/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
* installing *source* package ‘pdftools’ ...
** package ‘pdftools’ successfully unpacked and MD5 sums checked
Found pkg-config cflags and libs!
Using PKG_CFLAGS=-I/usr/include/poppler/cpp -I/usr/include/poppler  
Using PKG_LIBS=-lpoppler-cpp  
** libs
g++ -std=c++11 -I"/opt/R/R-3.5.3/lib64/R/include" -DNDEBUG -I/usr/include/poppler/cpp -I/usr/include/poppler   -I"/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/Rcpp/include" -I/usr/local/include  -fvisibility=hidden -fPIC  -O3 -march=native -Wno-ignored-attributes -c RcppExports.cpp -o RcppExports.o
g++ -std=c++11 -I"/opt/R/R-3.5.3/lib64/R/include" -DNDEBUG -I/usr/include/poppler/cpp -I/usr/include/poppler   -I"/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/Rcpp/include" -I/usr/local/include  -fvisibility=hidden -fPIC  -O3 -march=native -Wno-ignored-attributes -c bindings.cpp -o bindings.o
g++ -std=c++11 -shared -L/opt/R/R-3.5.3/lib64/R/lib -L/usr/local/lib64 -o pdftools.so RcppExports.o bindings.o -lpoppler-cpp -L/opt/R/R-3.5.3/lib64/R/lib -lR
installing to /projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/pdftools/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for ‘pdftools’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/pdftools/libs/pdftools.so':
  /projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/pdftools/libs/pdftools.so: undefined symbol: _ZNK7poppler7ustring9to_latin1B5cxx11Ev
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/pdftools’
* restoring previous ‘/projects/CI_Analysts/R/x86_64-pc-linux-gnu-library/3.5/pdftools’
Warning in install.packages :
  installation of package ‘pdftools’ had non-zero exit status

Here is my sessionInfo:

R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: OpenShift

Matrix products: default
BLAS: /opt/R/R-3.5.3/lib64/R/lib/libRblas.so
LAPACK: /opt/R/R-3.5.3/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] httr_1.4.2       compiler_3.5.3   magrittr_2.0.2   R6_2.5.1         assertthat_0.2.1 miniCRAN_0.2.16  tools_3.5.3      igraph_1.2.11   
[9] pkgconfig_2.0.3 

Here is my ~/.R/Makevars file:

MAKEFLAGS = -j8

## C++ flags
CXX=g++
CXX11=g++
CXX14=g++
CXX17=g++

CXXFLAGS=-O3 -march=native -Wno-ignored-attributes
CXX11FLAGS=-O3 -march=native -Wno-ignored-attributes
CXX14FLAGS=-O3 -march=native -Wno-ignored-attributes
CXX17FLAGS=-O3 -march=native -Wno-ignored-attributes

CXXPICFLAGS=-fPIC
CXX11PICFLAGS=-fPIC
CXX14PICFLAGS=-fPIC
CXX17PICFLAGS=-fPIC

CXX11STD=-std=c++11
CXX14STD=-std=c++14
CXX17STD=-std=c++17

## C flags
CC=gcc
CFLAGS=-O3 -march=native

## Fortran flags
FC=gfortran
F77=gfortran
FFLAGS=-O3 -march=native
FCFLAGS=-O3 -march=native

Any suggestions are appreciated!

jeroen commented 2 years ago

This usually happens when you are mixing incompatible compilers. What does g++ --version report? Is it different from /usr/bin/g++ --version?

Can you try setting CXX11=/usr/bin/g++ in your ~/.R/Makevars and see if that changes anything?
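
A quick way to make that comparison is to print the first line of `--version` for both compilers side by side. This is only an illustrative sketch: the helper name `compiler_version` is made up here (it is not an R or pdftools command), and a POSIX shell is assumed.

```shell
# Print the first line of "<compiler> --version", or "not found" if it cannot be run.
# compiler_version is a hypothetical helper name, not an R or pdftools command.
compiler_version() {
  out=$("$1" --version 2>/dev/null) || { echo "not found"; return 0; }
  printf '%s\n' "$out" | head -n 1
}

echo "g++ on PATH:  $(compiler_version g++)"
echo "/usr/bin/g++: $(compiler_version /usr/bin/g++)"
```

If the two lines differ, R is probably picking up a compiler other than the one your distro used to build its system libraries.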

tedmoorman commented 2 years ago

They are different:

$ /usr/bin/g++ --version
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++ --version
g++ (GCC) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

And if I set CXX11=/usr/bin/g++ in ~/.R/Makevars, then the install works!

I'm inclined to leave CXX11=g++ for future installations of other packages, since so many packages require something different from the base compiler. Your thoughts?

...
# CXX11=/usr/bin/g++ # had to do this to get the install for pdftools to work
CXX11=g++
...

jeroen commented 2 years ago

The problem is that GCC switched to a different C++ ABI in GCC 5. The external system libraries provided by your distro, in this case libpoppler, are built with the compiler that ships with your OS, and are therefore not compatible with code compiled by gcc 9.
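
The undefined symbol in the install log carries a hint of this. Under the Itanium C++ name mangling that GCC uses, the `B5cxx11` fragment encodes the `[abi:cxx11]` tag, which GCC 5 and later attach to functions that return the new `std::string`. A rough check, assuming a POSIX shell (the helper name `has_cxx11_abi_tag` is invented for illustration, and a plain substring match is only a heuristic):

```shell
# Succeeds if a mangled symbol name contains the cxx11 ABI tag.
# Heuristic sketch: a plain substring match, not a real demangler.
# has_cxx11_abi_tag is a hypothetical helper name.
has_cxx11_abi_tag() {
  case "$1" in
    *B5cxx11*) return 0 ;;
    *) return 1 ;;
  esac
}

sym='_ZNK7poppler7ustring9to_latin1B5cxx11Ev'  # copied from the install log
if has_cxx11_abi_tag "$sym"; then
  echo "symbol expects the new (GCC 5+) ABI"
else
  echo "symbol uses the old ABI"
fi
```

A library built with GCC 4.8 would export this function without the tag, which is why the loader cannot resolve it. If binutils is installed, running `c++filt` on the symbol should show the same tag in demangled form.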

I think it makes sense to keep your default at CXX11=g++, but for packages that link to a C++ system library that you install with yum from OpenShift, you may need to switch to /usr/bin/g++. Fortunately there are only a few of them.
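
One possible way to switch compilers only for those few packages, without editing the main Makevars each time, is a second Makevars file selected through the `R_MAKEVARS_USER` environment variable, which R reads in place of `~/.R/Makevars`. This is a sketch under assumptions: the file name `Makevars.system` and the helper `write_system_makevars` are invented here, and the single `CXX11` line is simply the setting that worked earlier in this thread.

```shell
# Write a minimal alternate Makevars that points CXX11 at the distro compiler.
# write_system_makevars and the Makevars.system file name are hypothetical.
write_system_makevars() {
  mkdir -p "$(dirname "$1")"
  printf '%s\n' 'CXX11=/usr/bin/g++' > "$1"
}

# Usage, for a package that links to a yum-installed C++ library:
#   write_system_makevars "$HOME/.R/Makevars.system"
#   R_MAKEVARS_USER="$HOME/.R/Makevars.system" \
#     Rscript -e 'install.packages("pdftools", repos = "https://cloud.r-project.org")'
```

Note that the alternate file replaces the usual one rather than extending it, so any other settings you rely on (such as MAKEFLAGS) would need to be copied over.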

tedmoorman commented 2 years ago

Thank you, Jeroen. And for all your hard work!

dchiu911 commented 1 year ago

Hi @jeroen I am coming across this error as well after the recent update in v3.3.2. My /usr/bin/g++ --version and g++ --version are the same:

g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

But my R was compiled using /gsc/software/linux-x86_64-centos7/gcc-7.2.0/bin/g++, which is v7.2.0, and it is this version of g++ that R automatically calls to install pdftools. I have also tried adding CXX11=/usr/bin/g++ to my .R/Makevars, to no avail. So the issue might be incompatible compilers, but I don't know what to do from here. This seems like a potentially common issue for users with differing system library dependencies and setups.

jeroen commented 1 year ago

Can you try adding CXX=/usr/bin/g++ to your ~/.R/Makevars? The latest version of pdftools now compiles with the default CXX instead of CXX11 (see news file).

dchiu911 commented 1 year ago

I get these errors instead:

bindings.cpp: In function ‘Rcpp::String ustring_to_utf8(poppler::ustring)’:
bindings.cpp:56:26: error: ‘std::string’ has no member named ‘back’
   if(str.length() && str.back() == '\f')
                          ^
bindings.cpp: In function ‘Rcpp::List poppler_pdf_info(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:146:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw, true));
   ^
bindings.cpp:146:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw, true));
                                    ^
bindings.cpp:146:73: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw, true));
                                                                         ^
bindings.cpp: In function ‘Rcpp::CharacterVector poppler_pdf_text(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:261:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:261:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:261:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp:264:5: error: ‘unique_ptr’ is not a member of ‘std’
     std::unique_ptr<poppler::page> p(doc->create_page(i));
     ^
bindings.cpp:264:34: error: expected primary-expression before ‘>’ token
     std::unique_ptr<poppler::page> p(doc->create_page(i));
                                  ^
bindings.cpp:264:57: error: ‘p’ was not declared in this scope
     std::unique_ptr<poppler::page> p(doc->create_page(i));
                                                         ^
bindings.cpp: In function ‘Rcpp::DataFrame poppler_pdf_pagesize(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:293:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:293:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:293:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp:303:5: error: ‘unique_ptr’ is not a member of ‘std’
     std::unique_ptr<poppler::page> p(doc->create_page(i));
     ^
bindings.cpp:303:34: error: expected primary-expression before ‘>’ token
     std::unique_ptr<poppler::page> p(doc->create_page(i));
                                  ^
bindings.cpp:303:57: error: ‘p’ was not declared in this scope
     std::unique_ptr<poppler::page> p(doc->create_page(i));
                                                         ^
bindings.cpp: In function ‘Rcpp::List poppler_pdf_fonts(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:325:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:325:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:325:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp: In function ‘Rcpp::List poppler_pdf_files(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:349:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:349:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:349:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp: In function ‘Rcpp::List poppler_pdf_toc(Rcpp::RawVector, std::string, std::string)’:
bindings.cpp:373:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:373:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:373:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp:375:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::toc> contents(doc->create_toc());
   ^
bindings.cpp:375:31: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::toc> contents(doc->create_toc());
                               ^
bindings.cpp:375:59: error: ‘contents’ was not declared in this scope
   std::unique_ptr<poppler::toc> contents(doc->create_toc());
                                                           ^
bindings.cpp: In function ‘Rcpp::RawVector poppler_render_page(Rcpp::RawVector, int, double, std::string, std::string, bool, bool)’:
bindings.cpp:386:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:386:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:386:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp:387:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
   ^
bindings.cpp:387:32: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
                                ^
bindings.cpp:387:65: error: ‘p’ was not declared in this scope
   std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
                                                                 ^
bindings.cpp: In function ‘std::vector<std::basic_string<char> > poppler_convert(Rcpp::RawVector, std::string, std::vector<int>, std::vector<std::basic_string<char> >, double, std::string, std::string, bool, bool, bool)’:
bindings.cpp:416:3: error: ‘unique_ptr’ is not a member of ‘std’
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
   ^
bindings.cpp:416:36: error: expected primary-expression before ‘>’ token
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                    ^
bindings.cpp:416:67: error: ‘doc’ was not declared in this scope
   std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
                                                                   ^
bindings.cpp:422:5: error: ‘unique_ptr’ is not a member of ‘std’
     std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
     ^
bindings.cpp:422:34: error: expected primary-expression before ‘>’ token
     std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
                                  ^
bindings.cpp:422:67: error: ‘p’ was not declared in this scope
     std::unique_ptr<poppler::page> p(doc->create_page(pagenum - 1));
                                                                   ^
make: *** [bindings.o] Error 1
ERROR: compilation failed for package ‘pdftools’

jeroen commented 1 year ago

Ah right, CentOS is really old. Can you try setting: CXX=/usr/bin/g++ -std=gnu++11
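
For context, errors such as "‘unique_ptr’ is not a member of ‘std’" and "‘std::string’ has no member named ‘back’" are what GCC 4.8 reports when C++11 library features are used without a `-std` flag, since that release defaults to the C++98 dialect. A small probe along these lines can confirm whether a given compiler needs the flag. It is a sketch only: `supports_cxx11` is a hypothetical helper, and a POSIX shell and a GCC-compatible command line are assumed.

```shell
# Succeeds if the compiler ($1) accepts C++11 library features with the given extra flags.
# supports_cxx11 is a hypothetical helper; it only syntax-checks a tiny program from stdin.
supports_cxx11() {
  cxx=$1
  shift
  printf '%s\n' \
    '#include <memory>' \
    '#include <string>' \
    'int main() {' \
    '  std::unique_ptr<int> p(new int(0));' \
    '  std::string s("x");' \
    '  return s.back() == 0 ? 1 : *p;' \
    '}' |
    "$cxx" "$@" -x c++ -fsyntax-only - >/dev/null 2>&1
}

for flags in "" "-std=gnu++11"; do
  # $flags is intentionally unquoted so that an empty value adds no argument
  if supports_cxx11 /usr/bin/g++ $flags; then
    echo "/usr/bin/g++ ${flags:-(default dialect)}: C++11 accepted"
  else
    echo "/usr/bin/g++ ${flags:-(default dialect)}: C++11 rejected or compiler missing"
  fi
done
```

On a system like the one described here, the expectation would be a rejection with the default dialect and acceptance with `-std=gnu++11`, which matches the suggested CXX setting.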

dchiu911 commented 1 year ago

Thanks, that worked!