pridiltal / staplr

PDF Toolkit. :paperclip: :hammer: :wrench: :scissors: :bookmark_tabs: :file_folder::paperclip: :bookmark: :construction: :construction_worker:
https://pridiltal.github.io/staplr/

pdftk: not found #48

Closed kevpfowler closed 4 years ago

kevpfowler commented 4 years ago

My understanding from the README is that if I have Java installed (JDK and JRE), staplr bundles pdftk-java and I do not need to install pdftk separately.

But, when I run the README example, I get "sh: 1: pdftk: not found":

> pdfFile = system.file('testForm.pdf',package = 'staplr')
> pdfFile
[1] "/home/kfowler/R/x86_64-pc-linux-gnu-library/3.6/staplr/testForm.pdf"
> get_fields(pdfFile)
sh: 1: pdftk: not found
Error in file(con, "r") : cannot open the connection
In addition: Warning messages:
1: In system(system_command) : error in running command
2: In file(con, "r") :
  cannot open file '/tmp/RtmpZQhoUu/file165a71ea13f0b': No such file or directory

I am using staplr_2.9.0, with R 3.6.3, on an Ubuntu 20.04 system. I ran javareconf and restarted R before attempting the example above.

$ sudo R CMD javareconf
Java interpreter : /usr/lib/jvm/default-java/bin/java
Java version     : 11.0.7
Java home path   : /usr/lib/jvm/default-java
Java compiler    : /usr/lib/jvm/default-java/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/lib/jvm/default-java/bin/jar

trying to compile and link a JNI program 
detected JNI cpp flags    : -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux
detected JNI linker flags : -L$(JAVA_HOME)/lib/server -ljvm
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG -I/usr/lib/jvm/default-java/include -I/usr/lib/jvm/default-java/include/linux    -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-jbaK_j/r-base-3.6.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c conftest.c -o conftest.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o conftest.so conftest.o -L/usr/lib/jvm/default-java/lib/server -ljvm -L/usr/lib/R/lib -lR

JAVA_HOME        : /usr/lib/jvm/default-java
Java library path: $(JAVA_HOME)/lib/server
JNI cpp flags    : -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux
JNI linker flags : -L$(JAVA_HOME)/lib/server -ljvm
Updating Java configuration in /usr/lib/R
Done.

kevpfowler commented 4 years ago

I installed pdftk-java separately on my system, and then the staplr::get_fields() command did not generate that error.

However, it did not work correctly either: it returned an empty named list (a list of length 0).

oganm commented 4 years ago

The readme is accurate for the development version. The CRAN version still requires pdftk installation. Could you try to install it from github and see if it works?
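To tell which of these two situations applies on a given machine, a check along the following lines may help. This is only a sketch: `diagnose_pdftk_setup` is a hypothetical helper, not part of staplr, and the `2.11.0` cutoff is an assumption based on the version the README section refers to, not a confirmed boundary.

```r
# Hypothetical helper (not part of staplr): decide whether a separate pdftk
# installation is likely needed, given the installed staplr version and the
# path returned by Sys.which("pdftk") ("" means pdftk is not on the PATH).
diagnose_pdftk_setup <- function(staplr_version, pdftk_path,
                                 bundled_since = "2.11.0") {
  # Assumption: versions from `bundled_since` onward ship pdftk-java.
  has_bundled <- numeric_version(staplr_version) >= numeric_version(bundled_since)
  has_system  <- nzchar(pdftk_path)

  if (has_bundled) {
    "bundled pdftk-java expected; only Java is needed"
  } else if (has_system) {
    "older staplr, but a system pdftk is on PATH"
  } else {
    "older staplr and no pdftk on PATH; install pdftk or upgrade staplr"
  }
}

# Usage on a real system (requires staplr to be installed):
# diagnose_pdftk_setup(
#   as.character(utils::packageVersion("staplr")),
#   unname(Sys.which("pdftk"))
# )
```

The version and path are passed in as arguments, so the logic can be tried out without staplr or pdftk being present.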

kevpfowler commented 4 years ago

The other thing I tried was to remove pdftk-java and then install pdftk itself, which also installed pdftk-java. I re-ran javareconf and restarted R/RStudio. I still got the same result (a zero-length list when reading fields from testForm.pdf).

I will try the development version and report back.

kevpfowler commented 4 years ago

Ok I used the development version and it seems to work correctly. If I had paid full attention to the version qualifier in the README I could have skipped reporting this, since I was using 2.9.0 from CRAN and the README section specifically said it was for 2.11.0. Though I don't understand why 2.9.0 was not working with the externally installed pdftk.

Just an FYI: To install the development version successfully with R 3.6.3, I first had to install XML, specifying the repo.

install.packages("XML", repos = "http://www.omegahat.net/R")

oganm commented 4 years ago

@pridiltal did you have time to push the updated version to CRAN? the current dev version has been stable and easier to install and debug.

@kevpfowler any idea why you had to specify the repo when installing XML? For me devtools::install_github does the trick with or without XML installed and there doesn't seem to be any issues with CRAN servers, not right now at least

pridiltal commented 4 years ago

This is a great update of the package. Thank you @oganm . Just submitted to CRAN. I'll update you soon.

kevpfowler commented 4 years ago

@oganm I do not know why I had to specify the repo. I'm running on Ubuntu 20.04. Providing more details below.

This is the error I get when installing staplr with devtools::install_github:

> devtools::install_github("pridiltal/staplr")
Downloading GitHub repo pridiltal/staplr@master
Skipping 1 packages not available: XML
✓  checking for file ‘/tmp/RtmpcJMIRT/remotes405f95d4ed366/pridiltal-staplr-b5b9505/DESCRIPTION’ ...
─  preparing ‘staplr’:
✓  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘staplr_3.0.0.tar.gz’

Installing package into ‘/home/kfowler/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
ERROR: dependency ‘XML’ is not available for package ‘staplr’
* removing ‘/home/kfowler/R/x86_64-pc-linux-gnu-library/3.6/staplr’
Error: Failed to install 'staplr' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpcJMIRT/file405f9442e4176/staplr_3.0.0.tar.gz’ had non-zero exit status

This is the error I get when trying to install XML package the normal way:

> install.packages("XML")
Installing package into ‘/home/kfowler/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘XML’ is not available (for R version 3.6.3)

This is my repos options setting:

> getOption("repos")
                         CRAN 
"https://cloud.r-project.org" 

This is the successful result specifying the omegahat repo:

> install.packages("XML", repos = "http://www.omegahat.net/R")
Installing package into ‘/home/kfowler/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
trying URL 'http://www.omegahat.net/R/src/contrib/XML_3.99-0.tar.gz'
Content type 'application/x-gzip' length 1547930 bytes (1.5 MB)
==================================================
downloaded 1.5 MB
...
...

After which, the staplr devtools::install_github() works fine.

oganm commented 4 years ago

Oh sorry, I see the reason now. The current CRAN version of XML requires R >= 4.0.0, which prevents its installation via install.packages for you. If you're able to update your R, the problem should go away.
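A quick way to see whether an R installation clears a version floor like this is sketched below. `meets_r_requirement` is a hypothetical helper introduced only for illustration, and the `4.0.0` floor mirrors the requirement mentioned above.

```r
# Hypothetical helper: TRUE if the R version `current` satisfies a minimum
# version requirement, such as one a package declares in its Depends field.
meets_r_requirement <- function(required, current = getRversion()) {
  numeric_version(as.character(current)) >= numeric_version(required)
}

meets_r_requirement("4.0.0", current = "3.6.3")  # FALSE
meets_r_requirement("4.0.0", current = "4.0.2")  # TRUE
```

Called without `current`, the helper compares against the running R session, so `meets_r_requirement("4.0.0")` on the reporter's R 3.6.3 setup would be expected to return `FALSE`.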