ropensci / tabulapdf

Bindings for Tabula PDF Table Extractor Library
https://docs.ropensci.org/tabulapdf/
Apache License 2.0

extract_tables crashes R on macOS with Java 14.0.2 #124

Open gacolitti opened 3 years ago

gacolitti commented 3 years ago

I can install and load rJava as well as tabulizer, but when I run extract_tables() the R session crashes and I receive the following message:

[Screenshot of the crash message, taken 2020-10-15 at 9:34:15 AM]

Is tabulizer not compatible with Java 14?

output of sessionInfo():

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2  

Here is a reproducible example:

library(tabulizer)
library(dplyr)

# URL to relevant UC Davis study
almonds_sjvs_drip_2019_location <- 
  "https://coststudyfiles.ucdavis.edu/uploads/cs_public/cb/07/cb078774-fd91-4418-906e-f94dfbd84506/2019almondssjvsouth.pdf"

output <- extract_tables(guess = TRUE, 
                         file = almonds_sjvs_drip_2019_location)

hatdeck commented 3 years ago

With no code changes since 2018, it doesn't appear tabulizer is being actively maintained, and it is not surprising that it is not compatible with other current software. I'd suggest using pdftools instead.
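
A minimal sketch of what the pdftools route could look like, assuming the tables in the PDF are laid out as plain text with columns separated by two or more spaces. `pdftools::pdf_text()` is the real entry point and returns one string per page; the helper names `split_page_into_rows` and `extract_rows` are hypothetical, and the column-splitting rule is an assumption that will likely need tuning for this particular cost study.

```r
# Split one page of text (as returned by pdftools::pdf_text) into rows of cells.
# Assumption: columns are separated by runs of two or more spaces.
split_page_into_rows <- function(page_text) {
  lines <- strsplit(page_text, "\n", fixed = TRUE)[[1]]
  lines <- trimws(lines)          # drop leading/trailing padding
  lines <- lines[nzchar(lines)]   # drop empty lines
  strsplit(lines, "\\s{2,}")      # one character vector of cells per row
}

# Hypothetical wrapper: read every page of a PDF and split it into rows.
# pdftools::pdf_text() returns a character vector with one element per page.
extract_rows <- function(file) {
  pages <- pdftools::pdf_text(file)
  lapply(pages, split_page_into_rows)
}
```

To try it on the file from the report, one option is to download the PDF to a temporary file with `download.file(url, tmp, mode = "wb")` and then call `extract_rows(tmp)`. Unlike extract_tables, this does not detect table boundaries, so the rows still need to be filtered and bound into a data frame by hand.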

marijnvanwingerden commented 3 years ago

Java SE 6 is no longer supported on macOS Catalina; see: https://discussions.apple.com/thread/250725852
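
When narrowing down a Java compatibility problem like this, it may help to confirm which Java version the R session is actually bound to. The sketch below asks the JVM started by rJava for its `java.version` property and reduces it to a major version number. The function names `current_java_version` and `java_major_version` are hypothetical, and note that if the crash happens while the JVM starts, this check will crash the session as well.

```r
# Ask the JVM that rJava starts for its version string,
# e.g. "14.0.2" or "1.8.0_265".
current_java_version <- function() {
  rJava::.jinit()
  rJava::.jcall("java/lang/System", "S", "getProperty", "java.version")
}

# Reduce a Java version string to its major version number.
# Old-style strings such as "1.6.0_65" report the major version
# in the second field, so "1.x" is mapped to x.
java_major_version <- function(version_string) {
  parts <- strsplit(version_string, "[._-]")[[1]]
  major <- as.integer(parts[1])
  if (major == 1L && length(parts) >= 2) {
    major <- as.integer(parts[2])
  }
  major
}
```

Calling `java_major_version(current_java_version())` in the affected session would show whether R is picking up Java 14 or an older runtime left on the machine.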