extract_tables() error: java.lang.NoSuchMethodError #42

jzadra opened 7 years ago

jzadra commented 7 years ago

I've just installed tabulizer from github. I'm using MacOS Sierra. I also installed the legacy java from the link given on the install instructions:

When I use extract_tables(), I get the following error:

f <- system.file("examples", "data.pdf", package = "tabulizer")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 
7: stop(list(message = "java.lang.NoSuchMethodError:", 
       call = .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", 
           cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, 
               "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", 
               method), j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
           evalArray = FALSE), jobj = <S4 object of class "jobjRef">))
6: .Call(RJavaCheckExceptions, silent)
5: .jcheck(silent = FALSE)
4: .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, 
       .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, 
           "java/lang/Object"), .jnew("java/lang/String", method), 
       j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
       evalArray = FALSE)
3: .jrcall(x, name, ...)
2: spreadsheetExtractor$isTabular(page)
1: extract_tables(f)
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.3

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tabulizer_0.1.24    
 [7] readxl_0.1.1     dplyr_0.5.0      purrr_0.2.2      readr_1.0.0      tidyr_0.6.1      tibble_1.2      
[13] ggplot2_2.2.1    tidyverse_1.1.1  knitr_1.15.1    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9         git2r_0.18.0        plyr_1.8.4          tools_3.3.2         jsonlite_1.3       
 [6] evaluate_0.10       nlme_3.1-128        gtable_0.2.0        lattice_0.20-34     png_0.1-7          
[11] psych_1.6.12        DBI_0.5-1           parallel_3.3.2      haven_1.0.0         rJava_0.9-8        
[16] httr_1.2.1          xml2_1.1.1          hms_0.3             grid_3.3.2          R6_2.2.0           
[21] foreign_0.8-67      reshape2_1.4.2      modelr_0.1.0        magrittr_1.5        tabulizerjars_0.9.2
[26] assertthat_0.1      mnormt_1.5-5        rvest_0.3.2         colorspace_1.3-2    stringi_1.1.2      
[31] lazyeval_0.2.0      munsell_0.4.3       broom_0.4.2      
I'm facing the same issue on the same setup as @jzadra. Any thoughts on how to address this are greatly appreciated!

Having the same issue here!

@soedr @jrcunning Are you also on Mac? Versions?

Yes, on Mac. I also installed the legacy java using the same link as @jzadra. Here is my Session Info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.3 (Sierra)

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tabulizer_0.1.24

loaded via a namespace (and not attached):
[1] tabulizerjars_0.9.2 tools_3.3.1         rJava_0.9-8         png_0.1-7  
@jrcunning Thanks. Can you tell me what version of Java you installed? (The underlying java library - tabula - has some updates and it looks like they're causing Mac-specific issues.)

@leeper Yes, on Mac. Followed same procedure as @jzadra and @jrcunning. Session info:

 R version 3.3.2 (2016-10-31)
 Platform: x86_64-apple-darwin13.4.0 (64-bit)
 Running under: macOS Sierra 10.12.3

 [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base     

 other attached packages:
 [1] tabulizer_0.1.24   devtools_1.12.0

 loaded via a namespace (and not attached):
 [1] reshape2_1.4.2       rJava_0.9-8          haven_1.0.0          lattice_0.20-34      colorspace_1.3-2     ghit_0.2.17          foreign_0.8-67       withr_1.0.2          DBI_0.5-1           
 [10] modelr_0.1.0         readxl_0.1.1         plyr_1.8.4           stringr_1.1.0        munsell_0.4.3        gtable_0.2.0         rvest_0.3.2          leaps_3.0            psych_1.6.12        
 [19] memoise_1.0.0        labeling_0.3         knitr_1.15.1         forcats_0.2.0        parallel_3.3.2       curl_2.3             broom_0.4.2          Rcpp_0.12.9          tabulizerjars_0.9.2 
 [28] scales_0.4.1         flashClust_1.01-2    scatterplot3d_0.3-38 jsonlite_1.2         mnormt_1.5-5         png_0.1-7            hms_0.3              digest_0.6.12        stringi_1.1.2       
 [37] grid_3.3.2           tools_3.3.2          magrittr_1.5         lazyeval_0.2.0       cluster_2.0.5        crayon_1.3.2         MASS_7.3-45          xml2_1.1.0           lubridate_1.6.0     
 [46] assertthat_0.1       httr_1.2.1           R6_2.2.0             nlme_3.1-128         git2r_0.18.0
Java SE 6

Facing the same issue. Would appreciate help.

It seems the latest version of the tabula library is likely the issue. Consider re-installing an older version of tabulizerjars with, for example:

ghit::install_github("ropensci/tabulizerjars@v0.9.0", verbose = TRUE)
I installed tabulizerjars v0.9.0 but am still getting the same error:

> traceback()
7: stop(list(message = "java.lang.NoSuchMethodError:", 
       call = .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", 
           cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, 
               "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", 
               method), j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
           evalArray = FALSE), jobj = <S4 object of class "jobjRef">))
6: .Call(RJavaCheckExceptions, silent)
5: .jcheck(silent = FALSE)
4: .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, 
       .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, 
           "java/lang/Object"), .jnew("java/lang/String", method), 
       j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
       evalArray = FALSE)
3: .jrcall(x, name, ...)
2: spreadsheetExtractor$isTabular(page)
1: extract_tables("data/raw/Appendix D. Counts of Scleractinians, Octocorals and Sponges during Base....pdf", 
       columns = columns, pages = p, guess = F) at #9

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.3 (Sierra)

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tabulizer_0.1.24    tabulizerjars_0.9.0 stringr_1.1.0       reshape2_1.4.1     

loaded via a namespace (and not attached):
[1] magrittr_1.5  plyr_1.8.4    tools_3.3.1   Rcpp_0.12.10  stringi_1.1.1 git2r_0.15.0  ghit_0.2.12  
[8] rJava_0.9-8   png_0.1-7 
Hey @leeper, tabula author here. Would love to help sort this issue out, as the latest versions have significant performance and accuracy improvements.

Would it be possible to get a code snippet that reproduces the issue?


I have the same problem.


R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

Code from ?extract_tables

f <- system.file("examples", "data.pdf", package = "tabulizer")


stop(structure(list(message = "java.lang.NoSuchMethodError:", call = .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", ...
.jcheck(silent = FALSE)
.jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", method), j_p, j_pc, use.true.class = TRUE, evalString = simplify, ...
.jrcall(x, name, ...)

Thank you

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.3 (Sierra)


 f <- system.file("examples", "data.pdf", package = "tabulizer")


Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 


7: stop(list(message = "java.lang.NoSuchMethodError:", 
       call = .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", 
           cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, 
               "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", 
               method), j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
           evalArray = FALSE), jobj = <S4 object of class "jobjRef">))
6: .Call(RJavaCheckExceptions, silent)
5: .jcheck(silent = FALSE)
4: .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, 
       .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, 
           "java/lang/Object"), .jnew("java/lang/String", method), 
       j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
       evalArray = FALSE)
3: .jrcall(x, name, ...)
2: spreadsheetExtractor$isTabular(page)
1: extract_tables(f)
@leeper I'm quite sure that this only happens in Java 6.

Running Tabula-0.9.2.jar directly produces the same error that tabulizer generates when Java 6 is the default JRE (see below). Switching to Java 8 fixes this for tabula itself; unfortunately, it does not make tabulizer work.

Exception in thread "main" java.lang.NoSuchMethodError:
    at technology.tabula.TextChunk.isLtrDominant(
    at technology.tabula.TextElement.mergeWords(
    at technology.tabula.TextElement.mergeWords(
    at technology.tabula.extractors.BasicExtractionAlgorithm.extract(
    at technology.tabula.CommandLineApp$TableExtractor.extractTablesBasic(
    at technology.tabula.CommandLineApp$TableExtractor.extractTables(
    at technology.tabula.CommandLineApp.extractFile(
    at technology.tabula.CommandLineApp.extractFileTables(
    at technology.tabula.CommandLineApp.extractTables(
    at technology.tabula.CommandLineApp.main(
@jkeuskamp Can you run java -version on a terminal? I'd like to confirm if that error occurs only with Java 6.

It is indeed Java 6:

java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-468-11M4833)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-468, mixed mode)

Thanks @jkeuskamp. Can't help you much with tabulizer, but tabula itself will work under a newer version of the Java virtual machine (7 or 8). Bear in mind that Java 6 stopped receiving updates and support in 2013.

I'd recommend that you update your Java installation and try again.

@leeper @jzadra @jazzido
This solved the issue for me: I installed JVM 8 and pointed RStudio to its installation as described here

To make tabulizer use JVM 8, edit the file ~/.profile and set the following environment variables:

export JAVA_HOME=`/usr/libexec/java_home -v 1.8`
export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/server

Then restart macOS, or set the variables manually and restart RStudio.
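
A quick way to see whether those variables actually reached R is to inspect the environment from inside the session. The sketch below is only illustrative: `check_java_home` is a hypothetical helper, not part of tabulizer or rJava, and it only checks that JAVA_HOME is set and points to an existing directory. Applications started from the macOS Dock may not read ~/.profile, which is one possible reason the value can differ between a terminal and RStudio.

```r
# Hypothetical helper (not from tabulizer or rJava): report what the
# current R session sees for JAVA_HOME.
check_java_home <- function(java_home = Sys.getenv("JAVA_HOME")) {
  if (!nzchar(java_home)) {
    return("JAVA_HOME is not set in this R session")
  }
  if (!dir.exists(java_home)) {
    return(paste("JAVA_HOME points to a missing directory:", java_home))
  }
  paste("JAVA_HOME =", java_home)
}

# Run this inside RStudio after restarting it.
print(check_java_home())
```

If this reports that JAVA_HOME is not set even though `echo $JAVA_HOME` works in a terminal, the variables from ~/.profile are probably not being passed to RStudio.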

After trying the fix that solved things for jkeuskamp, I unfortunately still receive the "java.lang.NoSuchMethodError:" error.

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.4

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tabulizer_0.1.24 rJava_0.9-8      tidyr_0.6.1      magrittr_1.5     dplyr_0.5.0     

loaded via a namespace (and not attached):
 [1] tabulizerjars_0.9.2 R6_2.2.0            assertthat_0.2.0    DBI_0.6-1          
 [5] tools_3.3.3         tibble_1.3.0        Rcpp_0.12.10        git2r_0.18.0       
 [9] ghit_0.2.17         png_0.1-7          

> traceback()
7: stop(list(message = "java.lang.NoSuchMethodError:", 
       call = .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", 
           cl, .jcast(if (inherits(o, "jobjRef") || inherits(o, 
               "jarrayRef")) o else cl, "java/lang/Object"), .jnew("java/lang/String", 
               method), j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
           evalArray = FALSE), jobj = <S4 object of class "jobjRef">))
6: .Call(RJavaCheckExceptions, silent)
5: .jcheck(silent = FALSE)
4: .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, 
       .jcast(if (inherits(o, "jobjRef") || inherits(o, "jarrayRef")) o else cl, 
           "java/lang/Object"), .jnew("java/lang/String", method), 
       j_p, j_pc, use.true.class = TRUE, evalString = simplify, 
       evalArray = FALSE)
3: .jrcall(x, name, ...)
2: spreadsheetExtractor$isTabular(page)
1: extract_tables(f2, pages = 2)
@supermdat could you confirm that java 8 is indeed called from R? You can check this by typing

.jcall("java/lang/System", "S", "getProperty", "java.runtime.version")

This should return something like


The second number tells you whether java 6/7/8 is called.
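
Reading the "second number" can also be done programmatically. This is a minimal sketch, assuming the version string has the same shape as the `java -version` output quoted in this thread (for example "1.8.0_121-b13"); `java_major` is a hypothetical helper name, not an rJava function.

```r
# Hypothetical helper: extract the major Java version from a
# "java.runtime.version"-style string.
java_major <- function(version_string) {
  parts <- strsplit(version_string, "[._+-]")[[1]]
  first <- suppressWarnings(as.integer(parts[1]))
  if (is.na(first)) {
    return(NA_integer_)
  }
  # Older strings look like "1.8.0_121-b13": the second field is the major version.
  if (first == 1L && length(parts) >= 2) {
    return(suppressWarnings(as.integer(parts[2])))
  }
  first
}

print(java_major("1.6.0_65-b14-468-11M4833"))
print(java_major("1.8.0_121-b13"))

# With rJava loaded and initialised, the live value could be checked with:
# java_major(.jcall("java/lang/System", "S", "getProperty", "java.runtime.version"))
```

If the result is 6, R is still calling the legacy Java even when the terminal reports 1.8.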

Thanks very much, @jkeuskamp . That is indeed the difference.

Running java -version from terminal gives:

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

However, running this from within RStudio gives:

.jcall("java/lang/System", "S", "getProperty", "java.runtime.version")

I've done some searching but don't understand well how to have R call Java 8. I'll continue researching, but would appreciate any tips.

@leeper Just noticed that the README.Rmd mentions:

On Mac OS, you may need to install a particular version of Java prior to attempting to install tabulizer.

You probably want to remove this line, as tabulizer now requires Java 7/8

Found similar issue on rJava GitHub

Managed to force rJava to use 1.8 by running R CMD javareconf JAVA_CPPFLAGS=-I/System/Library/Frameworks/JavaVM.framework/Headers then reinstalling rJava from source with install.packages("rJava", "", type="source")

If you then run into issues loading rJava, with an error similar to: unable to load shared object '/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rJava/libs/':

you then need to link the files together using: sudo ln -f -s $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/local/lib

And finally, it should load and show 1.8, which on my system meant I could run tabulizer successfully(!).

Hi all,

Would there be any way at all of getting tabulizer to work with Java 6? I'm having the above NoSuchMethodError issue, but on Windows with Java 6 installed (unfortunately I can't update Java very easily, as it's a work computer without admin rights). Perhaps an older version of tabulizer and/or tabulizerjars?

Thanks for any help anyone can offer!

@peterblair You can definitely try pulling an older version of tabulizerjars; the versions are all available as releases on GitHub and they are numbered to align with tabula-java release numbers, e.g.: ghit::install_github("ropensci/tabulizerjars@v0.8.0")

Just for info, installing tabulizerjars v0.8.0 works with Java 6 on Windows. Thanks very much!
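
To summarise the thread's findings as code: v0.8.0 of tabulizerjars is reported above to work with Java 6 (on Windows), while the newer jars need Java 7 or 8. The following is a minimal sketch only; `tabulizerjars_ref` is a hypothetical helper, and using the unversioned repository ref for newer Java is an assumption about what you want installed.

```r
# Hypothetical helper: pick a tabulizerjars ref based on the major Java
# version that R is actually using.
tabulizerjars_ref <- function(java_major) {
  stopifnot(is.numeric(java_major), length(java_major) == 1)
  if (java_major <= 6) {
    # Reported in this thread to work with Java 6 on Windows.
    "ropensci/tabulizerjars@v0.8.0"
  } else {
    # Assumption: the default branch is fine for Java 7 or newer.
    "ropensci/tabulizerjars"
  }
}

print(tabulizerjars_ref(6))
print(tabulizerjars_ref(8))

# The chosen ref could then be installed with, for example:
# ghit::install_github(tabulizerjars_ref(6))
```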