Closed theodds closed 8 years ago
It works for me. You left a couple lines out. Otherwise the problem is in your setup.
options(java.parameters="-Xmx1000m")
library("bartMachine")
set_bart_machine_num_cores(4)
data(automobile)
y <- automobile$log_price
X <- automobile; X$log_price <- NULL
bart_machine <- bartMachine(X=X, y=y, use_missing_data = TRUE,
use_missing_data_dummies_as_covars = TRUE)
bartMachine initializing with 50 trees...
Now building bartMachine for regression ...Covariate importance prior ON. Missing data feature ON. Missingness used as covariates.
building BART with mem-cache speedup...
Iteration 100/500 thread: 4
....
Iteration 500/500 thread: 1
done building BART in 2.065 sec
burning and aggregating chains from all threads... done
evaluating in sample data...done
On Tue, Aug 11, 2015 at 6:28 PM, theodds notifications@github.com wrote:
Running the code directly from the vignette, I get the following error when attempting to fit the model with missing covariates.
Error in .jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i, :
java.lang.NullPointerException
Specifically, I ran
library("bartMachine")
options(java.parameters="-Xmx1000m")
set_bart_machine_num_cores(4)
y <- automobile$log_price
X <- automobile; X$log_price <- NULL
bart_machine <- bartMachine(X=X, y=y, use_missing_data = TRUE,
use_missing_data_dummies_as_covars = TRUE)
This is particularly confusing because I ran this code on old versions of the package as well (and got the same error), so I'm unsure whether this is a problem with the package or a problem with my install. For reference, this issue also appears here https://github.com/mlr-org/mlr/issues/422.
— Reply to this email directly or view it on GitHub https://github.com/kapelner/bartMachine/issues/7.
Adam Kapelner, Ph.D. Assistant Professor of Mathematics Queens College, City University of New York 65-30 Kissena Blvd., Kiely Hall Room 604 Flushing, NY, 11367 M: 516-435-6795 kapelner.com (scholar https://scholar.google.com/citations?user=TzgMmnoAAAAJ|research gate http://www.researchgate.net/profile/Adam_Kapelner2|publons https://publons.com/author/431881/adam-kapelner#profile)
I loaded the data, just forgot to include it in this post. Anyways, I suppose this is related to either my R install or JDK/rJava setup, but it seems strange. I only get the problem with missing data, the rest of the vignette material runs fine.
Did you try setting options first to specify the RAM and then loading the package?
From a fresh session I ran your code
#this line goes first
options(java.parameters="-Xmx1000m")
library("bartMachine")
set_bart_machine_num_cores(4)
#and you need to load the data
data(automobile)
y <- automobile$log_price
X <- automobile; X$log_price <- NULL
bart_machine <- bartMachine(X=X, y=y, use_missing_data = TRUE,
use_missing_data_dummies_as_covars = TRUE)
and got the following output
Loading required package: rJava
Loading required package: car
Loading required package: randomForest
randomForest 4.6-10
Type rfNews() to see new features/changes/bug fixes.
Loading required package: missForest
Loading required package: foreach
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more.
http://www.revolutionanalytics.com
Loading required package: itertools
Loading required package: iterators
Welcome to bartMachine v1.2.0! You have 0.93GB memory available.
bartMachine now using 4 cores.
bartMachine initializing with 50 trees...
Error in .jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i, :
java.lang.NullPointerException
Go into your R library folder and delete the folder bartMachine. Then reinstall and try again. Also, move your RAM up to something absurd like 4g if you can.
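A minimal base R sketch of the cleanup step described above. The helper name `leftover_dirs` is hypothetical, and the demo runs against a throwaway temporary library rather than a real one, so no installed package is touched.

```r
# Hypothetical helper: list every folder named `pkg` under the given library paths.
leftover_dirs <- function(pkg, libs = .libPaths()) {
  candidates <- file.path(libs, pkg)
  candidates[dir.exists(candidates)]
}

# Demo on a temporary "library" so that no real installation is modified.
fake_lib <- file.path(tempdir(), "fake_lib")
dir.create(file.path(fake_lib, "bartMachine"), recursive = TRUE, showWarnings = FALSE)

found <- leftover_dirs("bartMachine", libs = fake_lib)
print(found)

# Delete the leftover folder, then confirm that nothing remains.
unlink(found, recursive = TRUE)
remaining <- leftover_dirs("bartMachine", libs = fake_lib)
print(length(remaining))
```

On a real install, one would presumably call `leftover_dirs("bartMachine")` with the default library paths, remove whatever it returns, and then run `install.packages("bartMachine")`. Setting `options(java.parameters = "-Xmx4g")` would still have to come before `library("bartMachine")`.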
I reinstalled the latest release on CRAN (after deleting the package both with remove.packages() and by deleting the folder directly) and get the same error after running
options(java.parameters="-Xmx4000m")
library("bartMachine")
set_bart_machine_num_cores(4)
#and you need to load the data
data(automobile)
y <- automobile$log_price
X <- automobile; X$log_price <- NULL
bart_machine <- bartMachine(X=X, y=y, use_missing_data = TRUE,
use_missing_data_dummies_as_covars = TRUE)
(the only difference here being upping to 4GB of RAM).
Debugging at the point of the error, it looks like (for whatever reason) the error occurs when adding the second entry:
> bart_machine <- bartMachine(X=X, y=y, use_missing_data = TRUE,
+ use_missing_data_dummies_as_covars = TRUE)
bartMachine initializing with 50 trees...
Called from: build_bart_machine(X, y, Xy, num_trees, num_burn_in, num_iterations_after_burn_in,
alpha, beta, k, q, nu, prob_rule_class, mh_prob_steps, debug_log,
run_in_sample, s_sq_y, cov_prior_vec, use_missing_data, covariates_to_permute,
num_rand_samps_in_library, use_missing_data_dummies_as_covars,
replace_missing_data_with_x_j_bar, impute_missingness_with_rf_impute,
impute_missingness_with_x_j_bar_for_lm, mem_cache_for_speed,
serialize, seed, verbose)
Browse[1]> debug: for (i in 1:nrow(model_matrix_training_data)) {
.jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i,
]))
}
Browse[2]>
debug: i
Browse[2]> i
NULL
Browse[2]>
debug: .jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i,
]))
Browse[2]> i
[1] 1
Browse[2]>
debug: i
Browse[2]>
debug: .jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i,
]))
Browse[2]> i
[1] 2
Browse[2]>
Error in .jcall(java_bart_machine, "V", "addTrainingDataRow", as.character(model_matrix_training_data[i, :
java.lang.NullPointerException
Print out model_matrix_training_data and model_matrix_training_data[2, ] to see what's going on there.
The input to addTrainingDataRow for the second observation is similar to that for the first observation (the same, except for the response value); at the point of error, I get:
Browse[2]> as.character(model_matrix_training_data[1, ])
[1] "3" NA "2" "88.6"
[5] "168.8" "64.1" "48.8" "2548"
[9] "4" "130" "3.47" "2.68"
[13] "9" "111" "5000" "21"
[17] "27" "0" "1" "1"
[21] "0" "1" "0" "0"
[25] "0" "0" "0" "0"
[29] "1" "1" "0" "1"
[33] "0" "0" "0" "0"
[37] "0" "0" "0" "0"
[41] "0" "0" "1" "0"
[45] "0" "1" "0" "0"
[49] "0" "0" "9.5100745254521"
Browse[2]> as.character(model_matrix_training_data[2, ])
[1] "3" NA "2" "88.6"
[5] "168.8" "64.1" "48.8" "2548"
[9] "4" "130" "3.47" "2.68"
[13] "9" "111" "5000" "21"
[17] "27" "0" "1" "1"
[21] "0" "1" "0" "0"
[25] "0" "0" "0" "0"
[29] "1" "1" "0" "1"
[33] "0" "0" "0" "0"
[37] "0" "0" "0" "0"
[41] "0" "0" "1" "0"
[45] "0" "1" "0" "0"
[49] "0" "0" "9.71111565988867"
The only thing weird I guess is that the NAs aren't converted to strings, but everything else is.
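The observation about NAs can be reproduced in base R without bartMachine. The sketch below uses a four-column subset of the row printed above (an illustrative cut, not the full model matrix) and shows that `as.character` leaves a missing number as a missing string rather than the text "NA". How rJava then hands that missing string to Java is not shown in this thread, so that part remains an open question here.

```r
# Illustrative subset of the first training row shown in the thread.
row <- c(symboling = 3, normalized_losses = NA, num_doors = 2, wheel_base = 88.6)

as_text <- as.character(row)
print(as_text)

# The missing entry is a true missing string (NA_character_), not the text "NA".
print(is.na(as_text))
print(identical(as_text[2], NA_character_))
print(identical(as_text[2], "NA"))
```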
For completeness, getting rid of as.character I get
Browse[2]> model_matrix_training_data[1, ]
symboling normalized_losses num_doors
3.000000 NA 2.000000
wheel_base length width
88.600000 168.800000 64.100000
height curb_weight num_cylinders
48.800000 2548.000000 4.000000
engine_size bore stroke
130.000000 3.470000 2.680000
compression_ratio horsepower peak_rpm
9.000000 111.000000 5000.000000
city_mpg highway_mpg fuel_type_diesel
21.000000 27.000000 0.000000
fuel_type_gas aspiration_std aspiration_turbo
1.000000 1.000000 0.000000
body_style_convertible body_style_hardtop body_style_hatchback
1.000000 0.000000 0.000000
body_style_sedan body_style_wagon wheel_drive_4wd
0.000000 0.000000 0.000000
wheel_drive_fwd wheel_drive_rwd engine_location_front
0.000000 1.000000 1.000000
engine_location_rear engine_type_dohc engine_type_l
0.000000 1.000000 0.000000
engine_type_ohc engine_type_ohcf engine_type_ohcv
0.000000 0.000000 0.000000
engine_type_rotor fuel_system_1bbl fuel_system_2bbl
0.000000 0.000000 0.000000
fuel_system_4bbl fuel_system_idi fuel_system_mfi
0.000000 0.000000 0.000000
fuel_system_mpfi fuel_system_spdi fuel_system_spfi
1.000000 0.000000 0.000000
M_normalized_losses M_bore M_stroke
1.000000 0.000000 0.000000
M_horsepower M_peak_rpm y_remaining
0.000000 0.000000 9.510075
Browse[2]> model_matrix_training_data[2, ]
symboling normalized_losses num_doors
3.000000 NA 2.000000
wheel_base length width
88.600000 168.800000 64.100000
height curb_weight num_cylinders
48.800000 2548.000000 4.000000
engine_size bore stroke
130.000000 3.470000 2.680000
compression_ratio horsepower peak_rpm
9.000000 111.000000 5000.000000
city_mpg highway_mpg fuel_type_diesel
21.000000 27.000000 0.000000
fuel_type_gas aspiration_std aspiration_turbo
1.000000 1.000000 0.000000
body_style_convertible body_style_hardtop body_style_hatchback
1.000000 0.000000 0.000000
body_style_sedan body_style_wagon wheel_drive_4wd
0.000000 0.000000 0.000000
wheel_drive_fwd wheel_drive_rwd engine_location_front
0.000000 1.000000 1.000000
engine_location_rear engine_type_dohc engine_type_l
0.000000 1.000000 0.000000
engine_type_ohc engine_type_ohcf engine_type_ohcv
0.000000 0.000000 0.000000
engine_type_rotor fuel_system_1bbl fuel_system_2bbl
0.000000 0.000000 0.000000
fuel_system_4bbl fuel_system_idi fuel_system_mfi
0.000000 0.000000 0.000000
fuel_system_mpfi fuel_system_spdi fuel_system_spfi
1.000000 0.000000 0.000000
M_normalized_losses M_bore M_stroke
1.000000 0.000000 0.000000
M_horsepower M_peak_rpm y_remaining
0.000000 0.000000 9.711116
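Rather than comparing the two printouts by eye, the rows can be compared column by column. This is a rough sketch: `diff_columns` is a hypothetical helper, and the two vectors are a small subset of the values printed above, with the response rounded as displayed.

```r
# Hypothetical helper: names of the columns where two rows differ,
# treating a pair of NAs as equal.
diff_columns <- function(a, b) {
  stopifnot(length(a) == length(b))
  same <- (is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b)
  names(a)[!same]
}

# Subset of rows 1 and 2 as printed in the thread.
row1 <- c(symboling = 3, normalized_losses = NA, num_doors = 2, y_remaining = 9.510075)
row2 <- c(symboling = 3, normalized_losses = NA, num_doors = 2, y_remaining = 9.711116)

differing <- diff_columns(row1, row2)
print(differing)
```

In the debugger, the same helper could be applied to `model_matrix_training_data[1, ]` and `model_matrix_training_data[2, ]` directly.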
This makes no sense since I'm seeing the same thing and it works for me. Set debug_log = TRUE and find the java log file - it will be in your workspace or in the bartMachine folder. That should print the exact Java error.
Running with debug_log=TRUE creates two files in my workspace: unnamed.log and unnamed.log.lck, which are both empty files. Nothing new was created in the bartMachine folder.
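A quick way to confirm which log files exist and whether they are empty, sketched in base R. The helper name `log_file_sizes` and the file-name pattern are assumptions made for illustration; the demo creates two empty files in a temporary directory to mimic the situation described above.

```r
# Hypothetical helper: report the size in bytes of each log-like file in a directory.
log_file_sizes <- function(dir = getwd(), pattern = "\\.log(\\.lck)?$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  data.frame(file = basename(files), bytes = file.size(files),
             stringsAsFactors = FALSE)
}

# Demo: two empty files in a temporary directory.
demo_dir <- file.path(tempdir(), "log_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("unnamed.log", "unnamed.log.lck")))

sizes <- log_file_sizes(demo_dir)
print(sizes)
```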
Even stranger...
Can you look at https://cran.r-project.org/web/packages/rJava/rJava.pdf page 18 and figure out how to use .jcheck to return the actual exception to you?
My guess is the error is on line 80 of https://github.com/kapelner/bartMachine/blob/master/src/bartMachine/Classifier.java and I wonder why your setup has this.
It is possible your version of Java has something to do with it. Can you print out java -version for me? You may have to downgrade. I believe our jar is created with version 6. Can you check that too by inspecting bart_java.jar (....\R-3.0.2\library\bartMachine\java)? See http://stackoverflow.com/questions/3313532/what-version-of-javac-built-my-jar for how to do it.
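One possible way to surface the underlying Java exception, sketched under the assumption that rJava and a working JVM are available. It calls `.jcall` with `check = FALSE` and then fetches the pending exception with `.jgetEx`. The deliberately failing `Integer.parseInt` call is only a stand-in for `addTrainingDataRow`, and `last_java_exception` is a hypothetical helper name.

```r
# Hypothetical helper: trigger a Java exception and return its text,
# or NULL when rJava or the JVM is not available.
last_java_exception <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) return(NULL)
  started <- tryCatch({ rJava::.jinit(); TRUE }, error = function(e) FALSE)
  if (!started) return(NULL)

  # check = FALSE leaves the exception pending instead of turning it into an R error.
  rJava::.jcall("java/lang/Integer", "I", "parseInt", "abc", check = FALSE)

  # Fetch the pending exception and clear it so that later calls are not affected.
  ex <- rJava::.jgetEx(clear = TRUE)
  if (is.null(ex)) return(NULL)
  rJava::.jcall(ex, "S", "toString")
}

msg <- last_java_exception()
print(msg)
```

With the default `check = TRUE`, `.jcall` performs its own exception check, so a separate `.jcheck()` call afterwards may find nothing pending.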
Running java -version gives
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)
Using the method of jikes.thunderbolt in the stackoverflow answer, I got that my jars correspond to "major version: 50", which apparently corresponds to Java 6.
I tried .jcheck(), but it didn't seem to do anything. I will probably try to see if I can get this working on another machine. I'll also try compiling the .jar files from source from github again.
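For reference, the "major version: 50" reading comes from the header of a compiled class file: four magic bytes (CA FE BA BE), two minor-version bytes, then two major-version bytes, all big-endian. The sketch below decodes such a header in base R. The helper name `class_file_java_version` is hypothetical, and the eight bytes are a hand-written example rather than bytes read from bart_java.jar.

```r
# Hypothetical helper: decode the first eight bytes of a Java class file.
class_file_java_version <- function(header) {
  stopifnot(is.raw(header), length(header) >= 8)
  magic <- as.raw(c(0xCA, 0xFE, 0xBA, 0xBE))
  if (!identical(header[1:4], magic)) stop("not a Java class file")
  # Bytes 7 and 8 hold the major version, most significant byte first.
  major <- as.integer(header[7]) * 256L + as.integer(header[8])
  # For Java 5 and later, the release number is the major version minus 44.
  list(major = major, java = major - 44L)
}

# Hand-written header with major version 0x0032, i.e. 50.
header <- as.raw(c(0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x32))
v <- class_file_java_version(header)
print(v)
```

On a real install, the eight bytes could be obtained with `readBin` on a connection opened with `unz` against a class entry inside the jar; the entry name depends on the jar's layout and is not given in this thread.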
How about this... delete the bartMachine library folder. Then do a "git clone" on the source repository and then "ant" and then do an "R CMD INSTALL bartMachine"
It is working now. What fixed it was uninstalling rJava and instead installing it with sudo apt-get install r-cran-rjava, and then installing bartMachine from source. I guess installing rJava using install.packages("rJava") was the issue?
Thanks for your help!
It is possible you were using an old version of rJava... beats me. Glad it's fixed. And glad this thread is available online for all to see who have a similar problem. Have fun using bartMachine...
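For anyone checking a similar setup, a small base R sketch that reports which copy of a package R would load and its version, without loading it. `describe_install` is a hypothetical helper; the demo uses the always-present stats package, and on an affected machine one would presumably pass "rJava" instead.

```r
# Hypothetical helper: where a package is installed and which version it is.
describe_install <- function(pkg) {
  path <- find.package(pkg, quiet = TRUE)
  if (length(path) == 0) {
    return(data.frame(package = pkg, version = NA_character_,
                      path = NA_character_, stringsAsFactors = FALSE))
  }
  # Read the version from the library that actually contains this copy.
  version <- as.character(packageVersion(pkg, lib.loc = dirname(path[1])))
  data.frame(package = pkg, version = version, path = path[1],
             stringsAsFactors = FALSE)
}

info <- describe_install("stats")
print(info)

missing_info <- describe_install("noSuchPackageXyz123")
print(missing_info)
```

The reported path indicates which library the copy lives in, which may help tell a distribution-packaged install apart from one installed through `install.packages`.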