Kevin Normoyle commented: I'm making this a NOPASS test tonight:
runit_GBM_groupsplit_czechboard_medium.R
There's enough other confusion in moving to the new Jenkins jobs for high throughput that we need all green; it's too hard to debug other infrastructure issues if the tests fail intermittently.
I don't know whether this was introduced today (I think it was introduced earlier today).
I can change it back tomorrow if you like.
I noticed I did get a pass, then it failed with no pushes,
so maybe it's something intermittent. It fails a lot now (all the time?).
The other tests in the job don't fail.
It's running on a machine with 128 GB of DRAM, with a single cloud of 5 JVMs of 20 GB each.
Wiping output directory...
Wiping test state (including random seeds)...
Wiping output directory...
Wiping test state (including random seeds)...
Starting clouds...
Waiting for H2O nodes to come up...
H2O Cloud 0 Node 0 started with output file /home4/jenkins/slave_dir_from_mr-0xe4/workspace/h2o_master_DEV_runit_medium_large/h2o-r/tests/results/java_0_0.out.txt
H2O Cloud 0 Node 1 started with output file /home4/jenkins/slave_dir_from_mr-0xe4/workspace/h2o_master_DEV_runit_medium_large/h2o-r/tests/results/java_0_1.out.txt
H2O Cloud 0 Node 2 started with output file /home4/jenkins/slave_dir_from_mr-0xe4/workspace/h2o_master_DEV_runit_medium_large/h2o-r/tests/results/java_0_2.out.txt
H2O Cloud 0 Node 3 started with output file /home4/jenkins/slave_dir_from_mr-0xe4/workspace/h2o_master_DEV_runit_medium_large/h2o-r/tests/results/java_0_3.out.txt
H2O Cloud 0 Node 4 started with output file /home4/jenkins/slave_dir_from_mr-0xe4/workspace/h2o_master_DEV_runit_medium_large/h2o-r/tests/results/java_0_4.out.txt
Setting up R H2O package...
Starting 25 tests on 1 clouds with 5 total H2O nodes...
Arno Candel commented: Can you please check this again?
Sebastian Czarnota commented: This test had been removed; I'm adding it back in. It now passes.
JIRA Issue Migration Info
Jira Issue: PUBDEV-347
Assignee: UNASSIGNED
Reporter: Kevin Normoyle
State: Resolved
Fix Version: N/A
Attachments: N/A
Development PRs: N/A
I made the test NOPASS after this failure (it happens repeatedly):
http://mr-0xe4:8080/job/h2o_master_DEV_runit_medium_large/1022/console
http://mr-0xe4:8080/job/h2o_master_DEV_runit_medium_large/1022/artifact/h2o-r/tests/results/testdir_algos_gbm_runit_GBM_groupsplit_czechboard_medium.R.out.txt
The R output is pasted below. Snippet:
[2015-02-03 00:05:32] [ERROR] : Error: Test failed: 'GBM Test: Classification with Checkerboard Group Split'
Not expected: Unexpected HTTP Status code: 500 Internal Server Error (url = http://127.0.0.1:44000/LATEST/ModelMetrics.json/models/GBMModel__890883e5e1781e09e3b5dddcbe194aa8/frames/board.hex)
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls)
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: withWarnings(test(conn))
5: withCallingHandlers(expr, warning = wHandler)
6: test(conn)
7: h2o.performance(drfmodel.nogrp, board.hex)
8: .h2o.remoteSend(model@conn, method = "POST", .h2o.MODEL_METRICS(model@key, data@key), .params = parms)
9: .h2o.fromJSON(.h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method))
10: processMatrices(fromJSON(txt, ...))
11: fromJSON(txt, ...)
12: .h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method)
13: stop(sprintf("Unexpected HTTP Status code: %d %s (url = %s)", rv$httpStatusCode, rv$httpStatusMessage, rv$url))
14: .handleSimpleError(function (e) {
    e$calls <- head(sys.calls()[-seq_len(frame + 7)], -2)
    signalCondition(e)
}, "Unexpected HTTP Status code: 500 Internal Server Error (url = http://127.0.0.1:44000/LATEST/ModelMetrics.json/models/GBMModel__890883e5e1781e09e3b5dddcbe194aa8/frames/board.hex)", quote(.h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method))).
SUMMARY OF RESULTS
Total tests: 25
Passed: 10
Did not pass: 15
Did not complete: 0
Tolerated NOPASS: 14
Total time: 1584.36 sec
Time/completed test: 63.37 sec
True fail list: runit_GBM_groupsplit_czechboard_medium.R
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording fingerprints
Extended Email Publisher is currently disabled in project settings
Finished: FAILURE
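The summary counts above reconcile with the single entry in the true fail list: 15 tests did not pass, 14 of those are tolerated NOPASS tests, which leaves 1 real failure. A minimal Python sketch of that arithmetic follows. The function name `parse_summary` and the regular expression are illustrative assumptions, not part of the H2O test runner.

```python
import re

# Summary line copied from the Jenkins console output above.
SUMMARY = ("Total tests: 25 Passed: 10 Did not pass: 15 "
           "Did not complete: 0 Tolerated NOPASS: 14")


def parse_summary(line):
    """Return a dict mapping each summary label to its integer count.

    Hypothetical helper for illustration; the real runner may differ.
    """
    pattern = (r"(Total tests|Passed|Did not pass|Did not complete|"
               r"Tolerated NOPASS):\s*(\d+)")
    return {name: int(value) for name, value in re.findall(pattern, line)}


counts = parse_summary(SUMMARY)

# Tests that failed and are NOT covered by the NOPASS tolerance.
true_failures = counts["Did not pass"] - counts["Tolerated NOPASS"]

# Sanity check: the three outcome buckets add up to the total.
accounted = (counts["Passed"] + counts["Did not pass"]
             + counts["Did not complete"])

print("true failures:", true_failures)
print("all tests accounted for:", accounted == counts["Total tests"])
```

Once the failing test is itself marked NOPASS, the same subtraction would give 0, which is what lets the job go green while the bug is investigated.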
R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Attaching package: ‘R.oo’
The following objects are masked from ‘package:methods’:
The following objects are masked from ‘package:base’:
R.utils v1.34.0 (2014-10-07) successfully loaded. See ?R.utils for help.
Attaching package: ‘R.utils’
The following object is masked from ‘package:utils’:
The following objects are masked from ‘package:base’:
[2015-02-03 00:04:58] [INFO]: ============== Setting up R-Unit environment... ================
[2015-02-03 00:04:58] [INFO]: Check that H2O R package matches version on server
Loading required package: RCurl
Loading required package: bitops
Attaching package: ‘RCurl’
The following object is masked from ‘package:R.utils’:
The following object is masked from ‘package:R.oo’:
Loading required package: rjson
Loading required package: tools
Loading required package: statmod
Your next step is to start H2O and get a connection object (named 'localH2O', for example):
For H2O package documentation, ask for help:
After starting H2O, you can use the Web UI at http://localhost:54321
For more information visit http://docs.0xdata.com
Successfully connected to http://127.0.0.1:44000/
R is connected to H2O cluster:
    H2O cluster uptime: 22 minutes 50 seconds
    H2O cluster version: 0.1.23.99999
    H2O cluster name: H2O_runit_jenkins_9296245
    H2O cluster total nodes: 5
    H2O cluster total memory: 88.89 GB
    H2O cluster total cores: 120
    H2O cluster allowed cores: 120
    H2O cluster healthy: TRUE
[2015-02-03 00:05:00] [INFO]: Checking Package dependencies for this test.
[2015-02-03 00:05:00] [INFO]: Loading RUnit and testthat and R.utils
Loading required package: RUnit
Loading required package: testthat
Attaching package: ‘testthat’
The following object is masked from ‘package:R.oo’:
[2015-02-03 00:05:00] [INFO]: Loading other required test packages
Loading required package: glmnet
Loading required package: Matrix
Loaded glmnet 1.9-8
Loading required package: gbm
Loading required package: survival
Loading required package: splines
Loading required package: lattice
Loading required package: parallel
Loaded gbm 2.1
Loading required package: ROCR
Loading required package: gplots
Attaching package: ‘gplots’
The following object is masked from ‘package:stats’:
[INFO]: Using SEED: 1959759076
[SEED] : 1959759076
Appending REST API transactions to log file ./Rsandbox_runit_GBM_groupsplit_czechboard_medium.R/rest.log
[2015-02-03 00:05:05] [INFO]: Importing czechboard_300x300.csv data...
[2015-02-03 00:05:07] [INFO]: Summary of czechboard_300x300.csv from H2O:
C1    C2    C3
a:300 a:300 0 :45000
b:300 b:300 1 :45000
c:300 c:300
d:300 d:300
e:300 e:300
f:300 f:300
[2015-02-03 00:05:08] [INFO]: H2O GBM (Naive Split) with parameters: ntrees = 50, max_depth = 20, nbins = 500
H2OBinomialModel: gbm
Model Details:
Mean Square Error for Train Frame
 [1] 0.2500000 0.2493672 0.2488946 0.2486529 0.2484250 0.2482345 0.2480668
 [8] 0.2478935 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[15] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.2456820 0.0000000
[22] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[29] 0.0000000 0.0000000 0.0000000 0.2433153 0.0000000 0.0000000 0.0000000
[36] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[43] 0.0000000 0.2411938 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[50] 0.0000000 0.2400036
Mean Square Error for Validation Frame
[,1]
[1,] "NaN"
[2,] "NaN"
[3,] "NaN"
[4,] "NaN"
[5,] "NaN"
[6,] "NaN"
[7,] "NaN"
[8,] "NaN"
[9,] "0"
[10,] "0"
[11,] "0"
[12,] "0"
[13,] "0"
[14,] "0"
[15,] "0"
[16,] "0"
[17,] "0"
[18,] "0"
[19,] "0"
[20,] "NaN"
[21,] "0"
[22,] "0"
[23,] "0"
[24,] "0"
[25,] "0"
[26,] "0"
[27,] "0"
[28,] "0"
[29,] "0"
[30,] "0"
[31,] "0"
[32,] "NaN"
[33,] "0"
[34,] "0"
[35,] "0"
[36,] "0"
[37,] "0"
[38,] "0"
[39,] "0"
[40,] "0"
[41,] "0"
[42,] "0"
[43,] "0"
[44,] "NaN"
[45,] "0"
[46,] "0"
[47,] "0"
[48,] "0"
[49,] "0"
[50,] "0"
[51,] "NaN"
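One pattern in the two MSE printouts above is easy to miss by eye: the positions where the training MSE is non-zero are exactly the positions where the validation column prints "NaN". The Python sketch below checks this using values transcribed from the printout. Why the pattern arises (for example, whether only those iterations were actually scored) is not stated in this thread, so the sketch only confirms the observation; all variable names are illustrative.

```python
import math

# Non-zero training MSE entries, keyed by their 1-based position in the
# printout above; every position not listed here was printed as 0.0000000.
NONZERO_TRAIN_MSE = {
    1: 0.2500000, 2: 0.2493672, 3: 0.2488946, 4: 0.2486529,
    5: 0.2484250, 6: 0.2482345, 7: 0.2480668, 8: 0.2478935,
    20: 0.2456820, 32: 0.2433153, 44: 0.2411938, 51: 0.2400036,
}

# 1-based positions printed as "NaN" in the validation column.
VALIDATION_NAN_POSITIONS = {1, 2, 3, 4, 5, 6, 7, 8, 20, 32, 44, 51}

N_ENTRIES = 51  # both printouts have 51 entries

train_mse = [NONZERO_TRAIN_MSE.get(i, 0.0) for i in range(1, N_ENTRIES + 1)]
valid_mse = [math.nan if i in VALIDATION_NAN_POSITIONS else 0.0
             for i in range(1, N_ENTRIES + 1)]

train_nonzero = [i for i, v in enumerate(train_mse, start=1) if v != 0.0]
valid_nan = [i for i, v in enumerate(valid_mse, start=1) if math.isnan(v)]

# Spacing between the non-zero positions after the initial run 1..8.
gaps = [b - a for a, b in zip(train_nonzero[7:], train_nonzero[8:])]

print("positions match:", train_nonzero == valid_nan)
print("non-zero positions:", train_nonzero)
print("gaps after position 8:", gaps)
```

The non-zero training values also decrease monotonically, so the zeros look more like unfilled slots than genuine zero-error iterations, though that reading is an assumption.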
[2015-02-03 00:05:32] [ERROR] : Error: Test failed: 'GBM Test: Classification with Checkerboard Group Split'
Not expected: Unexpected HTTP Status code: 500 Internal Server Error (url = http://127.0.0.1:44000/LATEST/ModelMetrics.json/models/GBMModel__890883e5e1781e09e3b5dddcbe194aa8/frames/board.hex)
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls)
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: withWarnings(test(conn))
5: withCallingHandlers(expr, warning = wHandler)
6: test(conn)
7: h2o.performance(drfmodel.nogrp, board.hex)
8: .h2o.remoteSend(model@conn, method = "POST", .h2o.MODEL_METRICS(model@key, data@key), .params = parms)
9: .h2o.fromJSON(.h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method))
10: processMatrices(fromJSON(txt, ...))
11: fromJSON(txt, ...)
12: .h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method)
13: stop(sprintf("Unexpected HTTP Status code: %d %s (url = %s)", rv$httpStatusCode, rv$httpStatusMessage, rv$url))
14: .handleSimpleError(function (e) {
    e$calls <- head(sys.calls()[-seq_len(frame + 7)], -2)
    signalCondition(e)
}, "Unexpected HTTP Status code: 500 Internal Server Error (url = http://127.0.0.1:44000/LATEST/ModelMetrics.json/models/GBMModel__890883e5e1781e09e3b5dddcbe194aa8/frames/board.hex)", quote(.h2o.doSafeREST(conn = conn, urlSuffix = page, parms = .params, method = method))).
SEED used: 1959759076
[2015-02-03 00:05:32] [ERROR] : TEST FAILED
No traceback available