ClimbsRocks / machineJS

[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
https://github.com/ClimbsRocks/auto_ml

Issue after data-formatter #154

Closed kprimice closed 8 years ago

kprimice commented 8 years ago

I wanted to try machineJS, but I wasn't able to make it work at all. I can't understand why I keep getting these errors:

thanks for inviting us along on your machine learning journey!

message from Python: finished concatting the training and testing files together
message from Python: finished joining the data
message from Python: finished removing non-unique categorical values
message from Python: finished imputing missing values
message from Python: finished grouping by ID if relevant
message from Python: finished vectorizing the categorical values
message from Python: here are the features that were kept, sorted by their feature importance
message from Python: [ [ 'amp', 0.1436 ], ... ... ... [ 't27', 0.0122 ] ]
message from Python: total time for the random forest part of feature selection, in minutes:
message from Python: 0
message from Python: finished running feature selecting
message from Python: successfully turned y into a sparse matrix!
message from Python: we have written your fully transformed data to a folder at:
message from Python: /opt/machineJS/pySetup/data-formatterResults
heard an error!
{ [Error: /usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2645: VisibleDeprecationWarning: rank is deprecated; use the ndim attribute or function instead. To find the rank of a matrix see numpy.linalg.matrix_rank.
  VisibleDeprecationWarning)
]
  executable: 'python',
  options: null,
  script: '/opt/machineJS/node_modules/data-formatter/mainPythonProcess.py',
  args: [ '{"trainingData":"jstrain.csv","testingData":"jstest.csv","trainingPrettyName":"jstrain","testingPrettyName":"jstest","joinFileName":"","on":false,"allFeatureCombinations":false,"keepAllFeatures":false,"outputFolder":"/opt/machineJS/pySetup/data-formatterResults","test":false,"verbose":1,"join":false}' ],
  exitCode: 0 }
Here are the fileNames from data-formatter. If you want to skip the data-formatter part next time you want to play with this dataset, copy and paste this object into machineJS/pySetup/testingFileNames.js, following the instructions included in that file.
{ idHeader: 'id',
  outputHeader: 'target',
  id_train: '/opt/machineJS/pySetup/data-formatterResults/id_train_jstrain.npz',
  id_test: '/opt/machineJS/pySetup/data-formatterResults/id_test_jstestjstrain.npz',
  y_train: '/opt/machineJS/pySetup/data-formatterResults/y_train_jstrain.npz',
  validation_split_column: '/opt/machineJS/pySetup/data-formatterResults/validation_split_column_jstrain.npz',
  hasCustomValidationSplit: false,
  X_test: '/opt/machineJS/pySetup/data-formatterResults/X_test_jstestjstrain.npz',
  X_train: '/opt/machineJS/pySetup/data-formatterResults/X_train_jstrain.npz',
  X_train_nn: '/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstrain.npz',
  y_train_nn: '/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstrain.npz',
  X_test_nn: '/opt/machineJS/pySetup/data-formatterResults/X_test_nn_jstestjstrain.npz',
  testingDataLength: 4619,
  trainingDataLength: 3181,
  problemType: 'multi-category' }
{ [Error: /usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py:43: DeprecationWarning: This module has been deprecated in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py:43: DeprecationWarning: This module has been deprecated in favor of the model_selection module into which all the refactored classes and functions are moved. This module will be removed in 0.20.
  DeprecationWarning)
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2645: VisibleDeprecationWarning: rank is deprecated; use the ndim attribute or function instead. To find the rank of a matrix see numpy.linalg.matrix_rank.
  VisibleDeprecationWarning)
Traceback (most recent call last):
  File "/opt/machineJS/pySetup/training.py", line 177, in <module>
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=testSize, random_state=0)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1918, in train_test_split
    safe_indexing(a, test)) for a in arrays))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1918, in <genexpr>
    safe_indexing(a, test)) for a in arrays))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/__init__.py", line 112, in safe_indexing
    return X[indices]
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/csr.py", line 256, in __getitem__
    P = extractor(row, self.shape[0]) # [[1,2],j] or [[1,2],1:2]
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/csr.py", line 214, in extractor
    (min_indx,max_indx) = check_bounds(indices,N)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/csr.py", line 198, in check_bounds
    max_indx = indices.max()
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/_methods.py", line 26, in _amax
    return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
]
  executable: 'python',
  options: null,
  script: '/opt/machineJS/pySetup/training.py',
  args: [ '/opt/machineJS/jstrain.csv',
    '{"":["/opt/machineJS/machineJS.js","jstrain.csv"],"predict":"jstest.csv","dev":false,"computerTotalCPUs":8,"machineJSLocation":"/opt/machineJS","dataFile":"jstrain.csv","dataFileName":"jstrain.csv","dataFilePretty":"jstrain","binaryOutput":false,"outputFileName":"jstrain","join":"","on":"","allFeatureCombinations":"","keepAllFeatures":"","dfOutputFolder":"/opt/machineJS/pySetup/data-formatterResults","matrixOutput":"","testFileName":"jstest.csv","testFilePretty":"jstest","testOutputFileName":"jstest","searchPercent":0.3,"validationPercent":0.3,"numRounds":10,"numIterationsPerRound":10,"predictionsFolder":"/opt/machineJS/predictions/jstest","validationFolder":"/opt/machineJS/predictions/jstest/validation","bestClassifiersFolder":"/opt/machineJS/pySetup/bestClassifiers/jstrain","ensemblerOutputFolder":"/opt/machineJS","validationRound":false,"ensemblerArgs":{"inputFolder":"/opt/machineJS/predictions/jstest","outputFolder":"/opt/machineJS","validationFolder":"/opt/machineJS/predictions/jstest/validation","fileNameIdentifier":"jstrain","validationRound":true},"numCPUs":5,"longTrainThreshold":0.97,"continueToTrainThreshold":0.97,"alreadyFormatted":false,"fileNames":{"idHeader":"id","outputHeader":"target","id_train":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstrain.npz","id_test":"/opt/machineJS/pySetup/data-formatterResults/id_test_jstestjstrain.npz","y_train":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstrain.npz","validation_split_column":"/opt/machineJS/pySetup/data-formatterResults/validation_split_column_jstrain.npz","hasCustomValidationSplit":false,"X_test":"/opt/machineJS/pySetup/data-formatterResults/X_test_jstestjstrain.npz","X_train":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstrain.npz","X_train_nn":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstrain.npz","y_train_nn":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstrain.npz","X_test_nn":"/opt/machineJS/pySetup/data-formatterResults/X_test_nn_jstestjstrain.npz","testingDataLength":4619,"trainingDataLength":3181,"problemType":"multi-category","X_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstraintrainingData.npz","X_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstrainvalidationData.npz","id_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstraintrainingData.npz","id_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstrainvalidationData.npz","y_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstrainvalidationData.npz","y_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstraintrainingData.npz","X_train_nntrainingData":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstraintrainingData.npz","X_train_nnvalidationData":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstrainvalidationData.npz","y_train_nntrainingData":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstraintrainingData.npz","y_train_nnvalidationData":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstrainvalidationData.npz"}}',
    '{"idHeader":"id","outputHeader":"target","id_train":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstrain.npz","id_test":"/opt/machineJS/pySetup/data-formatterResults/id_test_jstestjstrain.npz","y_train":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstrain.npz","validation_split_column":"/opt/machineJS/pySetup/data-formatterResults/validation_split_column_jstrain.npz","hasCustomValidationSplit":false,"X_test":"/opt/machineJS/pySetup/data-formatterResults/X_test_jstestjstrain.npz","X_train":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstrain.npz","X_train_nn":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstrain.npz","y_train_nn":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstrain.npz","X_test_nn":"/opt/machineJS/pySetup/data-formatterResults/X_test_nn_jstestjstrain.npz","testingDataLength":4619,"trainingDataLength":3181,"problemType":"multi-category","X_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstraintrainingData.npz","X_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/X_train_jstrainvalidationData.npz","id_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstraintrainingData.npz","id_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/id_train_jstrainvalidationData.npz","y_trainvalidationData":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstrainvalidationData.npz","y_traintrainingData":"/opt/machineJS/pySetup/data-formatterResults/y_train_jstraintrainingData.npz","X_train_nntrainingData":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstraintrainingData.npz","X_train_nnvalidationData":"/opt/machineJS/pySetup/data-formatterResults/X_train_nn_jstrainvalidationData.npz","y_train_nntrainingData":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstraintrainingData.npz","y_train_nnvalidationData":"/opt/machineJS/pySetup/data-formatterResults/y_train_nn_jstrainvalidationData.npz"}',
    'clRfGini',
    'multi-category',
    0 ],
  exitCode: 1 }
kicking off the process of making predictions on the predicting data set for: clRfGini
we heard an unexpected shutdown event that is causing everything to close
/opt/machineJS/shutDown.js:19
throw error;
^

TypeError: Cannot read property 'longTrainScore' of undefined
    at startPredictionsScript (/opt/machineJS/pySetup/utils.js:129:58)
    at Object.module.exports.makePredictions (/opt/machineJS/pySetup/utils.js:144:5)
    at Object.module.exports.makePredictions (/opt/machineJS/pySetup/controllerPython.js:142:11)
    at /opt/machineJS/pySetup/controllerPython.js:32:24
    at emitFinishedTrainingCallback (/opt/machineJS/pySetup/utils.js:87:7)
    at /opt/machineJS/pySetup/utilsPyShell.js:60:7
    at null._endCallback (/opt/machineJS/node_modules/python-shell/index.js:148:25)
    at ChildProcess.<anonymous> (/opt/machineJS/node_modules/python-shell/index.js:99:35)
    at emitTwo (events.js:100:13)
    at ChildProcess.emit (events.js:185:7)
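
A note on reading the output above: two objects are reported as errors, but they are not equally serious. The first has exitCode: 0 and its stderr contains only a VisibleDeprecationWarning, while the second has exitCode: 1 and carries a Python traceback. The sketch below shows one way to tell the two apart, assuming the stderr text and the exit code are available as plain values. `classify_stderr` is a hypothetical helper written for illustration; it is not part of machineJS or python-shell.

```python
import re

# Matches Python warning category names such as DeprecationWarning
# or VisibleDeprecationWarning.
WARNING_PATTERN = re.compile(r"\b\w*Warning\b")

TRACEBACK_MARKER = "Traceback (most recent call last)"


def classify_stderr(stderr_text, exit_code):
    """Hypothetical helper: label a child process result.

    Returns "failure" when the process exited non-zero or printed a
    traceback, "warnings only" when stderr holds warnings but the
    process exited cleanly, and "clean" otherwise.
    """
    if exit_code != 0 or TRACEBACK_MARKER in stderr_text:
        return "failure"
    if WARNING_PATTERN.search(stderr_text):
        return "warnings only"
    return "clean"


if __name__ == "__main__":
    # Shortened excerpts of the two stderr payloads from the log above.
    first = "VisibleDeprecationWarning: rank is deprecated; use the ndim attribute"
    second = (
        "Traceback (most recent call last): ... "
        "ValueError: zero-size array to reduction operation maximum which has no identity"
    )
    print(classify_stderr(first, 0))
    print(classify_stderr(second, 1))
```

Read this way, the data-formatter step appears to have finished normally, and the failure to look at is the ValueError raised inside training.py.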

kprimice commented 8 years ago

Problem solved: my scipy wasn't up to date...
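
For anyone who lands here with the same traceback, it may help to check which versions are installed, and where they are loaded from, before upgrading. In the traceback above, scipy is imported from /usr/lib/python2.7/dist-packages while numpy and sklearn come from /usr/local/lib/python2.7/dist-packages, which suggests they were installed by different means. The following is only a sketch: `installed_versions` is a hypothetical helper, and the default package list is an assumption based on the paths in that traceback.

```python
import importlib


def installed_versions(names=("numpy", "scipy", "sklearn")):
    """Hypothetical helper: map each package name to (version, file path).

    A package that cannot be imported maps to None.
    """
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Not every module defines these attributes, so fall back to a label.
        version = getattr(module, "__version__", "unknown")
        path = getattr(module, "__file__", "unknown")
        report[name] = (version, path)
    return report


if __name__ == "__main__":
    for name, info in installed_versions().items():
        if info is None:
            print(name, "is not installed")
        else:
            print(name, info[0], "loaded from", info[1])
```

Run it with the same interpreter that machineJS launches (the log shows executable: 'python'), since a different interpreter can see a different set of packages.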

ClimbsRocks commented 8 years ago

Thanks for letting me know!