ClimbsRocks / machineJS

[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
https://github.com/ClimbsRocks/auto_ml

Cannot find pySetup/utils.py #149

Open eanie opened 8 years ago

eanie commented 8 years ago

Whenever I try to run anything (machineJS or the tests), I get the following:

sospan-2:machineJS justin$ npm run test:regression

machinejs@0.9.1 test:regression /Users/justin/DeepLearning/machineJS/machineJS
mocha test/regression/test.js

regression problems
module.js:339
    throw err;
    ^

Error: Cannot find module 'pySetup/utils.js'
    at Function.Module._resolveFilename (module.js:337:15)
    at Function.Module._load (module.js:287:25)
    at Module.require (module.js:366:17)
    at require (module.js:385:17)
    at Object.<anonymous> (/Users/justin/DeepLearning/machineJS/machineJS/node_modules/ensembler/node_modules/machinejs/processArgs.js:3:13)
    at Module._compile (module.js:435:26)
    at Object.Module._extensions..js (module.js:442:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:311:12)
    at Module.require (module.js:366:17)
    at require (module.js:385:17)
    1) "before all" hook

0 passing (326ms)
1 failing

1) regression problems "before all" hook:
Command failed: node machineJS.js /Users/justin/DeepLearning/machineJS/machineJS/node_modules/data-for-tests/rossman/tinyTrain.csv --predict /Users/justin/DeepLearning/machineJS/machineJS/node_modules/data-for-tests/rossman/test.csv --join /Users/justin/DeepLearning/machineJS/machineJS/node_modules/data-for-tests/rossman/store.csv --dfOutputFolder /Users/justin/DeepLearning/machineJS/machineJS/test/regression/dfTestResults --predictionsFolder /Users/justin/DeepLearning/machineJS/machineJS/test/regression/rTestPredictions --ensemblerOutputFolder /Users/justin/DeepLearning/machineJS/machineJS/test/regression --bestClassifiersFolder /Users/justin/DeepLearning/machineJS/machineJS/test/regression/bestClassifiersTest
module.js:339
    throw err;
    ^

I'm probably doing something stupid during the install, but I've been banging my head against a wall for the past 24 hours. Any ideas why it is not picking up the pySetup modules?

ClimbsRocks commented 8 years ago

Oh man, gotta love progress. The new version of npm installs all dependencies in a single flat folder, rather than nesting them as folders within each individual sub-dependency. Better overall, but it apparently screws things up when you have circular dependencies like here.

I should have a fix out this morning. Thanks for reporting this!

ClimbsRocks commented 8 years ago

This ended up being a combination of two npm issues: the new file structure referenced above, and npm not installing the latest versions of ensembler and data-formatter. The latest versions should have this handled, so I forced npm to install the latest versions of these packages, rather than hardcoding version numbers.

Let me know if you run into any other issues! I'm running through everything this morning to make sure it's all compatible with this fix.

Thanks again for reporting this! I'd love any other feedback/experiences you have.

eanie commented 8 years ago

Thanks for taking a look so quickly. However, I just retried the install from scratch and reran the test, and I see this in the npm debug log again:

0 info it worked if it ends with ok
1 verbose cli [ '/usr/local/bin/node',
1 verbose cli   '/usr/local/bin/npm',
1 verbose cli   'run',
1 verbose cli   'test:regression' ]
2 info using npm@2.14.12
3 info using node@v4.3.1
4 verbose run-script [ 'pretest:regression',
4 verbose run-script   'test:regression',
4 verbose run-script   'posttest:regression' ]
5 info pretest:regression machinejs@0.9.4
6 info test:regression machinejs@0.9.4
7 verbose unsafe-perm in lifecycle true
8 info machinejs@0.9.4 Failed to exec test:regression script
9 verbose stack Error: machinejs@0.9.4 test:regression: mocha test/regression/test.js
9 verbose stack Exit status 1
9 verbose stack     at EventEmitter.<anonymous> (/usr/local/lib/node_modules/npm/lib/utils/lifecycle.js:214:16)
9 verbose stack     at emitTwo (events.js:87:13)
9 verbose stack     at EventEmitter.emit (events.js:172:7)
9 verbose stack     at ChildProcess.<anonymous> (/usr/local/lib/node_modules/npm/lib/utils/spawn.js:24:14)
9 verbose stack     at emitTwo (events.js:87:13)
9 verbose stack     at ChildProcess.emit (events.js:172:7)
9 verbose stack     at maybeClose (internal/child_process.js:821:16)
9 verbose stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:211:5)
10 verbose pkgid machinejs@0.9.4
11 verbose cwd /Users/justin/DeepLearning/machineJS/machineJS
12 error Darwin 14.5.0
13 error argv "/usr/local/bin/node" "/usr/local/bin/npm" "run" "test:regression"
14 error node v4.3.1
15 error npm v2.14.12
16 error code ELIFECYCLE
17 error machinejs@0.9.4 test:regression: mocha test/regression/test.js
17 error Exit status 1
18 error Failed at the machinejs@0.9.4 test:regression script 'mocha test/regression/test.js'.
18 error This is most likely a problem with the machinejs package,
18 error not with npm itself.
18 error Tell the author that this fails on your system:
18 error     mocha test/regression/test.js
18 error You can get their info via:
18 error     npm owner ls machinejs
18 error There is likely additional logging output above.
19 verbose exit [ 1, true ]

DanielAndreasen commented 8 years ago

I have the same error, and I just downloaded (git clone ....) less than 30min ago. The error I get is the Error: Cannot find module 'pySetup/utils.js'

ClimbsRocks commented 8 years ago

Thanks for the feedback!

@eanie: it looks like for some reason your install is attempting to run the test suite. I haven't refactored the test suite to match my last big code refactor, so the test will fail.

One of the benefits of being the sole author is that I know every nook and cranny of this project so well I can get away with that at the moment (though it's at the top of my list to fix up!). I'll go through today and comment out all the broken tests. In the meantime, you should still be able to run the program just fine!

@DanielAndreasen: I'll look into that right now! Do you mind telling me which version of npm/node you're using, as well as the full error log? There are two places that message could be coming from.

alper-t commented 8 years ago

I have the same issue.

module.js:341
    throw err;
    ^

Error: Cannot find module 'pySetup\utils.js'
    at Function.Module._resolveFilename (module.js:339:15)
    at Function.Module._load (module.js:290:25)
    at Module.require (module.js:367:17)
    at require (internal/module.js:16:19)
    at Object.<anonymous> (C:\@projects\pycoder\machinejs\machineJS\processArgs.js:3:13)
    at Module._compile (module.js:413:34)
    at Object.Module._extensions..js (module.js:422:10)
    at Module.load (module.js:357:32)
    at Function.Module._load (module.js:314:12)
    at Module.require (module.js:367:17)

My npm version:

npm@1.4.9

DanielAndreasen commented 8 years ago

Sure.

$ npm -V
npm@2.11.2 /usr/local/lib/node_modules/npm

And the error message

$ node machineJS.js node_modules/data-for-tests/rossman/train.csv --predict node_modules/data-for-tests/rossman/test.csv
module.js:338
throw err;
      ^
Error: Cannot find module 'pySetup/utils.js'
    at Function.Module._resolveFilename (module.js:336:15)
    at Function.Module._load (module.js:278:25)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/home/daniel/GIT/machineJS/processArgs.js:3:13)
    at Module._compile (module.js:460:26)
    at Object.Module._extensions..js (module.js:478:10)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)

eanie commented 8 years ago

Thanks for the heads up. I'll give it a whirl and let you know. Good work on this. I started on FANN, and my current system is on Encog, so I'm interested to see how this goes.

ClimbsRocks commented 8 years ago

Alright, let me know if it works for you now, @DanielAndreasen! I haven't published to npm yet, but the latest commit should be there if you clone again.

ClimbsRocks commented 8 years ago

@eanie: I'd love any other feedback you have as you work with this!

DanielAndreasen commented 8 years ago

Okay, did a regular git pull followed by npm install (don't know if that is necessary).

This time it went further, but here is the output of the run

thanks for inviting us along on your machine learning journey!

heard an error!
{ [Error: AttributeError: 'module' object has no attribute 'MaxAbsScaler']
  traceback: 'Traceback (most recent call last):\n  File "/home/daniel/GIT/machineJS/node_modules/data-formatter/mainPythonProcess.py", line 27, in <module>\n    from helperFunctions import minMax\n  File "/home/daniel/GIT/machineJS/node_modules/data-formatter/helperFunctions/minMax.py", line 8, in <module>\n    max_abs_scaler = preprocessing.MaxAbsScaler()\nAttributeError: \'module\' object has no attribute \'MaxAbsScaler\'\n',
  executable: 'python',
  options: null,
  script: '/home/daniel/GIT/machineJS/node_modules/data-formatter/mainPythonProcess.py',
  args: [ '{"trainingData":"node_modules/data-for-tests/rossman/train.csv","testingData":"node_modules/data-for-tests/rossman/test.csv","trainingPrettyName":"rossmantrain","testingPrettyName":"rossmantest","joinFileName":"","on":false,"allFeatureCombinations":false,"keepAllFeatures":false,"outputFolder":"/home/daniel/GIT/machineJS/pySetup/data-formatterResults","test":false,"verbose":1,"join":false}' ],
  exitCode: 1 }
Here are the fileNames from data-formatter. If you want to skip the data-formatter part next time you want to play with this dataset, copy and paste this object into machineJS/pySetup/testingFileNames.js, following the instructions included in that file.
{}
{ [Error: KeyError: 'X_train']
  traceback: 'Traceback (most recent call last):\n  File "/home/daniel/GIT/machineJS/pySetup/splitDatasets.py", line 17, in <module>\n    XFileName = fileNames[\'X_train\']\nKeyError: \'X_train\'\n',
  executable: 'python',
  options: null,
  script: '/home/daniel/GIT/machineJS/pySetup/splitDatasets.py',
  args: 
   [ '/home/daniel/GIT/machineJS/ignoreMe.csv',
     '{"_":["/home/daniel/GIT/machineJS/machineJS.js","node_modules/data-for-tests/rossman/train.csv"],"predict":"node_modules/data-for-tests/rossman/test.csv","dev":false,"computerTotalCPUs":4,"machineJSLocation":"/home/daniel/GIT/machineJS","dataFile":"node_modules/data-for-tests/rossman/train.csv","dataFileName":"train.csv","dataFilePretty":"train","binaryOutput":false,"outputFileName":"rossmantrain","join":"","on":"","allFeatureCombinations":"","keepAllFeatures":"","dfOutputFolder":"/home/daniel/GIT/machineJS/pySetup/data-formatterResults","matrixOutput":"","testFileName":"test.csv","testFilePretty":"test","testOutputFileName":"rossmantest","searchPercent":0.3,"validationPercent":0.3,"numRounds":10,"numIterationsPerRound":10,"predictionsFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest","validationFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest/validation","bestClassifiersFolder":"/home/daniel/GIT/machineJS/pySetup/bestClassifiers/rossmantrain","ensemblerOutputFolder":"/home/daniel/GIT/machineJS","validationRound":false,"ensemblerArgs":{"inputFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest","outputFolder":"/home/daniel/GIT/machineJS","validationFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest/validation","fileNameIdentifier":"rossmantrain","validationRound":true},"numCPUs":3,"longTrainThreshold":0.97,"continueToTrainThreshold":0.97,"alreadyFormatted":false}',
     '{}' ],
  exitCode: 1 }
{ [Error: ImportError: cannot import name MLPClassifier]
  traceback: 'Traceback (most recent call last):\n  File "/home/daniel/GIT/machineJS/pySetup/training.py", line 28, in <module>\n    from makeClassifiers import makeClassifiers\n  File "/home/daniel/GIT/machineJS/pySetup/makeClassifiers.py", line 9, in <module>\n    from sklearn.neural_network import MLPClassifier\nImportError: cannot import name MLPClassifier\n',
  executable: 'python',
  options: null,
  script: '/home/daniel/GIT/machineJS/pySetup/training.py',
  args: 
   [ '/home/daniel/GIT/machineJS/node_modules/data-for-tests/rossman/train.csv',
     '{"_":["/home/daniel/GIT/machineJS/machineJS.js","node_modules/data-for-tests/rossman/train.csv"],"predict":"node_modules/data-for-tests/rossman/test.csv","dev":false,"computerTotalCPUs":4,"machineJSLocation":"/home/daniel/GIT/machineJS","dataFile":"node_modules/data-for-tests/rossman/train.csv","dataFileName":"train.csv","dataFilePretty":"train","binaryOutput":false,"outputFileName":"rossmantrain","join":"","on":"","allFeatureCombinations":"","keepAllFeatures":"","dfOutputFolder":"/home/daniel/GIT/machineJS/pySetup/data-formatterResults","matrixOutput":"","testFileName":"test.csv","testFilePretty":"test","testOutputFileName":"rossmantest","searchPercent":0.3,"validationPercent":0.3,"numRounds":10,"numIterationsPerRound":10,"predictionsFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest","validationFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest/validation","bestClassifiersFolder":"/home/daniel/GIT/machineJS/pySetup/bestClassifiers/rossmantrain","ensemblerOutputFolder":"/home/daniel/GIT/machineJS","validationRound":false,"ensemblerArgs":{"inputFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest","outputFolder":"/home/daniel/GIT/machineJS","validationFolder":"/home/daniel/GIT/machineJS/predictions/rossmantest/validation","fileNameIdentifier":"rossmantrain","validationRound":true},"numCPUs":3,"longTrainThreshold":0.97,"continueToTrainThreshold":0.97,"alreadyFormatted":false}',
     '{}',
     'clXGBoost',
     undefined,
     0 ],
  exitCode: 1 }
kicking off the process of making predictions on the predicting data set for: clXGBoost
we heard an unexpected shutdown event that is causing everything to close
/home/daniel/GIT/machineJS/shutDown.js:19
      throw error;
            ^
TypeError: Cannot read property 'longTrainScore' of undefined
    at startPredictionsScript (/home/daniel/GIT/machineJS/pySetup/utils.js:129:58)
    at Object.module.exports.makePredictions (/home/daniel/GIT/machineJS/pySetup/utils.js:144:5)
    at Object.module.exports.makePredictions (/home/daniel/GIT/machineJS/pySetup/controllerPython.js:142:11)
    at /home/daniel/GIT/machineJS/pySetup/controllerPython.js:32:24
    at emitFinishedTrainingCallback (/home/daniel/GIT/machineJS/pySetup/utils.js:87:7)
    at /home/daniel/GIT/machineJS/pySetup/utilsPyShell.js:60:7
    at null._endCallback (/home/daniel/GIT/machineJS/node_modules/python-shell/index.js:148:25)
    at ChildProcess.<anonymous> (/home/daniel/GIT/machineJS/node_modules/python-shell/index.js:99:35)
    at ChildProcess.emit (events.js:110:17)
    at Process.ChildProcess._handle.onexit (child_process.js:1074:12)

ClimbsRocks commented 8 years ago

Yay, progress! Did you run ./installPythonDependencies.sh? It looks like the machine isn't recognizing the MaxAbsScaler module we're importing from scikit-learn.

eanie commented 8 years ago

Same thing I had just now. So yes, progress

On 29 Feb 2016, at 17:11, Preston Parry notifications@github.com wrote:

Yay, progress! Did you run ./installPythonDependencies.sh? It looks like the machine isn't recognizing the MaxAbsScaler module we're importing from scikit-learn.

DanielAndreasen commented 8 years ago

Okay, it is running now, and I'm pretty sure I know what the problem was. In your script for installing dependencies, scikit-learn is installed as super-user, but I have an anaconda environment installed, with another scikit-learn inside anaconda. When you use sudo python, you get the system python, not the anaconda python. So, with your script, the right scikit-learn is installed for the system python, but when I run machineJS, it uses anaconda's python. I had this problem before, as you may have guessed.

So, to resolve this, I simply installed scikit-learn myself.

git clone https://github.com/scikit-learn/scikit-learn
cd scikit-learn
python setup.py build
python setup.py install  # Note, without sudo in front!

I guess you have something similar @eanie.
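For anyone hitting the same mismatch, a quick way to see which interpreter and which scikit-learn the plain `python` command resolves to is a short script like the one below. This is only a sketch and not part of machineJS; the helper name `describe_module` is made up for illustration. Run it with the same `python` that machineJS launches (the logs above show `executable: 'python'`).

```python
import importlib
import sys


def describe_module(name):
    """Return (version, file) for an importable module, or None if it is missing.

    Hypothetical helper for illustration; not part of machineJS.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "unknown")
    return version, location


# The interpreter that is actually running this script.
print("interpreter:", sys.executable)

info = describe_module("sklearn")
if info is None:
    print("scikit-learn is not installed for this interpreter")
else:
    print("scikit-learn version:", info[0])
    print("scikit-learn location:", info[1])
```

If the reported location is under the system site-packages while you expected the anaconda one (or the other way around), the install script and machineJS are probably using different Pythons.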

DanielAndreasen commented 8 years ago

Just to conclude, I let it run (it takes quite some time, so I ran it overnight and I'm not sure how long it actually took), and it finished.

I think it will be better with a requirements.txt for installing dependencies: pip install -r requirements.txt. You can see examples of this in all major Python projects.

eanie commented 8 years ago

Is there an update on this? I'm not using anaconda, and have re-pulled, and getting the following:

{ [Error: ImportError: cannot import name MLPClassifier]
  traceback: 'Traceback (most recent call last):\n File "/Users/justin/DeepLearning/machineJS/pySetup/training.py", line 28, in <module>\n from makeClassifiers import makeClassifiers\n File "/Users/justin/DeepLearning/machineJS/pySetup/makeClassifiers.py", line 9, in <module>\n from sklearn.neural_network import MLPClassifier\nImportError: cannot import name MLPClassifier\n',
  executable: 'python',
  options: null,
  script: '/Users/justin/DeepLearning/machineJS/pySetup/training.py',

TypeError: Cannot read property 'longTrainScore' of undefined

DanielAndreasen commented 8 years ago

@eanie how did you install scikit-learn? Some of the problems for me were solved when I cloned their repository (see above). You might have to use sudo python setup.py install in the last step if you don't use anaconda.

eanie commented 8 years ago

Wow, total newbie error! I should have checked the version of scikit-learn that was in the system location. It was way too old. Running my first test now. Thanks for all the help on this, guys.

DanielAndreasen commented 8 years ago

I think the test will take a long time! But you are welcome :)

ClimbsRocks commented 8 years ago

Thanks for all the help here everyone!

Just to wrap a few things up: Yes, we are importing the MLP from the current development version of scikit-learn, so it needs to be installed from GitHub until they release v0.18 in the next few months.
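As a rough pre-flight check (a sketch, not something machineJS ships), the two names that failed in the logs above, `preprocessing.MaxAbsScaler` and `neural_network.MLPClassifier`, can be probed directly before starting a long run. The function name `missing_features` and the `REQUIRED` list are illustrative assumptions; the real project may need other classes as well.

```python
import importlib

# Names taken from the tracebacks in this thread; the list is an assumption
# and may not cover everything machineJS imports.
REQUIRED = [
    ("sklearn.preprocessing", "MaxAbsScaler"),
    ("sklearn.neural_network", "MLPClassifier"),
]


def missing_features(required):
    """Return 'module.attr' strings for every required name that cannot be found."""
    missing = []
    for module_name, attr in required:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The whole module is unavailable (e.g. scikit-learn not installed).
            missing.append(module_name + "." + attr)
            continue
        if not hasattr(module, attr):
            # The module imports, but this scikit-learn is too old for the class.
            missing.append(module_name + "." + attr)
    return missing


problems = missing_features(REQUIRED)
if problems:
    print("missing from this Python's scikit-learn:", ", ".join(problems))
else:
    print("all required scikit-learn classes are available")
```

If anything is reported as missing, the scikit-learn visible to this interpreter is too old or absent, and installing the development version as described above is the likely fix.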

@DanielAndreasen: I will be redoing the install process soon based on your experience and recommendations. You've been very helpful, thanks!

This is designed to take a while to run. It's searching hundreds of combinations of hyperparameters for a half dozen or so different algorithms, then using more machine learning at the end to make sense of all these results.

If you want to make this run more quickly, bump down numRounds and numIterationsPerRound in processArgs.js.

Another thing you can do to make it run more quickly is to save the results of data-formatter and reuse them, unless you've introduced new feature engineering. Instructions for how to do this are included in https://github.com/ClimbsRocks/machineJS/blob/master/pySetup/testingFileNames.js

Please keep the feedback coming! This is incredibly helpful, and it's fun for me to have some easy things to work on to give everyone a better experience using this library.

Thanks! preston