ClimbsRocks / machineJS

[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
https://github.com/ClimbsRocks/auto_ml
408 stars, 62 forks

modularization ideas #51

ClimbsRocks closed 9 years ago

ClimbsRocks commented 9 years ago
  1. data-formatter: format a dataset for neural networks
  2. best-brain: grid search for brain.js. just returns the best-performing neural network found, with no predictions or anything. would not include the extra training time, but would include instructions on how to warmStart the returned brain so you can control the extra training time yourself.
  3. automated-brain: combines these two to automate the entire process of making predictions against a dataset using a neural network. would add the prediction-making step.
  4. ensembler: takes in the prediction files from various other ml algos and ensembles them together in creative ways
  5. python-data-formatter: gets data ready for machine learning in python's scikit-learn library (summarizes it so the user can easily spot errors, runs the same transformations against the combined training/testing data set so both are binarized/normalized/whateverized in the same way, imputes missing values, etc.). ideally this would be flexible enough to format the data differently for different ml algos (maybe svms need normalization, while random forests don't)
  6. automated-machine-learning: run all the python classifiers, making predictions against the datasets and writing those to prediction files
  7. assembling all of this together to make a single master predictions file at the end from all these ml algos
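
a minimal sketch of the idea in item 5, assuming plain numeric rows with `None` marking missing values. the function name `format_columns` and the choice of mean imputation plus min-max scaling are illustrative assumptions, not machineJS's actual implementation. the point it shows is that the statistics are computed once over the combined training/testing rows, so both sets get transformed in the same way:

```python
from statistics import mean


def format_columns(train_rows, test_rows):
    """Impute and min-max scale numeric columns.

    Hypothetical helper: column statistics come from train + test combined,
    so both sets are transformed identically.
    """
    combined = train_rows + test_rows
    n_cols = len(combined[0])

    # One (mean, min, max) tuple per column, ignoring missing values.
    stats = []
    for col in range(n_cols):
        present = [row[col] for row in combined if row[col] is not None]
        stats.append((mean(present), min(present), max(present)))

    def transform(rows):
        out = []
        for row in rows:
            new_row = []
            for value, (col_mean, lo, hi) in zip(row, stats):
                if value is None:
                    value = col_mean  # impute with the combined column mean
                span = hi - lo
                # Constant columns carry no information; map them to 0.0.
                new_row.append((value - lo) / span if span else 0.0)
            out.append(new_row)
        return out

    return transform(train_rows), transform(test_rows)


train = [[1.0, 10.0], [3.0, None]]
test = [[5.0, 30.0]]
train_fmt, test_fmt = format_columns(train, test)
```

an algorithm-specific formatter could skip the scaling step for models that don't need it (for example random forests), which is the flexibility item 5 asks for.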

ClimbsRocks commented 9 years ago

the spirit of this issue has been addressed, and i've increased maintainability even further than expected by removing brain.js entirely.

data-formatter is its own module, ensembling is its own module, and ppComplete is its own module with all the controlling logic.
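
as a rough illustration of what a standalone ensembling module could look like, here is a sketch that averages predictions across several prediction files. everything specific in it is an assumption: the function name `ensemble_average`, the CSV layout with `id` and `prediction` columns, and plain averaging as the combining rule (the simplest baseline, not necessarily what the actual module does):

```python
import csv
from collections import defaultdict


def ensemble_average(prediction_paths, output_path):
    """Average per-row predictions across CSV files.

    Hypothetical format: each file has 'id' and 'prediction' columns.
    Returns a dict mapping id -> averaged prediction and writes it to output_path.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)

    # Accumulate a running sum and count per row id across all files.
    for path in prediction_paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["id"]] += float(row["prediction"])
                counts[row["id"]] += 1

    averaged = {row_id: totals[row_id] / counts[row_id] for row_id in totals}

    # Write the combined predictions in the same two-column layout.
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        for row_id, value in averaged.items():
            writer.writerow([row_id, value])

    return averaged
```

keeping this logic behind one function that only sees file paths is what lets it live in its own module: the controlling code just hands over whatever prediction files the other algorithms wrote.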