ContinuumIO / elm

Phase I & part of Phase II of NASA SBIR - Parallel Machine Learning on Satellite Data
http://ensemble-learning-models.readthedocs.io

Documentation needs a flowchart overview of components #193

Open PeterDSteinberg opened 7 years ago

PeterDSteinberg commented 7 years ago

Feedback today was:

PeterDSteinberg commented 7 years ago

@gbrener This ML models flowchart from scikit-learn is a useful one we can adapt, perhaps separately from the flowchart you and I worked on over the last week. The scikit-learn flowchart shows how to choose an estimator based on the type of data (labeled or not) and the data size.

Also, paraphrasing the top section of this tutorial (the Machine Learning Problem Setting section) would help introduce Elm's range of ML options.
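As a starting point for adapting it, the top-level branching of the scikit-learn flowchart can be written down roughly as below. This is a simplified sketch from memory of that chart, not Elm code: the function name and the `target_kind` labels are hypothetical, and the finer branches (specific estimators, the larger sample-size cutoffs) are left out.

```python
def suggest_estimator_family(n_samples, has_labels, target_kind):
    """Return a coarse estimator family for a dataset.

    target_kind is a hypothetical label for what the user wants:
    "category", "quantity", or "explore".
    """
    # The chart's first question is whether there are more than 50 samples.
    if n_samples <= 50:
        return "get more data"
    # Predicting a category: labeled data means classification,
    # unlabeled data means clustering.
    if target_kind == "category":
        return "classification" if has_labels else "clustering"
    # Predicting a quantity means regression.
    if target_kind == "quantity":
        return "regression"
    # Just looking at the data means dimensionality reduction.
    return "dimensionality reduction"


if __name__ == "__main__":
    print(suggest_estimator_family(10_000, True, "category"))
    print(suggest_estimator_family(10_000, False, "category"))
```

An Elm version of the chart would presumably add branches for the satellite-data cases that scikit-learn does not cover.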

PeterDSteinberg commented 7 years ago

Let's also summarize these two JSON files of supported estimators and pipelines (updating them as necessary when this issue is worked on). I made them by grepping the pytest output to see which estimators/transformers passed on their own when run through fit/predict, and which combinations of transformer and estimator failed in 2-step pipelines.

This summary is related to the flowchart because part of our documentation approach should be to better explain the degree of scikit-learn support (by sklearn subpackage/class) and to cross-link the scikit-learn help for estimators/transformers with examples/docs in Elm and related tools.

supported_pipelines.txt supported_estimators.txt
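For anyone regenerating those lists, the kind of check described in this comment could look roughly like the sketch below. It uses plain scikit-learn rather than Elm's own pipeline classes, the small set of transformers and estimators is chosen only for illustration, and the JSON layout is an assumption, not the format of the attached files.

```python
import json

from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset standing in for real test data.
X, y = make_classification(n_samples=100, n_features=8, random_state=0)

# Illustrative candidates only; the real lists cover many more classes.
transformers = {"StandardScaler": StandardScaler, "PCA": PCA}
estimators = {"LogisticRegression": LogisticRegression, "KMeans": KMeans}


def passes(model):
    """Return True if the model gets through fit and predict without raising."""
    try:
        model.fit(X, y).predict(X)
        return True
    except Exception:
        return False


# Estimators run by themselves.
supported_estimators = {
    name: passes(cls()) for name, cls in estimators.items()
}

# Every transformer/estimator combination as a 2-step pipeline.
supported_pipelines = {
    f"{t_name}+{e_name}": passes(
        Pipeline([("transform", t_cls()), ("estimate", e_cls())])
    )
    for t_name, t_cls in transformers.items()
    for e_name, e_cls in estimators.items()
}

print(json.dumps(
    {"estimators": supported_estimators, "pipelines": supported_pipelines},
    indent=2,
))
```

Grouping the resulting keys by sklearn subpackage would give the per-subpackage view of support that the docs need.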