Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.

Add Data Scaling option/requirement #39

Open Dlux804 opened 4 years ago

Dlux804 commented 4 years ago

Is your feature request related to a problem? Please describe.

Some models, such as neural networks, need the data to be scaled. Currently there is a normalized RDKit feature option, but early attempts suggested it was not working well. It also offers little control over the scaling.

Describe the solution you'd like

Implement an option (or requirement) to scale data using scikit-learn's built-in scaling classes, such as StandardScaler(). Integrate it into the ml_model class and pipeline. Store the type of scaling used for export to the knowledge graph.
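
A rough sketch of what the scaling step could look like, to make the request concrete. The `scale_features` helper, the `SCALERS` registry, and the method names `"standard"` and `"minmax"` are hypothetical and not part of the current code base; only `StandardScaler` and `MinMaxScaler` are real scikit-learn classes. The scaler is fitted on the training split only and then applied to the test split, and the scaler's class name is returned so it could be stored for the knowledge graph export.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical registry mapping an option name to a scikit-learn scaler class.
SCALERS = {"standard": StandardScaler, "minmax": MinMaxScaler}


def scale_features(train_x, test_x, method="standard"):
    """Scale features and report which scaler was used.

    Returns (train_scaled, test_scaled, scaler_name). With method=None the
    data is returned unchanged and scaler_name is None.
    """
    if method is None:
        return train_x, test_x, None
    if method not in SCALERS:
        raise ValueError(f"Unknown scaling method: {method!r}")

    scaler = SCALERS[method]()
    # Fit on the training data only so test-set statistics do not leak in.
    train_scaled = scaler.fit_transform(train_x)
    test_scaled = scaler.transform(test_x)
    # The class name is a simple, serializable record of the scaling type.
    return train_scaled, test_scaled, type(scaler).__name__


if __name__ == "__main__":
    train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
    test = np.array([[2.0, 20.0]])
    train_s, test_s, name = scale_features(train, test, method="standard")
    print(name)
    print(train_s.mean(axis=0), train_s.std(axis=0))
```

In the ml_model class, the returned name could be kept as an attribute (for example a hypothetical `self.scaler_type`) next to the other run metadata, so the export step can read it without inspecting the fitted scaler.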