Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.

Why no rdkit2d for Neural Nets and KNN? #10

Closed: Dlux804 closed this issue 4 years ago

Dlux804 commented 4 years ago

@qle2 Why do you remove the rdkit descriptors for two models? https://github.com/Dlux804/McQuade-Chem-ML/blob/404a1fbeac769576813e50b7e549f6adffa52567/Mark-VIII-csv.py#L67-L73

Dlux804 commented 4 years ago

It's because they need to be normalized, right?

Dlux804 commented 4 years ago

@qle2 Please confirm my response.

qle2 commented 4 years ago

@Dlux804 That's correct. KNN also needs normalized data. I tested the other normalization and scaling methods available in scikit-learn on all 3 datasets for the NN, but they gave either worse or the same results as rdkit2dnormalized.
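
For anyone reading this later, here is a rough, standard-library-only sketch of why a distance-based model like KNN cares about feature scale. It is not code from this repo: the descriptor values, the helper names (`fit_min_max`, `apply_min_max`, `nearest_index`), and the choice of min-max scaling are all assumptions made for illustration, and min-max scaling is only a stand-in, not the method behind rdkit2dnormalized.

```python
import math

def fit_min_max(rows):
    """Return (min, span) per column, learned from the training rows only."""
    cols = list(zip(*rows))
    return [(min(c), max(c) - min(c)) for c in cols]

def apply_min_max(row, params):
    """Rescale one row with the fitted (min, span) pairs; constant columns map to 0.0."""
    return tuple((v - lo) / span if span else 0.0 for v, (lo, span) in zip(row, params))

def nearest_index(points, query):
    """Index of the point closest to query by Euclidean distance (1-nearest neighbour)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], query))

# Hypothetical descriptors: (molecular weight, fraction of sp3 carbons).
# The first column spans hundreds of units, the second stays within [0, 1].
train = [(150.0, 0.10), (155.0, 0.90), (400.0, 0.12)]
query = (158.0, 0.11)

# Unscaled: molecular weight dominates the distance, so the sp3 fraction is ignored.
raw_nearest = nearest_index(train, query)

# Scaled: both columns contribute on a comparable range.
params = fit_min_max(train)
scaled_train = [apply_min_max(row, params) for row in train]
scaled_query = apply_min_max(query, params)
scaled_nearest = nearest_index(scaled_train, scaled_query)

print(raw_nearest, scaled_nearest)
```

On this toy data the unscaled search picks the second molecule (closest in weight but with a very different sp3 fraction), while the scaled search picks the first, which is close on both descriptors. Tree-based models split on one feature at a time, so they are much less sensitive to this.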