Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.
5 stars 1 forks source link

Return variable of function "featurization" in features.py #13

Closed qle2 closed 4 years ago

qle2 commented 4 years ago

@Dlux804 I noticed that you change one of the return variable in the function featurization in features.py from selected_feat to num_feat. Is there a reason for that change?

image

I'm asking this because one of my test depends on that variable to be a list of strings (selected_feat) instead of list of numbers (num_feat). I have tested models.py when the return variable is selected_feat and it works just fine.

Do you mind if we switch variable from num_feat to selected_feat?

Dlux804 commented 4 years ago

@qle2 I switched it to num_feat so that it would return the numerical index of the feature selection, i.e 0 for rdkit2d. The reason behind this is that it makes for nicer file names to have 0-4 instead of rdkit2d-morganfeaturecount in file names.

You can update your test function to look for a list of integers instead of a list of strings. It shouldn't make a difference in testing. If you understand, please update your test and close the issue.

qle2 commented 4 years ago

@Dlux804 One of the goal of my testing script for features.py is to see if it can drop the 'rdkit2d' column in the right circumstances. So I needed the actual string to see if 'rdkit2d' is still there. But I think I can figure out a way to work around it.

Dlux804 commented 4 years ago

@qle2 Has this been resolved? We discussed it in person I believe. If it is resolved, please close. If not, please explain what extra info you need.

qle2 commented 4 years ago

I believe everything has been resolved. Thanks for the reminder.