sherwin97 / ML-Project----Predicting-solubility-

A personal project to predict solubility of given molecules using their molecular descriptors.

better modularisation #2

Closed linminhtoo closed 2 years ago

linminhtoo commented 2 years ago

https://github.com/sherwin97/ML-Project----Predicting-solubility-/blob/5f50a0998def4d6dfa5ad4b0201555ede0841335/data.py#L69

It is not good to keep the processed data alive in memory like this. You should save it to disk, at an output path specified by the user.

Then, in another script, you can load it from that output path (provided by the user as an argument). This also lets you inspect the processed output CSV and analyse / debug it.

This way, every time you run train, it won't have to reprocess the data from scratch either.
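
A minimal sketch of that split, using only the standard library. The function names, the `--output_path` flag, and the list-of-dicts row format are assumptions for illustration, not what data.py actually contains; the same pattern applies with pandas `DataFrame.to_csv` and `read_csv`.

```python
import argparse
import csv
from pathlib import Path


def parse_args(argv=None):
    """Read the user-specified output path (hypothetical flag name)."""
    parser = argparse.ArgumentParser(description="Process raw data and save it to disk.")
    parser.add_argument("--output_path", type=Path, required=True,
                        help="where to write the processed CSV")
    return parser.parse_args(argv)


def save_processed(rows, fieldnames, output_path):
    """Write processed rows (a list of dicts) to a CSV file at output_path."""
    output_path = Path(output_path)
    # Create the parent directory so a fresh output location works.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_processed(input_path):
    """Read the processed CSV back as a list of dicts (all values are strings)."""
    with Path(input_path).open(newline="") as f:
        return list(csv.DictReader(f))
```

The processing script would call `save_processed(...)` once, and the train script would only call `load_processed(...)` on the same path, so the two steps can be run and debugged separately.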


In the same vein, predict.py should load a trained model that has been saved to disk, which means your train script needs to save the model to disk.
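
As a rough sketch, the save/load pair could look like this with the standard library's `pickle`. The helper names and the `model_path` argument are hypothetical; for scikit-learn estimators `joblib.dump` / `joblib.load` are a common alternative.

```python
import pickle
from pathlib import Path


def save_model(model, model_path):
    """Called at the end of the train script: persist the fitted model."""
    model_path = Path(model_path)
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with model_path.open("wb") as f:
        pickle.dump(model, f)


def load_model(model_path):
    """Called at the start of predict.py: restore the trained model.

    Only unpickle files you trust, since pickle can execute code on load.
    """
    with Path(model_path).open("rb") as f:
        return pickle.load(f)
```

With this in place, predict.py never has to retrain: it takes the model path as an argument, calls `load_model(...)`, and runs predictions on the loaded object.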

linminhtoo commented 2 years ago

addressed by #4