swansonk14 / SyntheMol

Combinatorial antibiotic generation
MIT License

no use_gpu option in the function of chemprop_predict #5

Closed by jklyu0824 6 months ago

jklyu0824 commented 6 months ago

I really appreciate that the authors made everything available to reproduce the results in this great paper. I tried to follow the steps in https://github.com/swansonk14/SyntheMol/blob/main/docs/antibiotics.md, but I got the error below when running:

    python scripts/models/train.py \
        --data_path data/Data/1_training_data/antibiotics.csv \
        --save_dir data/Models/antibiotic_chemprop \
        --dataset_type classification \
        --model_type chemprop \
        --property_column antibiotic_activity \
        --num_models 10

TypeError: chemprop_predict() got an unexpected keyword argument 'use_gpu'

Is there an updated version of the function below that enables GPU prediction? Or does prediction not need a GPU?

    def chemprop_predict(
        model: MoleculeModel,
        smiles: list[str],
        fingerprints: np.ndarray | None = None,
        num_workers: int = 0,
    ) -> np.ndarray:

swansonk14 commented 6 months ago

Hi @jklyu0824,

Thank you for bringing up this issue! I made a fix to remove the use_gpu flag during prediction since the script I wrote for prediction can't directly use a GPU (https://github.com/swansonk14/SyntheMol/pull/6).
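As a rough illustration of why the error appears, here is a minimal sketch using a stand-in function. `chemprop_predict_stub` and `call_with_supported_kwargs` are hypothetical names for this example, not part of SyntheMol; the stub only copies the parameter list from the signature quoted above. The actual fix simply removes the `use_gpu` flag at the call site. The filtering helper is just one defensive alternative.

```python
import inspect


def chemprop_predict_stub(model, smiles, fingerprints=None, num_workers=0):
    """Hypothetical stand-in: same parameters as the quoted signature, dummy output."""
    return [0.0 for _ in smiles]


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping any keyword arguments it does not declare."""
    params = inspect.signature(func).parameters
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_var_kwargs:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **kwargs)


# Passing use_gpu directly raises the same kind of TypeError as in the issue.
try:
    chemprop_predict_stub(None, ["CCO"], use_gpu=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Filtering out the unsupported keyword lets the call go through.
preds = call_with_supported_kwargs(
    chemprop_predict_stub, None, ["CCO", "c1ccccc1"], use_gpu=True
)
```

Silently dropping arguments can hide real mistakes, so removing the flag at the call site (as the fix does) is usually the cleaner choice.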

Prediction is relatively fast, so it's okay to run it fully on CPU. For reproducibility, I used CPU for both training and prediction since it avoids some of the inherent randomness occasionally present in GPU operations.

However, if you're interested in training your own new models, I would recommend using the chemprop_train and chemprop_predict command line tools as shown in the README (https://github.com/swansonk14/SyntheMol?tab=readme-ov-file#train-model). Those command line tools automatically use a GPU (if available) for both training and prediction and therefore tend to be quite a bit faster than training via the scripts/models/train.py script.
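A minimal sketch of assembling a `chemprop_train` call from Python, assuming Chemprop is installed and its command line tools are on the PATH. The flags shown (`--data_path`, `--dataset_type`, `--save_dir`) are assumed from Chemprop's command line interface and should be checked against the README, which may pass more options. The helper name and the save directory are hypothetical.

```python
import shlex
import shutil


def build_chemprop_train_command(data_path, save_dir, dataset_type="classification"):
    """Return the argument list for a chemprop_train call (hypothetical helper)."""
    return [
        "chemprop_train",
        "--data_path", data_path,
        "--dataset_type", dataset_type,
        "--save_dir", save_dir,
    ]


command = build_chemprop_train_command(
    data_path="data/Data/1_training_data/antibiotics.csv",
    save_dir="data/Models/antibiotic_chemprop_cli",  # hypothetical output directory
)

# Print the command for inspection instead of running it.
print(shlex.join(command))

if shutil.which("chemprop_train") is None:
    print("chemprop_train was not found on PATH; install Chemprop before running it.")
```

To actually launch training, the list can be passed to `subprocess.run(command, check=True)`.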

Best, Kyle