chao1224 / MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
https://chao1224.github.io/MoleculeSTM

Code available? #1

Closed: ghost closed this issue 10 months ago

ghost commented 1 year ago

Hi, thank you for opening this repository. Your work is quite interesting! I would like to run your code for the pretraining and downstream tasks. Do you have an expected date for when the dataset and code will be publicly available?

Thank you in advance:)

chao1224 commented 1 year ago

Hi @concon23,

Thank you for your interest in our work! I have been pushing for this (at least twice a month), but unfortunately the process is very slow, especially now that I have finished my internship at NVR. I can't guarantee when the dataset and code will be released, but I promise they will be released once MoleculeSTM is officially published. (I have them ready for release and just need the approval.)

Regards, Shengchao

Greay83 commented 11 months ago

Hi, this is great work. When will the PubChemSTM dataset and the code for the pretraining and downstream tasks be available? Do you have a timetable now? Thanks so much!

ghost commented 11 months ago

Hi @chao1224, thank you for your reply! I understand your situation. I thought I had already replied to you, but my comment seems to have disappeared. I'm really sorry if my response did not reach you.

I have another question, about the computational cost. How many GPUs did you use to pretrain MoleculeSTM? It seems that for the text encoder you trained either all the parameters of f_t and p_t, or only the parameters of p_t in the ablation study.

For each setting, how much compute time was required, and on how many GPUs?

Thank you very much:)

chao1224 commented 10 months ago

Hi @concon23 @Greay83, the code is available now.

For MoleculeSTM pretraining, we used a single A100 GPU, and training took about two days for 32 epochs.
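
For anyone budgeting compute from these figures, here is a rough back-of-the-envelope estimate. It assumes that "about two days" means roughly 48 wall-clock hours, which is an approximation and not a measured value; the variable names are purely illustrative.

```python
# Rough estimate only. Assumption: "about two days" is taken as 48 hours.
TOTAL_HOURS = 2 * 24   # approximate wall-clock training time
EPOCHS = 32            # number of pretraining epochs reported in this thread
NUM_GPUS = 1           # a single A100, as reported in this thread

hours_per_epoch = TOTAL_HOURS / EPOCHS   # average wall-clock time per epoch
gpu_hours = TOTAL_HOURS * NUM_GPUS       # total GPU-hours for the whole run

print(f"~{hours_per_epoch:.1f} hours per epoch, ~{gpu_hours} GPU-hours in total")
```

Under that assumption, the run averages about 1.5 hours per epoch and about 48 GPU-hours overall. Actual times will vary with hardware, batch size, and data loading.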

ghost commented 10 months ago

@chao1224, thank you for letting me know! I would like to try it. Thank you also for the information about the computational resources. Sincerely.