Hi @concon23,
Thank you for your interest in our work! I have been pushing for this (at least twice a month), but unfortunately the process is very slow, especially now that I have finished my internship at NVR. I can't guarantee when the dataset/code will be released, but I promise they will be released once MoleculeSTM is officially published. (I have them ready for release; I just need the approval.)
Regards, Shengchao
Hi, this is great work. When will the PubChemSTM dataset and the code for pretraining and the downstream tasks be available? Do you have a timetable now? Thanks so much!
Hi @chao1224, thank you for your reply! I understand your situation. I thought I had replied to you, but my comment seems to have disappeared. Really sorry if my response did not reach you.
I have another question about the computational cost. How many GPUs did you use to pretrain MoleculeSTM? It seems you either trained all the parameters of f_t and p_t for the text encoder, or trained only the parameters of p_t for the ablation study.
For each setting, how much compute time is required, and on how many GPUs?
Thank you very much:)
Hi @concon23 @Greay83 , the codes are available now.
For MoleculeSTM pretraining, we used a single A100 GPU and about two days for 32 epochs.
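The two text-branch settings discussed above (train the full text encoder f_t plus its projection head p_t, or freeze f_t and train only p_t for the ablation) can be sketched as a parameter-selection step before building the optimizer. This is a minimal, framework-agnostic illustration, not the repository's actual code; the names f_t and p_t come from the thread, and the helper `trainable_text_params` and the toy parameter dict are hypothetical.

```python
# Hypothetical sketch of choosing which text-branch parameters to optimize.
# Setting train_full_encoder=True mirrors the main MoleculeSTM setting
# (update f_t and p_t); False mirrors the ablation (update only p_t).

def trainable_text_params(params, train_full_encoder):
    """Return the names of text-branch parameters to pass to the optimizer.

    params: dict mapping parameter name -> tensor (stand-ins here).
    train_full_encoder: True  -> train f_t and p_t (main setting)
                        False -> train only p_t    (ablation)
    """
    prefixes = ("f_t.", "p_t.") if train_full_encoder else ("p_t.",)
    return [name for name in params if name.startswith(prefixes)]

# Toy parameter dict standing in for the text encoder and projection head.
params = {
    "f_t.layer0.weight": None,
    "f_t.layer0.bias": None,
    "p_t.weight": None,
    "p_t.bias": None,
}

print(trainable_text_params(params, train_full_encoder=True))
print(trainable_text_params(params, train_full_encoder=False))
```

In a real PyTorch setup, the ablation would additionally set `requires_grad = False` on the frozen f_t parameters so no gradients are computed for them.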
@chao1224, thank you for letting us know! I would like to try it, and thank you as well for the information about the computational cost. Sincerely.
Hi, thank you for opening this repository. Your work is quite interesting! I would like to run your code for pretraining and the downstream tasks. Do you have any estimate of when your dataset and code will be publicly available?
Thank you in advance:)