Hi concon23, pretraining on the full PubChem18 dataset should take around 2-5 days with a modest consumer GPU after preprocessing the data. You can follow the instructions in the reproduce section of the readme. Hope you manage, otherwise I'm happy to help.
Hi @phseidl, thank you for your kind reply. I understand now. That is very friendly to users with modest computational resources!
Sincerely:)
Hi @phseidl, sorry for asking another question.

```
python clamp/train.py --dataset=./data/fsmol --assay_mode=clip --split=FSMOL_split
```

Does the above command run the pretraining? Or does it run few-shot training or something else?
Thank you in advance:)
Hi @concon23,
this performs pretraining and evaluates it on zero-shot.
To run few-shot you can add `--support_set_size=k`, where `k` is the number of support samples you want.
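Putting the two variants side by side (a sketch based on the flags mentioned in this thread; the exact flag names come from the repo's `clamp/train.py` and may differ in newer versions):

```shell
# Pretraining + zero-shot evaluation on the FS-Mol split
python clamp/train.py --dataset=./data/fsmol --assay_mode=clip --split=FSMOL_split

# Same run, but evaluated few-shot with a support set of 8 samples per task
python clamp/train.py --dataset=./data/fsmol --assay_mode=clip --split=FSMOL_split \
    --support_set_size=8
```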
Best, Philipp
Hi, thank you for sharing your great work! I am interested in the concept of your paper and would like to try the pretraining described there. How can I pretrain using this repository? Another question is about computational resources: your paper reports 170 days in total over 800 runs. Does the pretraining require the same computational time? Is it possible to pretrain on a single GPU?
Thank you in advance:)