Reproducibility issues for single-step RT task

szarki9 commented 1 year ago

hi,

I'm trying to reproduce the results for single-step RT by fine-tuning the model pretrained on PubMed with the USPTO_50k dataset. I'm doing this with PyTorch Trainer (Seq2SeqTrainer), passing accuracy as the compute_metrics argument. However, I only get Top-1 accuracy = 17.45% on the USPTO_50k test set.

Is there a significant difference between your implementation and simply using PyTorch Trainer? I would assume that using Trainer should be sufficient to reproduce the baseline you reported for the single task.

Could you give a hint regarding hyperparameters? I'm using epochs = 100, as reported in the paper, and batch_size = 64. I experimented with different learning rates, but still made no significant progress towards the reported scores :(

I would appreciate any help! Kinga
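For reference, the top-1 accuracy mentioned above could be sketched as an exact string match between each predicted sequence and its target. This is only a minimal sketch under assumptions: `top1_accuracy` is a hypothetical helper, not part of t5chem or transformers, and the thread does not say whether the reported numbers involve any SMILES canonicalization before comparison. With Hugging Face's Seq2SeqTrainer, compute_metrics receives token IDs, so predictions and labels would need to be decoded to strings before a function like this can be applied.

```python
def top1_accuracy(predictions, targets):
    """Return the fraction of predictions that exactly match their target.

    Both arguments are lists of already-decoded strings (e.g. SMILES).
    Leading/trailing whitespace is ignored; no canonicalization is done,
    which is an assumption of this sketch.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return hits / len(targets)


# Toy example with made-up strings, not real model output.
example_predictions = ["CCO", "c1ccccc1", "CC(=O)O"]
example_targets = ["CCO", "c1ccccc1", "CC(=O)N"]
print(f"Top-1 accuracy: {top1_accuracy(example_predictions, example_targets):.2%}")
```

Comparing the value from such a helper against the number printed by the trainer on a handful of decoded examples is one way to check that the metric itself is not the source of the gap.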

HelloJocelynLu commented 1 year ago

Hi Kinga,

Glad to help! I am not sure which trainer you mean by "pytorch trainer" (this? The only one I can find is in the Hugging Face repo, named Seq2SeqTrainer...), so unfortunately it is hard for me to tell how my implementation differs from theirs. As a general suggestion, I would encourage you to look at the training curves, as well as the other values reported in TensorBoard, to see whether there is any weird behavior. Also, when you say RT, do you mean the single-step retrosynthesis task? Since t5chem natively supports this task, I guess an easier way to reproduce the results is to run t5chem with the original code (original trainer, etc.) before trying your own. I may need more information to find the cause, for example the versions of your installed packages (transformers, torchtext, etc.), the commands/code you used to train and test your model, your machine, etc.

Jocelyn
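The package versions requested above could be collected with a short script along these lines. It is a sketch using only the Python standard library (Python 3.8+); the distribution names in `PACKAGES` are assumptions and may differ from what is actually installed in a given environment.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version string."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Assumed distribution names; adjust to match your environment.
PACKAGES = ["t5chem", "transformers", "torch", "torchtext"]

print(f"Python {platform.python_version()} on {platform.platform()}")
for package, version in collect_versions(PACKAGES).items():
    print(f"{package}: {version}")
```

Pasting this output into the issue, together with the exact training and testing commands, covers most of the information asked for.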

HelloJocelynLu commented 1 year ago

Hi, any updates?

szarki9 commented 1 year ago

hi, yes, sorry for the late response! I ran your script and was able to reproduce your results :)

Thanks so much for your help!

HelloJocelynLu / t5chem, issue #15