HelloJocelynLu / t5chem

Transformer-based model for chemical reactions
MIT License

Has the model been pre-trained for the different tasks (product, reactant, ...)? #10

Closed. MJ-Zeng closed this issue 1 year ago.

MJ-Zeng commented 1 year ago

Hello, I'm confused about whether the subtasks in the paper need to be pre-trained. Do I need to bring in PubChem data and pre-train first?

HelloJocelynLu commented 1 year ago

Hi,

All T5Chem tasks can be trained directly from scratch. For the best performance, we pretrained the model on PubChem molecules and have the trained weights available here (yes, users do not need to worry about pretraining on their own; they can directly download a pretrained model).

Our experiments show that fine-tuning from these pretrained weights generally gives better performance. Another available model, USPTO_500_MT, is a multi-task model fine-tuned on the USPTO dataset and is ready to use without further training.
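In case it helps, here is a rough sketch of what using a downloaded checkpoint can look like. The local path, the `"Product:"` task prefix, the `vocab.pt` filename, and the `SimpleTokenizer` import are assumptions on my side; please double-check them against the README rather than treating this as the exact API.

```python
# Rough sketch only: load a downloaded T5Chem checkpoint and generate
# product SMILES for a reaction. Paths, the task prefix, and the tokenizer
# class name are assumptions -- verify against the t5chem README.
from transformers import T5ForConditionalGeneration
from t5chem import SimpleTokenizer  # assumed tokenizer class shipped with t5chem

model_dir = "models/USPTO_500_MT/"  # hypothetical path to the downloaded weights
model = T5ForConditionalGeneration.from_pretrained(model_dir)
tokenizer = SimpleTokenizer(vocab_file=model_dir + "vocab.pt")  # argument name assumed

# Forward (product) prediction: a task prefix followed by reactant SMILES.
rxn = "Product:CCO.CC(=O)O>>"
inputs = tokenizer(rxn, return_tensors="pt")
outputs = model.generate(
    input_ids=inputs["input_ids"],
    max_length=300,
    num_beams=5,
    num_return_sequences=5,  # keep the top-5 candidates
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```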

Hope it helps!

MJ-Zeng commented 1 year ago

Thanks, I got it.

I have a question: is the model's performance improved by introducing the PubChem data, or by the multi-task training?

If pretraining is not used, will the yield predictions exceed Yield-BERT (Buchwald-Hartwig C-N coupling)?

HelloJocelynLu commented 1 year ago

I think pretraining helps more on smaller datasets, while multi-tasking may work better on larger ones. For example, on USPTO_MT, pretraining does not bring much improvement on the full dataset but gives a significant improvement on the sample dataset (more experiments would be needed to support this statement, and I suspect it varies case by case; I would definitely try the pretrained model first). Without pretraining, yield prediction shows performance similar to Yield-BERT, once the standard deviation and experimental error are taken into account.
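Roughly the kind of comparison I mean by "taking the std into consideration": score repeated random splits and report the mean R² together with its spread. This is not our exact evaluation script; `predict_yields` below is just a placeholder for whichever model produces the predictions.

```python
# Illustrative only: mean +/- std of R^2 over repeated random splits,
# which is how yield-prediction models on the Buchwald-Hartwig set are
# commonly compared. `predict_yields` is a hypothetical stand-in.
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import ShuffleSplit

def evaluate(X, y, predict_yields, n_splits=10, seed=0):
    """X, y: numpy arrays of reactions and measured yields."""
    scores = []
    splitter = ShuffleSplit(n_splits=n_splits, test_size=0.3, random_state=seed)
    for train_idx, test_idx in splitter.split(X):
        y_pred = predict_yields(X[train_idx], y[train_idx], X[test_idx])
        scores.append(r2_score(y[test_idx], y_pred))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()
```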

MJ-Zeng commented 1 year ago

ok, thanks

MJ-Zeng commented 1 year ago

Excuse me again: for the USPTO_500_MT dataset, the multi-task training results on the test set do not seem to perform better than the single-task results? I'm not quite sure about this result.

[screenshot of test results attached]

HelloJocelynLu commented 1 year ago

I would suggest taking a look at Figure 4 in the manuscript. We find that the most effective multi-task training is to combine tasks of the same type; mixing different task types significantly reduces the quality of predicted SMILES, especially when more molecules are generated (i.e., top-k accuracy with k > 1).
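If it helps, top-k accuracy here just means the ground-truth product must appear among the k generated candidates. A rough sketch of how it is usually computed from beam-search outputs (the RDKit canonicalization and variable names are only illustrative, not our evaluation code):

```python
# Illustrative top-k accuracy: a prediction counts as correct if the true
# product appears among the first k candidates after SMILES canonicalization.
from rdkit import Chem

def canonical(smi):
    mol = Chem.MolFromSmiles(smi)
    return Chem.MolToSmiles(mol) if mol is not None else None

def top_k_accuracy(predictions, targets, k=5):
    """predictions: list of candidate lists (beam outputs); targets: list of SMILES."""
    hits = 0
    for cands, target in zip(predictions, targets):
        if canonical(target) in {canonical(c) for c in cands[:k]}:
            hits += 1
    return hits / len(targets)
```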

MJ-Zeng commented 1 year ago

OK, thanks.