Finetuning on a custom dataset with the Huggingface MoLFormer model?

Hi! I have been trying to use the MoLFormer model from Hugging Face in a cancer drug response prediction model, but it seems to underperform compared with MegaMolBART and ChemBERTa. Is there a way to finetune the Hugging Face model on a custom SMILES dataset? My guess is that the relevant molecules (oncology drugs and related scaffolds) were underrepresented in the MoLFormer training data, which would explain the weaker results. Could you please help me figure out the workflow for finetuning the Hugging Face MoLFormer checkpoint on a custom SMILES dataset?

IBM / molformer
