seyonechithrananda / bert-loves-chemistry

bert-loves-chemistry: a repository of HuggingFace models applied on chemical SMILES data for drug design, chemical modelling, etc.
MIT License

Question about chirality and model results from paper #59

Open kosonocky opened 1 year ago

kosonocky commented 1 year ago

Hi,

ChemBERTa was trained on achiral, canonicalized SMILES, as evidenced by the training dataset, which is achiral and canonicalized.

The MoleculeNet fine-tuning datasets in the paper contain chiral molecules. How was this addressed?

Did you strip the stereochemistry from the chiral molecules and canonicalize them to match the achiral training format? If so, how did you handle duplicates, since stereoisomers collapse to the same achiral string?
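To make the duplicate concern concrete, here is a minimal sketch of what I mean, assuming RDKit is used for canonicalization (I don't know what the paper actually used). The helper names `to_achiral_smiles` and `group_stereoisomers` are made up for illustration, and the three SMILES strings are just toy examples, not entries from MoleculeNet.

```python
from collections import defaultdict

from rdkit import Chem


def to_achiral_smiles(smiles):
    """Return a canonical SMILES with stereo information dropped.

    Hypothetical helper: returns None if RDKit cannot parse the input.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # isomericSmiles=False omits chirality and double-bond stereo markers.
    return Chem.MolToSmiles(mol, isomericSmiles=False)


def group_stereoisomers(smiles_list):
    """Group input SMILES by their achiral canonical form."""
    groups = defaultdict(list)
    for smi in smiles_list:
        key = to_achiral_smiles(smi)
        if key is not None:
            groups[key].append(smi)
    return dict(groups)


# Toy inputs: the two enantiomers of alanine, plus ethanol.
examples = ["C[C@H](N)C(=O)O", "C[C@@H](N)C(=O)O", "CCO"]
groups = group_stereoisomers(examples)

# Achiral keys that more than one input string maps to.
collisions = {k: v for k, v in groups.items() if len(v) > 1}
print(collisions)
```

If two stereoisomers in a fine-tuning set carry different labels, they would end up as the same input string with conflicting targets after this step, which is why I'm asking how duplicates were treated.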

Or did you just run inference on the chiral strings? This would certainly cause some problems.

I didn't see any mention of this in the paper.

Please let me know,
Clay