ziyueyang37 closed this 3 months ago
Phosphorus is not included in the default LibInvent prior. Here is the token alphabet for reference:
Thanks! Is there a way to add P as an allowed element? Do we have to retrain the model, and if so, how much data is needed?
If you need additional elements/tokens, you would need to train a new model on source data that contains relevant examples. There is no simple recipe for how many examples are needed, but P compounds are typically not that abundant, maybe 1% in ChEMBL.
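To judge whether a candidate training set has enough phosphorus examples, it may help to count them first. The following is a minimal sketch, not part of LibInvent or REINVENT: the function names `contains_phosphorus` and `phosphorus_fraction` are hypothetical, and the check is a regex heuristic on the SMILES string rather than a full parser (a cheminformatics toolkit such as RDKit would be more robust). It assumes standard SMILES, where a bare `P` or `p` outside brackets is always phosphorus and bracket atoms start with an optional isotope followed by the element symbol.

```python
import re

# Element symbol at the start of a bracket atom, after an optional isotope.
# Two-letter aromatic symbols (se, as) are tried before the generic patterns.
BRACKET_ATOM = re.compile(r"\[\d*(se|as|[A-Z][a-z]?|[bcnops])[^\]]*\]")
ANY_BRACKET = re.compile(r"\[[^\]]*\]")


def contains_phosphorus(smiles: str) -> bool:
    """Heuristically report whether a SMILES string contains a P atom."""
    # Bracket atoms: compare the full symbol so Pt, Pd, Pb etc. do not match.
    for match in BRACKET_ATOM.finditer(smiles):
        if match.group(1) in ("P", "p"):
            return True
    # Outside brackets only the organic subset is allowed, so any remaining
    # 'P' (aliphatic) or 'p' (aromatic) is phosphorus.
    outside = ANY_BRACKET.sub("", smiles)
    return "P" in outside or "p" in outside


def phosphorus_fraction(smiles_list: list[str]) -> float:
    """Fraction of SMILES in the list that contain phosphorus."""
    if not smiles_list:
        return 0.0
    hits = sum(contains_phosphorus(s) for s in smiles_list)
    return hits / len(smiles_list)


if __name__ == "__main__":
    # Illustrative molecules only, not taken from any real dataset.
    sample = ["CCO", "OP(=O)(O)O", "Cl[Pt](Cl)(N)N", "c1ccccc1"]
    print(f"{phosphorus_fraction(sample):.1%} of the sample contains P")
```

The same check can be run on generated output to confirm that a retrained model actually emits phosphorus-containing molecules.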
Hi team,
I used LibInvent to generate thousands of SMILES but didn't see any that contain phosphorus. Was P excluded from the allowed elements, or was it missing from the training set?
Best