Open muammar opened 3 months ago
Good catch. Can you check whether all your molecules parse correctly without any issues? (Use the datamol to_mol function for that.)
Ideally, you should filter out all invalid molecules because of how the Lilly code handles them.
They all parse without issues. I used this code:
Thanks for your fast reply 😄
dm.to_mol can return None. Can you check whether any of the molecules is None? It also helps to standardize the list of molecules so that the SMILES are canonical.
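A minimal sketch of that check, in case it helps. The helper name `find_unparsed` is hypothetical and not part of datamol; the parser is passed in as an argument so that `dm.to_mol` (or any other function that returns None on failure) can be plugged in.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_unparsed(
    inputs: Iterable[str],
    parse: Callable[[str], Optional[object]],
) -> List[Tuple[int, str]]:
    """Return (position, input) for every entry that `parse` maps to None."""
    failures = []
    for position, text in enumerate(inputs):
        try:
            mol = parse(text)
        except Exception:
            # Treat a parser that raises the same way as one that returns None.
            mol = None
        if mol is None:
            failures.append((position, text))
    return failures


# Intended use (requires datamol, so it is left commented out here;
# `smiles_list` is an assumed list of SMILES strings):
# import datamol as dm
# bad = find_unparsed(smiles_list, dm.to_mol)
```

An empty result means every input parsed; otherwise the positions point at the rows to drop before running the filters.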
I passed the Mol objects to rdMolStandardize.Cleanup and created a pandas Series to count all NaNs. There are zero. Let me know if you need more information.
Thanks.
Ok, this is indeed weird. If you are able to share your SDF for me to debug, that would be nice. Otherwise if you have an alternative SDF, it will be helpful.
Thank you for your fast responses. I will check if I can share the SDF file for you to debug. I don't have another SDF that could be used to reproduce this error.
@muammar, any updates on this to share ?
I had to use the Ruby implementation of the rules to keep the ball rolling. The library presenting the problem is the Maybridge HitCreator. I'm unsure whether I can share the SDF, but they have a request form. Thank you for your fast responses, @maclandrol
I'm processing an SDF file that fails with the following error:
Based on the traceback, some molecules are not returning results when computed with the parallel backend. Because there is a mismatch between the original pd.DataFrame and the one with results, pandas cannot proceed. I will try to understand which molecules are failing, but I think it would be good if the library could at least catch the error and populate the result with np.nan. Would you have any suggestions?
Best,
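To illustrate the suggestion above, here is a rough sketch of a parallel map that keeps one output per input and fills failures with NaN. This is not the library's actual code: `map_with_nan` is a hypothetical name, and a standard-library thread pool stands in for whatever parallel backend the library uses.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


def map_with_nan(
    func: Callable[[Any], Any],
    items: Sequence[Any],
    max_workers: int = 4,
) -> List[Any]:
    """Apply `func` to each item in parallel, returning NaN where it fails."""

    def safe(item: Any) -> Any:
        try:
            result = func(item)
        except Exception:
            # A failing item yields NaN instead of aborting the whole batch.
            return math.nan
        # A missing result is also recorded as NaN so no row is dropped.
        return math.nan if result is None else result

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so outputs align with inputs.
        return list(pool.map(safe, items))
```

Because the output always has the same length and order as the input, it can be assigned back as a column of the original DataFrame, and the NaN rows show which molecules failed.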