Ramprasad-Group / canonicalize_psmiles

Tool for the canonicalization of Polymer SMILES (PSMILES) strings

Different operating systems result in different psmiles #8

Open jdkern11 opened 1 year ago

jdkern11 commented 1 year ago

Of the 14299 polymers tested, 80 get a different canonical psmiles depending on which operating system is used. If you clone the package on macOS and run the test, you will get different results for these psmiles than on Linux. The attached file contains a script that tests a list of SMILES (non_canonical_smiles) against their canonical versions for each operating system (Linux vs. macOS).

I have already checked whether different RDKit or Python versions are the cause. With the same versions of Python and RDKit on different operating systems, I still get different results. I've tested RDKit versions 2022.3.5 and 2023.3.2, and Python 3.9 to 3.11. It seems to be exclusively an OS issue, not an RDKit or Python version issue.
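For anyone who wants to reproduce this without the attachment, here is a minimal sketch of how the results could be recorded together with the three factors mentioned above (OS, Python version, RDKit version). It is not the attached script. The names `environment_info` and `record_results` are hypothetical, and the canonicalizer is passed in as a function so that the sketch does not depend on a particular import path.

```python
import json
import platform
from importlib import metadata
from typing import Callable, Dict, Iterable


def environment_info() -> Dict[str, str]:
    """Collect the OS, Python version, and RDKit version of this machine."""
    try:
        # Assumption: RDKit was installed under the distribution name "rdkit".
        rdkit_version = metadata.version("rdkit")
    except metadata.PackageNotFoundError:
        rdkit_version = "not found"
    return {
        "os": platform.system(),  # e.g. "Linux", "Darwin" (macOS), "Windows"
        "python": platform.python_version(),
        "rdkit": rdkit_version,
    }


def record_results(
    smiles_list: Iterable[str],
    canonicalize_fn: Callable[[str], str],
) -> Dict[str, object]:
    """Canonicalize every input and store the outputs next to the environment."""
    return {
        "environment": environment_info(),
        "canonical": {s: canonicalize_fn(s) for s in smiles_list},
    }


def save_results(results: Dict[str, object], path: str) -> None:
    """Write the record as JSON so files from different machines can be diffed."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2, sort_keys=True)
```

Running `record_results(non_canonical_smiles, canonicalize)` on each machine, where `canonicalize` is the package's canonicalization function, and saving the output with `save_results` should give one JSON file per OS that can be compared directly.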

mac_linux_discrepancy.zip

kuelumbus commented 1 year ago

We can reproduce and test this by running the canonicalization on GitHub Actions, because it has Linux, macOS, and Windows runners.
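A rough sketch of the comparison step such a CI run could end with, assuming each runner produces a mapping from input SMILES to canonical psmiles. The function name `find_discrepancies` and the input layout are assumptions for illustration, not part of the package.

```python
from typing import Dict, Optional


def find_discrepancies(
    results_by_os: Dict[str, Dict[str, str]],
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return every input whose canonical output is not identical on all OSes.

    results_by_os maps an OS name to {input SMILES: canonical psmiles}.
    An input missing from one OS is reported with None for that OS.
    """
    # Union of all inputs, so an entry missing on one OS is also flagged.
    all_inputs = set()
    for per_os in results_by_os.values():
        all_inputs.update(per_os)

    mismatches: Dict[str, Dict[str, Optional[str]]] = {}
    for smiles in sorted(all_inputs):
        outputs = {name: per_os.get(smiles) for name, per_os in results_by_os.items()}
        # More than one distinct value means the OSes disagree on this input.
        if len(set(outputs.values())) > 1:
            mismatches[smiles] = outputs
    return mismatches
```

A test could then simply assert that `find_discrepancies(...)` returns an empty dict, which would fail on the polymers affected by this issue and print which OS produced which string.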