Hi @ptrab,
Thanks a lot for the bug report.
STOUT was trained primarily on PubChem IUPAC names, so it sometimes makes mistakes like this. Also, STOUT cannot handle capital letters, because it was trained only on lowercase names. That is why you got such odd results.
We are looking into improving STOUT further by training on more examples. As we stated in our paper, we highly recommend using rule-based methods to translate SMILES to IUPAC names. You could also try OPSIN, which translates IUPAC names to SMILES.
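Until the model handles mixed case, one possible workaround is to lowercase a name before translating it. The sketch below assumes nothing about STOUT beyond a callable that takes a name and returns a SMILES string; `translate_lowercased` is a hypothetical helper, not part of STOUT.

```python
from typing import Callable


def translate_lowercased(name: str, translate: Callable[[str], str]) -> str:
    """Strip surrounding whitespace, lowercase the name, then call `translate`.

    `translate` is any name -> SMILES callable (an assumption of this sketch),
    for example the reverse translation used in this thread.
    Caveat: lowercasing also changes capitalized locants and stereodescriptors
    such as N- or (R), so this only helps for names without them.
    """
    return translate(name.strip().lower())
```

With STOUT installed, `translate_lowercased('Ammonia', STOUT.translate_reverse)` would send `'ammonia'` to the model. This only avoids the capitalization problem; it does not correct predictions that are already wrong for lowercase input.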
Kind regards, Kohulan
To better understand the underlying machinery: would it help to train the model on names with randomly mixed upper- and lower-case characters, so that it no longer depends on lower-case input and becomes more robust?
I think I have seen image-processing papers in which GANs were trained with artificial noise added to the input images, so that the model works on "real world" images and not only on perfect synthetic ones.
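To make the idea concrete, here is a minimal sketch of such a case-noise augmentation for (name, SMILES) training pairs. It is not how STOUT is trained; `randomize_case`, `augment_pairs`, and the parameters `p_upper`, `copies`, and `seed` are hypothetical names chosen for illustration.

```python
import random
from typing import List, Optional, Tuple


def randomize_case(name: str, p_upper: float = 0.3,
                   rng: Optional[random.Random] = None) -> str:
    """Return `name` with each character upper-cased with probability `p_upper`."""
    rng = rng or random.Random()
    return "".join(
        ch.upper() if rng.random() < p_upper else ch.lower() for ch in name
    )


def augment_pairs(pairs: List[Tuple[str, str]], copies: int = 2,
                  seed: int = 0) -> List[Tuple[str, str]]:
    """Keep every original pair and add `copies` case-perturbed variants of it.

    Only the input name is perturbed; the SMILES target stays unchanged, so the
    model sees several spellings that map to the same output.
    """
    rng = random.Random(seed)
    augmented = []
    for name, smiles in pairs:
        augmented.append((name, smiles))
        for _ in range(copies):
            augmented.append((randomize_case(name, rng=rng), smiles))
    return augmented
```

For example, `augment_pairs([('azane', 'N')], copies=3)` returns the original pair plus three variants whose names differ from `'azane'` only in capitalization.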
Hi,
Today, when I tried to generate the SMILES string for 'ammonia', I got '[NH2+]' back, which is certainly wrong.
>>> STOUT.translate_reverse('ammonia') '[NH2+]'
When I tried to convert 'Ammonia', I got back a mess of weird strings.
>>> STOUT.translate_reverse('Ammonia') '[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr].[Pr'
I also tried the systematic name.
>>> STOUT.translate_reverse('azane') 'N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.N.'
>>> STOUT.translate_reverse('Azane') '[15NH3]'
I'm not sure if this is intended and I guess the error is on my side, but could you please have a look? :)
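For anyone who wants to catch outputs like the ones above automatically, here is a rough sketch of a sanity check that flags SMILES strings made of one fragment repeated many times. `looks_degenerate` and the `max_repeats` threshold are my own assumptions, not part of STOUT, and the check only catches repetition: a well-formed but chemically wrong answer such as '[NH2+]' or '[15NH3]' passes it.

```python
from collections import Counter


def looks_degenerate(smiles: str, max_repeats: int = 5) -> bool:
    """Return True if the SMILES is empty or repeats one fragment too often.

    Fragments are the dot-separated parts of the string. The threshold is
    arbitrary and would need tuning for real data.
    """
    fragments = [frag for frag in smiles.split(".") if frag]
    if not fragments:
        return True
    most_common_count = Counter(fragments).most_common(1)[0][1]
    return most_common_count > max_repeats
```

A long run of 'N.' or '[Pr].' fragments is flagged, while a single 'N' is not.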
In the other direction, it works well:
>>> STOUT.translate_forward('N') 'azane'
>>> STOUT.translate_forward('[NH2+]') 'azanium'
Thank you, Philipp