Closed professorCKone closed 2 years ago
Nice and very difficult example. In most cases, atom numbers are not placed directly on the atom position and then DECIMER does a good job, but this particular case will always be hard. Thanks for submitting. We'll see how future versions of DECIMER deal with this. As a first step, I suggest we create a high-res version of this and see how it copes.
Thank you for the quick feedback. I know that Decimer does a great job with regular structures. Let me create a high res version. This might do the trick.
From: Christoph Steinbeck @.> Date: Monday, 23 May 2022 at 12:32 To: OBrink/DECIMER_Web @.> Cc: Christian Kronseder @.>, Author @.> Subject: Re: [OBrink/DECIMER_Web] Certain molecule representation create the wrong result (Issue #28)
@professorCKone Could you kindly send us a high-resolution version of this image?
I don't have a hi-res version of the original, which is exactly my problem. I tried to make NMR-annotated GIFs machine-readable, but had to give up because of the low resolution of the available images. It seems that numbers in a hi-res version don't bother your deep learning approach. The result is correct, but to rely less on guesswork and be more precise, you would need to test a few more, I suppose. By the way, RDKit allows you to create molecules with numbers. Regards, Christian
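As a rough illustration of the RDKit remark, the sketch below draws a molecule with 1-based atom numbers attached as atom notes. The function name `numbered_svg` and the thiophene SMILES are assumptions made for this example; RDKit places notes next to the atoms rather than directly on them, so this approximates the NMR-style numbering discussed here rather than reproducing the original image.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def numbered_svg(smiles, size=300):
    """Return an SVG drawing with a 1-based number noted at each atom.

    Hypothetical helper for illustration; not part of DECIMER or RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    for atom in mol.GetAtoms():
        # "atomNote" is the RDKit property the 2D drawer renders beside an atom.
        atom.SetProp("atomNote", str(atom.GetIdx() + 1))
    drawer = rdMolDraw2D.MolDraw2DSVG(size, size)
    rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


# Thiophene is used only because it contains a real sulfur next to numbered atoms.
svg = numbered_svg("c1ccsc1")
```

Rendering the same molecule at several sizes would give a small set of numbered test images for checking where recognition starts to fail.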
@professorCKone
Thanks a lot for this report. As you mentioned, with the higher-resolution image decimer.ai works perfectly well.
The problem, as far as I can see, is that in the original image the number "5" looks too similar to the letter "S". We did implement molecules with atom numbers depicted in them; we should probably increase the augmentations on such numbers as well.
Other complications in lo-res images are 6, 9 and 8, which can be read as O. I ran into several variations of this problem. Still, well done on DECIMER. We will have a closer look and see if we can integrate it into our electronic lab journal. Best regards, Christian
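To make the confusions mentioned so far concrete, here is a minimal sketch that lists which atom-number labels in a numbered structure contain a digit reported as misreadable in this thread. The mapping `CONFUSABLE` and the helper `risky_labels` are hypothetical names; the mapping only covers the cases named above (5 read as S; 6, 8 and 9 read as O) and is not a complete list of failure modes.

```python
# Digit-to-element confusions reported in this thread for low-resolution images.
# Illustrative only; other digits or letters may be affected as well.
CONFUSABLE = {"5": "S", "6": "O", "8": "O", "9": "O"}


def risky_labels(n_atoms):
    """Map each atom number 1..n_atoms to the elements it could be misread as."""
    risky = {}
    for number in range(1, n_atoms + 1):
        label = str(number)
        # Collect the element symbols for every confusable digit in the label.
        hits = sorted({CONFUSABLE[ch] for ch in label if ch in CONFUSABLE})
        if hits:
            risky[label] = hits
    return risky


labels = risky_labels(16)
```

A predicted structure that contains S or O atoms where the numbered input is not expected to have any could then be flagged for manual review.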
Christian, if you have an interesting data set of that type for OCSR, then there is of course always the possibility of retraining DECIMER with fabricated noisy, low-res images of the same type. Annotated NMR data are close to our heart (if this is what this is :)) and we could try to work together on this. If you want to take this off GitHub, feel free to send me an email.
Cheers, Chris
— Prof. Dr. Christoph Steinbeck Analytical Chemistry - Cheminformatics and Chemometrics Friedrich-Schiller-University Jena, Germany Phone Secretariat: +49-3641-948171 http://cheminf.uni-jena.de http://orcid.org/0000-0001-6966-0814
What is man but that lofty spirit - that sense of enterprise. ... Kirk, "I, Mudd," stardate 4513.3..
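The idea of fabricating noisy, low-res images can be sketched roughly as follows. This is not DECIMER's augmentation pipeline; it is a self-contained illustration in pure Python on a grayscale image stored as a list of rows, and the names `downsample`, `add_noise` and `degrade` as well as the default `factor` and `sigma` values are assumptions. A real pipeline would more likely use an imaging library.

```python
import random


def downsample(img, factor):
    """Block-average a grayscale image (list of rows, values 0-255)."""
    height = len(img) // factor
    width = len(img[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                img[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def add_noise(img, sigma, rng):
    """Add Gaussian noise to every pixel and clamp to the 0-255 range."""
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


def degrade(img, factor=4, sigma=10.0, seed=0):
    """Lower the resolution, then add noise; seeded for reproducibility."""
    rng = random.Random(seed)
    return add_noise(downsample(img, factor), sigma, rng)


# A 16x16 white image with one dark horizontal line stands in for a depiction.
clean = [[255] * 16 for _ in range(16)]
clean[8] = [0] * 16
noisy = degrade(clean, factor=4, sigma=5.0)
```

Applying such a step to clean, numbered depictions would yield paired data (degraded image, known structure) of the kind a retraining run needs.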
Hi Christoph, I have about 13,000 of those noisy pictures. The goal is to automatically interpret NMR spectra using deep learning.
How about a quick video call, so I can explain what I am working on?
Best Christian
I am going to close this issue here for now, as it technically is a problem of the OCSR engine, and not a problem of the web application. We are continuously working on the further diversification of our training data in order to increase DECIMER's capabilities in future versions.
The numbers in the molecule are part of the representation for NMR purposes. The number "5" is interpreted as "S" (sulfur). I guess this might have to do with the resolution of the PNG file.