IUPAC / IUPAC_SMILES_plus

IUPAC SMILES+ Specification

Polymers extension unclear #9

Open merkys opened 3 years ago

merkys commented 3 years ago

I have filed an issue, https://github.com/timvdm/OpenSMILES/issues/8, requesting an explanation of the polymers extension. Since the OpenSMILES GitHub repository does not attract much attention, I would like to repeat the request here too. Moreover, I believe IUPAC's SMILES specification would benefit from describing this extension unambiguously.

vfscalfani commented 3 years ago

Sorry for the delay, I'll see what I can find about the Weininger polymer extension proposal. There may be an archived talk somewhere. We should be finishing up the first batch of revisions soon (see #8), and then I can take a look at the proposed extensions. Thanks for this comment.

merkys commented 3 years ago

Thank you for the response. I would appreciate any reference on this extension.

vfscalfani commented 3 years ago

Okay, I was not able to find anything from Daylight, but this OpenSMILES mailing list message has an important clue:

https://sourceforge.net/p/blueobelisk/mailman/message/27232617/

There is actually a mistake in the polystyrene notation: it should be c1ccccc1C&1C&1, not c1ccccc1C&1&1. Using the corrected notation, I agree with you that I would expect a repeating saturated alkane to be C&1&1.

Would you agree that poly(ethylene oxide) would then be:

CC&1O&1 ?

So for diamond, C&1&1&1&1, I think each &1 represents a bond to another carbon atom. Diamond has four such bonds per atom, while graphite has only three defined. See the images here:

https://chem.libretexts.org/Bookshelves/Inorganic_Chemistry/Map%3A_Inorganic_Chemistry_(Housecroft)/14%3A_The_Group_14_Elements/14.04%3A_Allotropes_of_Carbon/14.4A%3A_Graphite_and_Diamond_-_Structure_and_Properties
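
To make the "each &1 is one bond" reading concrete, here is a minimal Python sketch that counts the `&n` markers attached to each atom. The tokenizer and the function name `marker_counts` are hypothetical and cover only a tiny subset of the notation; the graphite string `c&1&1&1` is an assumption based on the three bonds mentioned above, not something confirmed in this thread.

```python
import re

# Hypothetical tokenizer, not a real SMILES parser: an atom symbol,
# optionally followed by ring-closure digits and "&n" repeat markers.
TOKEN = re.compile(r"(Cl|Br|[BCNOPSFI]|[bcnops])((?:\d|&\d)*)")

def marker_counts(smiles):
    """Return (atom symbol, number of &n markers) for every atom."""
    return [(m.group(1), m.group(2).count("&")) for m in TOKEN.finditer(smiles)]

print(marker_counts("C&1&1&1&1"))       # diamond, as written in this thread
print(marker_counts("c&1&1&1"))         # graphite (assumed string)
print(marker_counts("c1ccccc1C&1C&1"))  # corrected polystyrene
```

Under this reading, the count per atom is simply the number of bonds that atom makes to neighbouring repeat units.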

Does this help at all? If we are in agreement with the interpretation, I think we may want to draw some depictions and add in several more examples. I am certainly open to contributions!

Vin

merkys commented 3 years ago

Thanks a lot for giving this a look! I completely agree with you regarding polystyrene, the repeating saturated alkane, diamond and graphite. However, I would write poly(ethylene oxide) as C&1CO&1. In CC&1O&1 the first carbon is not between the two &1 markers, so it ends up as a side chain on its own instead of being part of the linear chain. Would you agree?
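
The difference between the two strings can be checked mechanically. The following Python sketch handles unbranched strings only (no rings, no parentheses) and assumes the interpretation discussed in this thread, namely that the atoms between the first and last `&n` markers form the repeating chain. The function name `split_backbone` is hypothetical.

```python
import re

# Hypothetical reader for unbranched strings: atom symbol plus "&n" markers.
TOKEN = re.compile(r"(Cl|Br|[BCNOPSFI]|[bcnops])((?:&\d)*)")

def split_backbone(smiles):
    """Split atoms into (backbone, pendant) relative to the marked atoms."""
    atoms = [(m.group(1), "&" in m.group(2)) for m in TOKEN.finditer(smiles)]
    marked = [i for i, (_, has_marker) in enumerate(atoms) if has_marker]
    first, last = marked[0], marked[-1]
    # Atoms from the first to the last marked atom lie in the repeating chain.
    backbone = [sym for sym, _ in atoms[first:last + 1]]
    # Anything written outside that span hangs off the chain.
    pendant = [sym for i, (sym, _) in enumerate(atoms) if i < first or i > last]
    return backbone, pendant

print(split_backbone("C&1CO&1"))  # all three atoms in the chain
print(split_backbone("CC&1O&1"))  # one carbon left outside the chain
```

For C&1CO&1 every atom falls in the chain, while for CC&1O&1 the leading carbon is reported as pendant, which matches the objection above.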

I am trying to think of more elaborate use cases. Let's imagine a 2D material where we would have (-C-)N along the X axis and (-C-O-)N along the Y axis. I would write such material as C&1&1&2O&2. Does this seem reasonable?
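
One way to inspect such a string is to group the marked atoms by label. This Python sketch assumes, as an interpretation rather than a confirmed rule, that markers sharing a label describe bonds along the same repeat direction. The function name `atoms_by_label` is hypothetical, and only single-digit labels are handled.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: atom symbol followed by zero or more "&n" markers.
TOKEN = re.compile(r"(Cl|Br|[BCNOPSFI]|[bcnops])((?:&\d)*)")

def atoms_by_label(smiles):
    """Map each marker label to the (atom index, symbol) pairs carrying it."""
    groups = defaultdict(list)
    for index, m in enumerate(TOKEN.finditer(smiles)):
        for label in re.findall(r"&(\d)", m.group(2)):
            groups[int(label)].append((index, m.group(1)))
    return dict(groups)

print(atoms_by_label("C&1&1&2O&2"))
```

Here label 1 appears twice on the carbon, which would correspond to the (-C-)N direction, and label 2 appears once on the carbon and once on the oxygen, which would correspond to the (-C-O-)N direction.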

This does help me indeed. It also would be great to have a more extended explanation with more examples in the SMILES+ specification. I will try to come up with more examples and illustrations.

merkys commented 3 years ago

Two more examples to illustrate the need to also include bond types:

merkys commented 2 years ago

A very similar method for describing periodic chemical graphs is presented in Eon (2016), Section 2.2.