Open jcuadros opened 1 year ago
Jordi, yes I agree that is a great idea. Thanks.
I just updated the page with a note about this (see commit 27a757f). Let me know if this is appropriate and, if not, how I should amend it.
Can I close this issue?
I think (but it may be my English) that the second sentence is missing something. In any case, I would add a note (or something in the existing note) stating that the formula corresponds to the neutral species when the represented compound has protons either added or removed in the charge layer.
I just updated the note to improve the English (commits 342cc72 and f762c4f). Thanks for spotting that. As to your other point, I believe I have addressed it in the second revision (f762c4f). Let me know what you think.
The first issue (salts) sounds great now. Thanks!
The second issue (charged species) still needs some work (IMO). The charge layer has two sublayers, /q and /p, which are used to specify different types of charged compounds.
When the /q sublayer is present, the formula does not show the charge but can be understood as correct. For example, InChI=1S/I3/c1-3-2/q-1 is the InChI for triiodide ion and InChI=1S/C2H5O/c1-2-3/h2H2,1H3/q-1 is ethanolate.
The problem comes when the /p sublayer is present. In this case, the formula does not correspond to the charged species. For example, the sulfate ion is InChI=1S/H2O4S/c1-5(2,3)4/h(H2,1,2,3,4)/p-2, where the formula corresponds to sulfuric acid. The citrate ion is InChI=1S/C6H8O7/c7-3(8)1-6(13,5(11)12)2-4(9)10/h13H,1-2H2,(H,7,8)(H,9,10)(H,11,12)/p-3, but its molecular formula is C6H5O7.
Some are wilder still, such as the dimercury(I) ion, InChI=1S/2Hg/q2*+1, but I would skip those.
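To make the /p adjustment concrete, here is a minimal Python sketch of how the formula of the charged species could be recovered from a single-component InChI. The function names `parse_formula` and `charged_formula` are made up for this illustration, and the sketch assumes a standard InChI with one component only; it deliberately refuses multi-component cases such as the dimercury(I) ion.

```python
import re

def parse_formula(formula):
    """Count atoms in a single-component formula layer such as 'C6H8O7'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def charged_formula(inchi):
    """Return (atom counts, net charge) after applying the /q and /p sublayers.

    Hypothetical helper: assumes a standard, single-component InChI.
    """
    layers = inchi.split("/")
    formula = layers[1]  # layers[0] is the 'InChI=1S' prefix
    if "." in formula or formula[0].isdigit():
        raise ValueError("multi-component InChI is not handled in this sketch")
    protons = 0  # value of the /p sublayer (protons added or removed)
    charge = 0   # value of the /q sublayer
    for layer in layers[2:]:
        if layer.startswith("p"):
            protons = int(layer[1:])
        elif layer.startswith("q"):
            charge = int(layer[1:])
    counts = parse_formula(formula)
    # The formula layer describes the neutral parent, so shift H by /p.
    counts["H"] = counts.get("H", 0) + protons
    if counts["H"] == 0:
        del counts["H"]
    return counts, charge + protons

sulfate = "InChI=1S/H2O4S/c1-5(2,3)4/h(H2,1,2,3,4)/p-2"
print(charged_formula(sulfate))
```

With the examples from this thread, sulfate loses both hydrogens of the sulfuric acid formula, citrate goes from C6H8O7 to C6H5O7, and ethanolate keeps its formula because only /q is present.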
Ah, I see your point: I did not address the fact that some species that have an InChI are charged, and the text does not currently cover that. Let me add that as well.
@jcuadros Are there still things that need to be fixed here, or can I close this issue?
Step 4, "Extract the formula of the substance", may be incorrect when the InChI includes a /p sublayer or when the InChI includes disconnected species (see examples below). It might be worth adding a comment stating this works for neutral covalent chemical species.
Citrate, https://pubchem.ncbi.nlm.nih.gov/compound/31348 InChI=1S/C6H8O7/c7-3(8)1-6(13,5(11)12)2-4(9)10/h13H,1-2H2,(H,7,8)(H,9,10)(H,11,12)/p-3
Ethylammonium nitrate, https://pubchem.ncbi.nlm.nih.gov/compound/6432248 InChI=1S/C2H7N.NO3/c1-2-3;2-1(3)4/h2-3H2,1H3;/q;-1/p+1
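As a rough illustration of the check suggested for Step 4, the following Python sketch flags InChIs whose formula layer should not be read directly as the formula of the substance. The name `formula_caveats` is hypothetical, and the two conditions it tests (a /p sublayer, or a formula layer with several components) are only the cases raised in this thread, not an exhaustive list.

```python
def formula_caveats(inchi):
    """List reasons why the formula layer may not match the actual species.

    Hypothetical helper: covers only the /p and disconnected-species cases.
    """
    layers = inchi.split("/")
    formula = layers[1]  # layers[0] is the 'InChI=1S' prefix
    caveats = []
    # A '.' separates components; a leading digit is a component multiplier.
    if "." in formula or formula[0].isdigit():
        caveats.append("disconnected species")
    # The /p sublayer means protons were added to or removed from the formula.
    if any(layer.startswith("p") for layer in layers[2:]):
        caveats.append("/p sublayer present")
    return caveats

citrate = "InChI=1S/C6H8O7/c7-3(8)1-6(13,5(11)12)2-4(9)10/h13H,1-2H2,(H,7,8)(H,9,10)(H,11,12)/p-3"
ean = "InChI=1S/C2H7N.NO3/c1-2-3;2-1(3)4/h2-3H2,1H3;/q;-1/p+1"
print(formula_caveats(citrate))  # one caveat: the /p sublayer
print(formula_caveats(ean))      # both caveats apply
```

An empty list would mean that neither condition was found, which is the neutral, connected case that Step 4 appears to be written for.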
Thanks!