ebi-chebi / ChEBI

Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
https://www.ebi.ac.uk/chebi
Creative Commons Attribution 4.0 International

More Iupac name/structure mismatches #478

Open muthuvenkat opened 14 years ago

muthuvenkat commented 14 years ago

(CHEBI:17337) The ChEBI name describes a different structure due to the 2-chloro applying to the acetic acid. The other conjunctively named synonyms are unambiguously correct.

(CHEBI:50972) Name specifies that the 4a/8a bond is unsaturated

(CHEBI:38239) Name specifies 4-[(2-hydroxyethyl)amino] but structure has 4-{[(2-hydroxyethyl)amino]oxy}

(CHEBI:51224) (dimethylammonio) should be (dimethylazaniumdiyl). I think the first IUPAC name with alteration is better

(CHEBI:43561) Name says dimethylamino but the structure and synonyms are just methylamino

(CHEBI:51615) ChEBI name should be 4-Dimethylamino-4'-nitrostilbene

(CHEBI:33325) Another buggy InChI

(CHEBI:52803) dimethylammonio --> dimethylazaniumdiyl and the first name is then fine

(CHEBI:51224) dimethylammonio --> dimethylazaniumdiyl and the first name is then fine

(CHEBI:32208) Structure is missing a ketone opposite the existing ketone

(CHEBI:7563) N-nicotinoylglycine would be a better ChEBI name as Nicotinyl is usually pyridin-3-ylmethyl rather than pyridin-3-carbonyl

(CHEBI:34495) Structure has an erroneous fluorine atom

(CHEBI:15836) IUPAC name should have dihydrogen rather than hydrogen

(CHEBI:2402) The sulfur appears to have a negative charge on it giving the whole molecule a negative charge, this seems suspicious

(CHEBI:33323) wrong InChI and SMILES. Seems to apply to the other group 1 atoms e.g. (CHEBI:26216)

(CHEBI:3651) Name specifies that the 4a/9a bond is saturated but in the structure it is unsaturated

(CHEBI:2534) locant specifying the location of [(1R)-1-hydroxyethyl] is missing, should be 12-

(CHEBI:53051) thiocyanato instead of thiocyano would make a better ChEBI name in my opinion

(CHEBI:51522) The structure has only one nitrogen but the names and indeed the formula indicate that it should have two, with the second where the methyl attaches

(CHEBI:52888) dimethylammonio --> dimethyldiazandiyl

(CHEBI:53754) Name describes a di ether, but structure is a diester

(CHEBI:29147) Name says the 8a/1 bond is unsaturated

(CHEBI:5882) Structure is a salt but name does not have charges on the nitrogen/chlorine

(CHEBI:52220) dimethylammonio --> dimethyldiazandiyl

(CHEBI:28907) Name/structure mismatch significantly

(CHEBI:30307) Nitrogen to be substituted should be specified

(CHEBI:53728) Name is missing the two nitros on the benzene ring

(CHEBI:32643) ChEBI name and structure disagree on which way around the acetamido and formamido group are

(CHEBI:52887) dimethylammonio --> dimethyldiazandiyl

(CHEBI:39506) where the structure has phenyl the names have pyridin-4-yl

(CHEBI:39506) IUPAC name should contain azaniumyl, not azanyl. Is the structure methylnitronate?

(CHEBI:32314) Name says isoindolin-1-one but structure is indolin-2-one

Additionally I have a query regarding the structures of histidinate and tyrosinate. Do you know whether the IUPAC guidelines specify whether the unqualified "ate" form of these amino acids is (1-) or (2-)? For glutamate/aspartate it is specified that the "ate" form systematically means both acid groups are carboxylates. Histidine and tyrosine, however, only have one carboxylic acid.

In my last post you stated that for names like diazanediium and hydrazinediium locants are not required. Do you know where it says this in the IUPAC draft recommendations? I have so far found one name in it that seems to be almost a counterexample: tetramethyldiazen-1,2-diium (PIN) but no definitive word.

This is the last batch of name/structure mismatches (from a copy of ChEBI I downloaded near the beginning of May). I have not so far been comparing stereochemistry between the names/structures but at a glance there would be a significant number of mismatches mainly due to either the name implying stereochemistry that has not been specified in the structure or due to the name omitting stereochemical information.
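One way to separate the two kinds of mismatch mentioned here (connectivity versus stereochemistry only) is to compare standard InChI strings with and without their stereo layers. The sketch below is illustrative only and is not necessarily how the comparison in this thread was carried out; the function names are hypothetical, and it assumes both the name-derived structure and the ChEBI structure are already available as standard InChI strings.

```python
# Layers of a standard InChI that carry stereochemistry:
# /b (double bond), /t (tetrahedral), /m and /s (stereo type flags).
STEREO_LAYER_PREFIXES = ("b", "t", "m", "s")


def strip_stereo_layers(inchi: str) -> str:
    """Return the InChI with its stereo layers removed."""
    parts = inchi.split("/")
    # parts[0] is the version prefix and parts[1] the formula; keep both untouched.
    kept = parts[:2] + [p for p in parts[2:] if not p.startswith(STEREO_LAYER_PREFIXES)]
    return "/".join(kept)


def classify_mismatch(name_inchi: str, structure_inchi: str) -> str:
    """Classify how the name-derived and stored structures differ."""
    if name_inchi == structure_inchi:
        return "match"
    if strip_stereo_layers(name_inchi) == strip_stereo_layers(structure_inchi):
        return "stereo-only"
    return "connectivity"


if __name__ == "__main__":
    l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
    alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
    glycine = "InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)"
    print(classify_mismatch(l_alanine, alanine))  # name implies stereo, structure does not
    print(classify_mismatch(alanine, glycine))    # different skeletons
```

A "stereo-only" result corresponds to the case where the name implies stereochemistry that the structure omits, or the reverse.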

Reported by: dan2097

muthuvenkat commented 14 years ago

A couple more which appeared when I added diphospho to OPSIN's vocab. (CHEBI:978) and (CHEBI:1463) diphospho should possibly be diphosphooxy. diphosphooxy has been used on two other ChEBI entries. Personally I would be inclined to name such compounds as diphosphates as diphospho seems to often mean either P(=O)(O)OP(=O)(O)O or OP(=O)(O)OP(=O)(O)O depending on whether or not it is attached to an oxygen!

Original comment by: dan2097

muthuvenkat commented 14 years ago

Hi Daniel

Changes will be incorporated into our next release (August 4). Comments:

(CHEBI:17337) The ChEBI name describes a different structure due to the 2-chloro applying to the acetic acid. The other conjunctively named synonyms are unambiguously correct.

the structure.

(CHEBI:50972) Name specifies that the 4a/8a bond is unsaturated

(CHEBI:38239) Name specifies 4-[(2-hydroxyethyl)amino] but structure has 4-{[(2-hydroxyethyl)amino]oxy}

(CHEBI:51224) (dimethylammonio) should be (dimethylazaniumdiyl). I think the first IUPAC name with alteration is better

(CHEBI:43561) Name says dimethylamino but the structure and synonyms are just methylamino

(CHEBI:51615) ChEBI name should be 4-Dimethylamino-4'-nitrostilbene

(CHEBI:33325) Another buggy InChI

(CHEBI:52803) dimethylammonio --> dimethylazaniumdiyl and the first name is then fine

(CHEBI:51224) dimethylammonio --> dimethylazaniumdiyl and the first name is then fine

(CHEBI:32208) Structure is missing a ketone opposite the existing ketone

(CHEBI:7563) N-nicotinoylglycine would be a better ChEBI name as Nicotinyl is usually pyridin-3-ylmethyl rather than pyridin-3-carbonyl

changed.

(CHEBI:34495) Structure has an erroneous fluorine atom

(CHEBI:15836) IUPAC name should have dihydrogen rather than hydrogen

(CHEBI:2402) The sulfur appears to have a negative charge on it giving the whole molecule a negative charge, this seems suspicious

(CHEBI:33323) wrong InChI and SMILES. Seems to apply to the other group 1 atoms e.g. (CHEBI:26216)

(CHEBI:3651) Name specifies that the 4a/9a bond is saturated but in the structure it is unsaturated

(CHEBI:2534) locant specifying the location of [(1R)-1-hydroxyethyl] is missing, should be 12-

(CHEBI:53051) thiocyanato instead of thiocyano would make a better ChEBI name in my opinion

(CHEBI:51522) The structure has only one nitrogen but the names and indeed the formula indicate that it should have two, with the second where the methyl attaches

(CHEBI:52888) dimethylammonio -->dimethyldiazandiyl

(CHEBI:53754) Name describes a di ether, but structure is a diester

(CHEBI:29147) Name says the 8a/1 bond is unsaturated

(CHEBI:5882) Structure is a salt but name does not have charges on the nitrogen/chlorine

IUPAC Name formulated as "...aminium chloride". (Also, I noticed that the N was missing from the azepine ring)

(CHEBI:52220) dimethylammonio -->dimethyldiazandiyl

(CHEBI:28907) Name/structure mismatch significantly

(CHEBI:30307) Nitrogen to be substituted should be specified

(CHEBI:53728) Name is missing the two nitros on the benzene ring

(CHEBI:32643) ChEBI name and structure disagree on which way around the acetamido and formamido group are

(CHEBI:52887) dimethylammonio -->dimethyldiazandiyl

(CHEBI:39506) where the structure has phenyl the names have pyridin-4-yl

(CHEBI:39506) IUPAC name should contain azaniumyl, not azanyl. Is the structure methylnitronate?

(CHEBI:32314) Name says isoindolin-1-one but structure is indolin-2-one

Additionally I have a query regarding the structures of histidinate and tyrosinate. Do you know whether the IUPAC guidelines specify whether the unqualified "ate" form of these amino acids is (1-) or (2-)? For glutamate/aspartate it is specified that the "ate" form systematically means both acid groups are carboxylates. Histidine and tyrosine, however, only have one carboxylic acid.

think we may need to look at these entries in ChEBI again.

In my last post you stated that for names like diazanediium and hydrazinediium locants are not required. Do you know where it says this in the IUPAC draft recommendations? I have so far found one name in it that seems to be almost a counterexample: tetramethyldiazen-1,2-diium (PIN) but no definitive word.

letter ‘e’ of the parent hydride name, if any, by the suffix ‘ium’, preceded by multiplying locants ‘di’, ‘tri’, etc., to denote the multiplicity of identical cationic centres." It seems to me from this that if there are only two possible positions within a molecule where cationic centres can exist, then this wording dictates that when there are actually two positive charges, one must reside on each centre, in which case the locants (in this case 1 and 2) are redundant. I see the example that you cite and I shall query this point with the editors.

This is the last batch of name/structure mismatches (from a copy of ChEBI I downloaded near the beginning of May). I have not so far been comparing stereochemistry between the names/structures but at a glance there would be a significant number of mismatches mainly due to either the name implying stereochemistry that has not been specified in the structure or due to the name omitting stereochemical information.

* Concerning the diphospho problem, I agree that these compounds would be better named as diphosphates were it not for the existence within the molecules of the cationic nitrogen and its 'ium' suffix. This takes priority over the phosphate ester for citation as principal group and relegates the diphosphate to being cited as a substituent prefix. In these cases we have used 'phospho' and 'diphospho' in accordance as much as possible with the 1976 CBN recommendations (http://www.chem.qmul.ac.uk/iupac/misc/phospho.html). Certainly though, use of 'diphosphooxy' would help to avoid ambiguity. I may sound out Gerry Moss on this one.

Many thanks as always for your input. I shall leave this thread open so that you can update me on the duplicate CHEBI:39506 problem (see comment above).

Regards, Marcus

Original comment by: mennis

muthuvenkat commented 14 years ago

You are correct in pointing out that I meant dimethylazaniumdiyl in all cases where I said dimethyldiazandiyl.

The missing ChEBI ID is (CHEBI:55327)

My reason for quibbling over the diazandiium is that theoretically there is ambiguity: two protons could have been added to one of the nitrogens. Admittedly a pentavalent, doubly positively charged nitrogen is extremely unlikely, so I should probably perform some sanity checking when applying unlocanted charge suffixes (currently the only sanity check performed is that the atom has sufficient protons to carry out the operation).
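The kind of sanity check described here can be sketched as follows. This is a minimal illustration, not OPSIN's actual code: the function name, the atom labels, and the one-proton-per-atom limit are all assumptions made for the example. It enumerates the ways to place the charges of an unlocanted "diium" suffix and, when the check is enabled, rejects placements that put more than one proton on the same atom.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Assumption for illustration: a neutral nitrogen accepts at most one added proton.
MAX_PROTONS_PER_ATOM = 1


def charge_placements(candidate_atoms, n_charges, sanity_check=True):
    """List the ways to distribute n_charges added protons over candidate atoms."""
    placements = []
    for combo in combinations_with_replacement(candidate_atoms, n_charges):
        most_on_one_atom = max(Counter(combo).values(), default=0)
        if sanity_check and most_on_one_atom > MAX_PROTONS_PER_ATOM:
            continue  # e.g. both protons on one nitrogen (pentavalent, 2+)
        placements.append(combo)
    return placements


if __name__ == "__main__":
    diazane = ["N1", "N2"]
    # Without the check, the unlocanted suffix has three readings.
    print(charge_placements(diazane, 2, sanity_check=False))
    # With the check, only one reading remains, so locants add nothing.
    print(charge_placements(diazane, 2))
    # With three candidate nitrogens the name stays ambiguous without locants.
    print(charge_placements(["N1", "N2", "N3"], 2))
```

When exactly one placement survives the check, an unlocanted name can be interpreted unambiguously; when several survive, locants are genuinely needed.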

I have emailed you the list of names that are flagged as having something awry stereochemistry-wise.

Original comment by: dan2097