ebi-chebi / ChEBI

Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
https://www.ebi.ac.uk/chebi
Creative Commons Attribution 4.0 International

chebi_complete.sdf has multiple newlines between properties #171

Closed by muthuvenkat 8 years ago

muthuvenkat commented 15 years ago

The latest chebi_complete.sdf has two newlines after the SMILES property on molecule 3309. Because SDF is a legacy format in which newlines are significant, we suggest adding a quality-control step to the SDF generation that checks for such occurrences before future releases.

Here is the faulty entry (last long property omitted):

ChEBI

19 19 0 0 0 0 0 0 0 0 8 V2000
13.7932 -7.5176 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
13.7932 -8.8476 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6413 -6.8526 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6413 -9.5126 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.4895 -7.5176 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.4895 -8.8476 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6413 -5.5226 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
12.6413 -10.8426 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
10.3377 -6.8526 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
10.3377 -9.5126 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
9.1859 -8.8476 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
9.0820 -7.5240 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
14.9467 -9.5380 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
15.0227 -6.8906 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
16.1754 -7.5240 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
17.5180 -6.9666 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
18.7087 -7.5620 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
20.0894 -7.0046 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
17.5180 -5.7760 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2 1 2 0 0 0 0
3 1 1 0 0 0 0
4 2 1 0 0 0 0
5 3 2 0 0 0 0
6 4 2 0 0 0 0
6 5 1 0 0 0 0
7 3 1 0 0 0 0
8 4 1 0 0 0 0
9 5 1 0 0 0 0
10 6 1 0 0 0 0
11 10 1 0 0 0 0
12 9 1 0 0 0 0
13 2 1 0 0 0 0
14 1 1 0 0 0 0
15 14 1 0 0 0 0
16 15 2 0 0 0 0
17 16 1 0 0 0 0
18 17 1 0 0 0 0
19 16 1 0 0 0 0
M STY 1 1 SRU
M SLB 1 1 1
M SCN 1 1 HT
M SAL 1 5 14 15 16 17 19
M SDI 1 4 14.4020 -7.8406 14.4020 -6.5740
M SDI 1 4 19.4054 -6.6500 19.4054 -7.9166
M SMT 1 n
M END
> <ChEBI ID> CHEBI:17976

> <ChEBI Name> ubiquinols

> <Secondary ChEBI ID> CHEBI:9851 CHEBI:27182 CHEBI:15278

> <SMILES> [H]C\C(C)=C\CC1=C(C)C(O)=C(OC)C(OC)=C1O

> <InChI> InChI=1/C14H20O4/c1-8(2)6-7-10-9(3)11(15)13(17-4)14(18-5)12(10)16/h6,15-16H,7H2,1-5H3

> <InChIKey> InChIKey=TVLSKGDBUQMDPR-UHFFFAOYAR

> <Formulae> C14H20O4(C5H8)n

> <Charge> 0

> <Mass> 252.30620

> <Synonyms> CoQH2 QH(2) QH2 Ubiquinol coenzymes QH2 reduced ubiquinone ubiquinol

> <CAS Registry Numbers> 56275-39-9

> <IntEnz Database Links> EC 1.1.5.2 EC 1.3.5.1 EC 1.5.5.1 EC 1.6.5.3

> <KEGG COMPOUND Database Links> C00390

> <PubChem Database Links> 8143934

> <Reactome Database Links> REACT_6169 REACT_6300 REACT_6310 REACT_6360 REACT_669

Reported by: ospjuth
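
For illustration, the suggested quality-control step could look roughly like the sketch below. It is not part of the ChEBI export pipeline; the function name `find_extra_blank_lines` is hypothetical. It assumes the usual SDF layout, in which a record's data block starts after the `M  END` line (two spaces in real files; the entry above is shown with collapsed whitespace), each data item is terminated by exactly one blank line, and the record ends with `$$$$`. Any second consecutive blank line inside a data block is reported.

```python
from typing import Iterable, Iterator, Tuple


def find_extra_blank_lines(lines: Iterable[str]) -> Iterator[Tuple[int, int]]:
    """Yield (record_number, line_number) for each surplus blank line.

    Only the data block of a record (between "M  END" and "$$$$") is
    checked, because blank lines are legal in the molfile header block.
    """
    record = 1          # 1-based index of the current record
    in_data = False     # True once "M  END" has been seen for this record
    prev_blank = False  # True if the previous data-block line was blank
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith("$$$$"):
            # Record delimiter: reset state for the next record.
            record += 1
            in_data = False
            prev_blank = False
            continue
        if line.startswith("M  END"):
            in_data = True
            prev_blank = False
            continue
        if not in_data:
            continue
        blank = line.strip() == ""
        if blank and prev_blank:
            yield record, line_no
        prev_blank = blank
```

A release script could iterate over `find_extra_blank_lines(open("chebi_complete.sdf"))` and abort if anything is yielded. Working line by line keeps memory use flat regardless of the size of the file.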

muthuvenkat commented 15 years ago

We have resolved the issue in Bioclipse, see http://pele.farmbio.uu.se/cgi-bin/bugzilla/show_bug.cgi?id=1374. This means that Bioclipse can now read these files, and will also output a warning to the log if incorrect patterns are found.

Original comment by: ospjuth
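
On the reader side, a tolerant parser can skip surplus blank lines and log a warning instead of failing. The sketch below shows the general idea only; it is not the Bioclipse implementation, and the function name `read_data_items` and the logger name `sdf_reader` are made up for this example. It assumes it is handed the data-block lines of a single record, that is, the lines between `M  END` and `$$$$`.

```python
import logging
from typing import Dict, Iterable, List, Optional

log = logging.getLogger("sdf_reader")  # hypothetical logger name


def read_data_items(data_lines: Iterable[str]) -> Dict[str, str]:
    """Parse one record's data block, tolerating surplus blank lines."""
    items: Dict[str, str] = {}
    name: Optional[str] = None  # data item currently being read, if any
    value: List[str] = []
    prev_blank = False
    for offset, raw in enumerate(data_lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # Header line such as "> <SMILES>": take the text in <...>.
            start, end = line.find("<"), line.rfind(">")
            name = line[start + 1:end] if 0 <= start < end else line[1:].strip()
            value = []
            prev_blank = False
        elif line.strip() == "":
            if name is not None:
                # The first blank line terminates the current data item.
                items[name] = "\n".join(value)
                name = None
            elif prev_blank:
                # A second blank line in a row: skip it, but leave a trace.
                log.warning("surplus blank line at data-block line %d", offset)
            prev_blank = True
        else:
            if name is not None:
                value.append(line)
            prev_blank = False
    if name is not None:
        # Data item not terminated by a blank line before the block ended.
        items[name] = "\n".join(value)
    return items
```

With this approach a record containing the double newline still yields every property, and the log shows where the file deviates from the expected layout.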

muthuvenkat commented 8 years ago

Original comment by: muthuvenkat