chembl / GLaDOS

Web Interface for ChEMBL @ EMBL-EBI
https://www.ebi.ac.uk/chembl/

SMILES searches in search bar versus structure search #1257

Open Emma-Manners opened 4 years ago

Emma-Manners commented 4 years ago

The SMILES below works in the search bar but not in structure search. I'm not sure why, since the vast majority of SMILES work well in both.

O=C(NC(C(C1=CC=CC=C12)=NNC2=O)C(N/N=C(C)/C3=CC=CC(Br)=C3)=O)C4=CC=CC=C4

nclopezo commented 4 years ago

@eloyfelix The error seems to occur when the page calls the ctab2smiles endpoint. You can reproduce it with this command:

curl 'https://www.ebi.ac.uk/chembl/api/utils/ctab2smiles' \
  -H 'Connection: keep-alive' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache' \
  -H 'Accept: */*' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36' \
  -H 'Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryOyI4BbAWuRexqLss' \
  -H 'Origin: https://www.ebi.ac.uk' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Referer: https://www.ebi.ac.uk/chembl/' \
  -H 'Accept-Language: es,en-GB;q=0.9,en;q=0.8,en-US;q=0.7,gl;q=0.6' \
  -H 'Cookie: chembl_sketcher=marvinjs; _ga=GA1.3.12510205.1475154173; _ga=GA1.1.12510205.1475154173; marvinjs=<cml><MDocument><MChemicalStruct><molecule molID="m1"><atomArray><atom id="a1" elementType="C" x2="-1.49333332139" y2="0"/><atom id="a2" elementType="C" x2="-0.746666660693" y2="-1.29322665632"/><atom id="a3" elementType="C" x2="0.746666660693" y2="-1.29322665632"/><atom id="a4" elementType="C" x2="1.49333332139" y2="0"/><atom id="a5" elementType="C" x2="0.746666660693" y2="1.29322665632"/><atom id="a6" elementType="C" x2="-0.746666660693" y2="1.29322665632"/><atom id="a7" elementType="C" x2="2.98666664277" y2="0"/><atom id="a8" elementType="C" x2="3.73333330347" y2="-1.29322665632"/><atom id="a9" elementType="C" x2="5.22666662485" y2="-1.29322665632"/><atom id="a10" elementType="C" x2="5.97333328555" y2="0"/><atom id="a11" elementType="C" x2="5.22666662485" y2="1.29322665632"/><atom id="a12" elementType="C" x2="3.73333330347" y2="1.29322665632"/><atom id="a13" elementType="C" x2="-2.98666664277" y2="0"/><atom id="a14" elementType="C" x2="-3.73333330347" y2="1.29322665632"/><atom id="a15" elementType="C" x2="-5.22666662485" y2="1.29322665632"/><atom id="a16" elementType="C" x2="-5.97333328555" y2="0"/><atom id="a17" elementType="C" x2="-5.22666662485" y2="-1.29322665632"/><atom id="a18" elementType="C" x2="-3.73333330347" y2="-1.29322665632"/></atomArray><bondArray><bond atomRefs2="a1 a2" order="2"/><bond atomRefs2="a2 a3" order="1"/><bond atomRefs2="a3 a4" order="2"/><bond atomRefs2="a4 a5" order="1"/><bond atomRefs2="a5 a6" order="2"/><bond atomRefs2="a6 a1" order="1"/><bond atomRefs2="a7 a8" order="2"/><bond atomRefs2="a8 a9" order="1"/><bond atomRefs2="a9 a10" order="2"/><bond atomRefs2="a10 a11" order="1"/><bond atomRefs2="a11 a12" order="2"/><bond atomRefs2="a12 a7" order="1"/><bond atomRefs2="a4 a7" order="1"/><bond atomRefs2="a13 a14" order="2"/><bond atomRefs2="a14 a15" order="1"/><bond atomRefs2="a15 a16" order="2"/><bond atomRefs2="a16 a17" 
order="1"/><bond atomRefs2="a17 a18" order="2"/><bond atomRefs2="a18 a13" order="1"/><bond atomRefs2="a1 a13" order="1"/></bondArray></molecule></MChemicalStruct></MDocument></cml>; experimentation_subject_id=IjE0ZTlkM2ExLTc3YTQtNDQwOC04N2NhLWI1OGRmODU1ODMzNSI%3D--efad8bd2673c3b9d66dfb63fb694f9f6cd604d85; __utma=222765775.12510205.1475154173.1553781848.1579009629.8; __utmz=222765775.1579009629.8.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); cookies-accepted=true; cookies-accepted=true; csrftoken=kGlxsmFb6Y3FduzPJTQMcYAOXRnt6TpKqzDnOYOayVTsFyBykGjNFsv16NHs0nCZ; mywork.tab.tasks=false; chembl-website-v0.2-data-protection-accepted=true; __atuvc=0%7C12%2C0%7C13%2C0%7C14%2C0%7C15%2C1%7C16; _gid=GA1.3.2082011833.1587999822; _gat_gtag_UA_137029308_1=1' \
  --data-binary $'------WebKitFormBoundaryOyI4BbAWuRexqLss\r\nContent-Disposition: form-data; name="file"; filename="molecule.mol"\r\nContent-Type: chemical/x-mdl-molfile\r\n\r\n\n  MJ182100                      \n\n 34 37  0  0  0  0  0  0  0  0999 V2000\n   -0.4340   -1.0013    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0\n   -1.3004   -0.5019    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -1.3012    0.4980    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4356    0.9986    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4364    1.9986    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -1.3028    2.4980    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -2.1684    1.9972    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.0348    2.4966    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.0356    3.4966    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -2.1700    3.9974    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -1.3036    3.4980    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.4291    2.4994    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n    0.4283    3.4994    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4380    3.9988    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4388    4.9986    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0\n    0.4307    0.4994    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.4315   -0.5005    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n    1.2979   -0.9999    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n    1.2987   -2.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.4332   -2.5006    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    2.1651   -2.4991    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    3.0308   -1.9986    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    3.8972   -2.4977    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    3.8980   -3.4978    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    3.0324   -3.9986    0.0000 C   0  
0  0  0  0  0  0  0  0  0  0  0\n    3.0332   -4.9986    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0\n    2.1659   -3.4992    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    1.2963    1.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0\n   -2.1660   -1.0027    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.0324   -0.5033    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.8980   -1.0041    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.8972   -2.0040    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -3.0308   -2.5033    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n   -2.1652   -2.0025    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n  1  2  2  0  0  0  0\n  2  3  1  0  0  0  0\n  3  4  1  0  0  0  0\n  4  5  1  0  0  0  0\n  5  6  4  0  0  0  0\n  6  7  4  0  0  0  0\n  7  8  4  0  0  0  0\n  8  9  4  0  0  0  0\n  9 10  4  0  0  0  0\n 10 11  4  0  0  0  0\n  5 12  4  0  0  0  0\n 12 13  4  0  0  0  0\n 13 14  4  0  0  0  0\n 14 15  2  0  0  0  0\n  4 16  1  0  0  0  0\n 16 17  1  0  0  0  0\n 17 18  1  0  0  0  0\n 18 19  2  0  0  0  0\n 19 20  1  0  0  0  0\n 19 21  1  0  0  0  0\n 21 22  4  0  0  0  0\n 22 23  4  0  0  0  0\n 23 24  4  0  0  0  0\n 24 25  4  0  0  0  0\n 25 26  1  0  0  0  0\n 25 27  4  0  0  0  0\n 16 28  2  0  0  0  0\n  2 29  1  0  0  0  0\n 29 30  4  0  0  0  0\n 30 31  4  0  0  0  0\n 31 32  4  0  0  0  0\n 32 33  4  0  0  0  0\n 33 34  4  0  0  0  0\n 11  6  4  0  0  0  0\n 14 11  4  0  0  0  0\n 27 21  4  0  0  0  0\n 34 29  4  0  0  0  0\nM  END\n\r\n------WebKitFormBoundaryOyI4BbAWuRexqLss\r\nContent-Disposition: form-data; name="sanitize"\r\n\r\n0\r\n------WebKitFormBoundaryOyI4BbAWuRexqLss--\r\n' \
  --compressed
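
For anyone who wants to replay the request without the browser-specific headers and cookies, here is a minimal Python sketch that builds the same multipart body. The endpoint URL, the `file` and `sanitize` field names, the `molecule.mol` filename and the `chemical/x-mdl-molfile` content type are taken from the curl command above; the function names `build_multipart` and `post_molblock` and the boundary string are made up for illustration, and it is an assumption that the endpoint accepts the request without the other headers.

```python
import uuid
import urllib.request
from typing import Dict, Optional, Tuple

# Endpoint copied from the curl command above.
CTAB2SMILES_URL = "https://www.ebi.ac.uk/chembl/api/utils/ctab2smiles"


def build_multipart(
    molblock: str, sanitize: str = "0", boundary: Optional[str] = None
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a multipart/form-data POST.

    Mirrors the two form fields sent by the browser: "file" (the molblock)
    and "sanitize". The boundary value is arbitrary (hypothetical name).
    """
    boundary = boundary or "----PyFormBoundary" + uuid.uuid4().hex
    parts = [
        "--" + boundary,
        'Content-Disposition: form-data; name="file"; filename="molecule.mol"',
        "Content-Type: chemical/x-mdl-molfile",
        "",
        molblock,
        "--" + boundary,
        'Content-Disposition: form-data; name="sanitize"',
        "",
        sanitize,
        "--" + boundary + "--",
        "",
    ]
    body = "\r\n".join(parts).encode("utf-8")
    headers = {"Content-Type": "multipart/form-data; boundary=" + boundary}
    return headers, body


def post_molblock(molblock: str) -> str:
    """Send the molblock to the endpoint and return the raw response text.

    Assumption: the service responds without the cookies and browser
    headers that appear in the original curl command.
    """
    headers, body = build_multipart(molblock)
    request = urllib.request.Request(
        CTAB2SMILES_URL, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `post_molblock(open("molecule.mol").read())` with the molblock from the command saved to a file should exercise the same code path as the sketcher, assuming the endpoint is still available at that URL.
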
eloyfelix commented 4 years ago

When I try to parse the Marvin-generated molblock with RDKit, it fails to kekulise. I need to look into this further.
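
As a possible first step when looking into this, it can help to list which bonds the molblock marks as type 4 (aromatic in the V2000 format), since those are the bonds a kekulisation step has to resolve into alternating single and double bonds. In the molblock from the curl command, for example, bonds 5-12, 12-13 and 13-14 are written as type 4. The sketch below is a plain-Python reader for the V2000 bond block, not RDKit code; the function name `aromatic_bonds` and the three-atom toy molblock are invented for illustration, and the toy input is not a chemically meaningful structure.

```python
from typing import List, Tuple


def aromatic_bonds(molblock: str) -> List[Tuple[int, int]]:
    """Return (atom1, atom2) index pairs for bonds of type 4 in a V2000 molblock.

    Assumes the standard layout: three header lines, then the counts line,
    then the atom block, then the bond block, all in fixed-width columns.
    """
    lines = molblock.split("\n")
    counts = lines[3]
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])
    first_bond = 4 + n_atoms
    pairs = []
    for line in lines[first_bond:first_bond + n_bonds]:
        atom1 = int(line[0:3])
        atom2 = int(line[3:6])
        bond_type = int(line[6:9])
        if bond_type == 4:  # 4 = aromatic bond in V2000
            pairs.append((atom1, atom2))
    return pairs


# Toy input (hypothetical): one type-4 bond and one single bond.
EXAMPLE = "\n".join([
    "",
    "  example",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  4  0  0  0  0",
    "  2  3  1  0  0  0  0",
    "M  END",
])

print(aromatic_bonds(EXAMPLE))
```

Running the same function on the Marvin molblock would show which rings were exported with aromatic bonds, which narrows down where to look for the kekulisation failure.
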
