chembl / PP-WS

An issue tracker for ChEMBL internal Pipeline Pilot-based web services

error in key for chembldescriptors method #2

Closed mnowotka closed 8 years ago

mnowotka commented 8 years ago

For this molfile:

  SciTegic10091514462D

 32 33  0  0  0  0            999 V2000
   -2.6248   -2.7816    0.0000 C   0  0
   -2.6260   -3.6085    0.0000 C   0  0
   -1.9116   -4.0211    0.0000 C   0  0
   -1.1956   -3.6080    0.0000 C   0  0
   -1.9134   -2.3691    0.0000 C   0  0
   -1.1951   -2.7793    0.0000 C   0  0
   -0.5822   -2.2228    0.0000 C   0  0
   -0.9228   -1.4687    0.0000 C   0  0
   -1.7452   -1.5592    0.0000 N   0  0
   -0.5140   -0.7526    0.0000 C   0  0
    0.2424   -2.2314    0.0000 C   0  0
    0.6620   -1.5217    0.0000 C   0  0
    1.4864   -1.5303    0.0000 N   0  0
    1.9061   -0.8206    0.0000 C   0  0
    2.7306   -0.8292    0.0000 C   0  0
    3.1323   -1.5500    0.0000 C   0  0
    3.9561   -1.5590    0.0000 C   0  0
    4.3766   -0.8487    0.0000 C   0  0
    3.9675   -0.1279    0.0000 C   0  0
    3.1450   -0.1225    0.0000 C   0  0
    5.2012   -0.8561    0.0000 C   0  0
    5.6197   -0.1459    0.0000 C   0  0
    6.4443   -0.1534    0.0000 C   0  0
    6.8630    0.5569    0.0000 N   0  0
    6.8500   -0.8711    0.0000 O   0  0
    7.6875    0.5494    0.0000 O   0  0
    4.6623   -3.6680    0.0000 C   0  0
    4.6671   -2.8386    0.0000 O   0  0
    3.9395   -4.0706    0.0000 C   0  0
    5.3706   -4.0851    0.0000 O   0  0
    3.2265   -3.6535    0.0000 O   0  0
    3.9299   -4.9000    0.0000 C   0  0
   2  3  1  0
  12 13  1  0
  1  2  2  0
 13 14  1  0
  3  4  2  0
 14 15  1  0
  4  6  1  0
 15 16  2  0
  6  7  1  0
 16 17  1  0
  7  8  2  0
 17 18  2  0
  8  9  1  0
 18 19  1  0
  9  5  1  0
 19 20  2  0
 20 15  1  0
  5  1  1  0
 18 21  1  0
  8 10  1  0
 21 22  2  0
  5  6  2  0
 22 23  1  0
  7 11  1  0
 23 24  1  0
 23 25  2  0
 11 12  1  0
 24 26  1  0
 28 27  2  0
 29 27  1  0
 30 27  1  0
 31 29  1  0
 32 29  1  0
M  END
$$$$

CC(O)C(O)=O.CC1=C(CCNCC2=CC=C(\C=C\C(=O)NO)C=C2)C2=C(N1)C=CC=C2

A POST request to http://scitegic.windows.ebi.ac.uk:9955/rest/chembldescriptors returns:

{
  "ACD_LogD7.4": "0.564",
  "HBA": 3,
  "Num_NegativeAtoms": 0,
  "N_heavy_atoms": 26,
  "N_polar_atoms": 5,
  "Molecular_Species": "BASE",
  "RO3_PASS": "N",
  "MED_CHEM_FRIENDLY": "N",
  "RTB": 7,
  "QueriesMapped": [
    "[C;!R](-[C;!R]=O)=[C;!R]"
  ],
  "net_charge": [
    0
  ],
  "HBD": 4,
  "Molecular_Formula": "C21H23N3O2",
  "ACD_LogP": "2.548",
  "num_ro5_violations": 0,
  "PSA": 77.15,
  "Molecular_Weight": 349.42622,
  "RemovedSalts": "lactate stereo",
  "ALogP": 3.194,
  "ACD_MOST_APKA": "8.706",
  "INORGANIC": false,
  "ACD_MOST_BPKA": "9.295",
  "Num_PositiveAtoms": 0
}

So one key is wrong: the response contains ACD_LogD7.4 where ACD_LogD is expected.

ndedman commented 8 years ago

The 7.4 refers to the (physiological) pH at which the calculation is performed, so it's not an error. I'll need to check the other protocols that Anne uses in case the name is referenced from the ACD component; if it isn't referenced there (or is at least renamed), we can rename it here.

mnowotka commented 8 years ago

Hmmm, strange - my code expects "ACD_LogD" and I'm not sure if it always fails... Is it possible that in some cases "ACD_LogD" is returned and in other cases "ACD_LogD7.4"? Anyway, we will check that tomorrow.
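Until the key name is settled, one option on the client side is to accept both spellings. A minimal sketch, assuming the response has already been parsed into a dict; `read_logd` and `LOGD_KEYS` are hypothetical names for illustration, not part of the service or of the existing client code:

```python
import math
from typing import Any, Mapping, Optional

# Both spellings seen in this thread; the order sets lookup priority.
LOGD_KEYS = ("ACD_LogD", "ACD_LogD7.4")


def read_logd(descriptors: Mapping[str, Any]) -> Optional[float]:
    """Return the logD value as a float, or None if it is absent or not a number."""
    for key in LOGD_KEYS:
        if key not in descriptors:
            continue
        try:
            value = float(descriptors[key])
        except (TypeError, ValueError):
            return None  # e.g. an empty string or null
        # The string 'nan' parses to a float NaN; treat it as missing.
        return None if math.isnan(value) else value
    return None
```

With this helper, a response carrying either "ACD_LogD" or "ACD_LogD7.4" yields the same number, and a missing, empty, or 'nan' value yields None.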

ndedman commented 8 years ago

I was just checking the protocol and found this PilotScript that I'd already added:

/* This is needed as there are 6 compounds that output acd_logd as nan these are 439951,499790,323317,323345,11922,58671,72457 */
rename('ACD_LogD7.4', 'ACD_LogD');
if ACD_LogD is not defined or ACD_LogD = 'nan'
then ACD_LogD := '';
end if;

It's also the last component before the pipeline exits. I will debug.
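For readers who don't use Pipeline Pilot, the intent of the PilotScript above can be sketched in Python. This is only an illustration of the two steps (rename, then blank out undefined or 'nan' values); `normalise_logd` is a hypothetical name, and the case-insensitive 'nan' comparison is an assumption rather than confirmed PilotScript behaviour:

```python
from typing import Any, Dict


def normalise_logd(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with 'ACD_LogD7.4' renamed and undefined/'nan' values blanked."""
    result = dict(record)  # leave the caller's dict untouched

    # Step 1: rename the pH-specific key, if present.
    if "ACD_LogD7.4" in result:
        result["ACD_LogD"] = result.pop("ACD_LogD7.4")

    # Step 2: an undefined or 'nan' value becomes an empty string.
    value = result.get("ACD_LogD")
    if value is None or str(value).strip().lower() == "nan":
        result["ACD_LogD"] = ""

    return result
```

For example, `normalise_logd({"ACD_LogD7.4": "nan"})` gives `{"ACD_LogD": ""}`.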

ndedman commented 8 years ago

I've moved the component. Seems to be working now:

{
  "HBD": 4,
  "HBA": 3,
  "Num_NegativeAtoms": 0,
  "N_heavy_atoms": 26,
  "N_polar_atoms": 5,
  "Molecular_Species": "BASE",
  "RO3_PASS": "N",
  "MED_CHEM_FRIENDLY": "N",
  "RTB": 7,
  "ACD_LogD": "0.564",
  "QueriesMapped": [
    "[C;!R](-[C;!R]=O)=[C;!R]"
  ],
  "net_charge": [
    0
  ],
  "Molecular_Formula": "C21H23N3O2",
  "ACD_LogP": "2.548",
  "num_ro5_violations": 0,
  "PSA": 77.15,
  "Molecular_Weight": 349.42622,
  "RemovedSalts": "lactate stereo",
  "ALogP": 3.194,
  "ACD_MOST_APKA": "8.706",
  "INORGANIC": false,
  "ACD_MOST_BPKA": "9.295",
  "Num_PositiveAtoms": 0
}
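A quick way to confirm that only the key name changed between the two responses is to compare their key sets. A small sketch, assuming both responses are parsed into dicts; `key_changes` is a hypothetical helper name and the dicts below are trimmed to three keys for brevity:

```python
from typing import Any, Mapping, Set, Tuple


def key_changes(
    before: Mapping[str, Any], after: Mapping[str, Any]
) -> Tuple[Set[str], Set[str]]:
    """Return (keys only in `before`, keys only in `after`)."""
    before_keys, after_keys = set(before), set(after)
    return before_keys - after_keys, after_keys - before_keys


# Trimmed versions of the two responses in this thread.
old = {"ACD_LogD7.4": "0.564", "HBD": 4, "ACD_LogP": "2.548"}
new = {"ACD_LogD": "0.564", "HBD": 4, "ACD_LogP": "2.548"}

removed, added = key_changes(old, new)
print(removed, added)  # {'ACD_LogD7.4'} {'ACD_LogD'}
```

Applied to the full responses, the same comparison reports "ACD_LogD7.4" as removed and "ACD_LogD" as added, with all other keys unchanged.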