uniprot / enzymeportal

The EBI Enzyme Portal
http://www.ebi.ac.uk/enzymeportal/
Apache License 2.0

Unexpected search results #35

Closed rafael-alcantara closed 12 years ago

rafael-alcantara commented 12 years ago

From some comments by Kristian Axelsen:

> I searched for "P19456" (AC for PMA2_ARATH, EC 3.6.3.6) but the only result was a structure for "Paraneoplastic antigen Ma2 [Human]".
>
> When I search for "EC 3.6.3.6" I get 14 hits, but PMA2_ARATH is not among them. I would also have expected at least 33 hits, since this is the number of UniProtKB entries annotated with this EC number. I still get the hit from the previous search which has nothing to do with an ATPase.
>
> When I search for "Q13423" I get 4 hits, even though it is the AC of only one entry.

rafael-alcantara commented 12 years ago

Author: ralcantara Additional comment by CS: I was searching the dev version of EnzymePortal for "modafinil" but got "no results found". The compound is definitely in ChEMBL and has four enzyme targets listed there.

rafael-alcantara commented 12 years ago

Author: ralcantara "modafinil" is not found because ChEMBL is not yet one of the domains searched in EB-eye (to be added).

rafael-alcantara commented 12 years ago

Author: ralcantara More comments by PdM:

  1. Searching for cytochrome P450 in EP produces the following as the first hit: Putative cytochrome P450 YjiB [Bacillus subtilis]

In UniProt (searching including ec:*) this is not on the first page, and it is not on the first page of the EB-eye search either. Why such a high ranking for this?

  2. Searching for our example "REACT_1400.4" I get 3 hits back. However, I can't find a single combination of species displaying a pathway in this reaction. Is this because it references the reaction and not the pathway?

However, if I search for "REACT_8680.2", which is the catalyst in Reactome, I get 0 results.

rafael-alcantara commented 12 years ago

Author: ralcantara Some comments on KAx report:

P19456: the Enzyme Portal groups enzymes from different species into one search result. The "default" one is human; a different species is shown first only when no such enzyme is recorded in humans. In this case there is a human entry (Q9UL42), shown in the first line, while mouse-ear cress (P19456) is shown below (Species:...). That said, I cannot understand how paraneoplastic antigen Ma2 came to be related to P19456. I will investigate that.

Filtering for Arabidopsis apparently does nothing, but the only result already "contains" both species, as explained above. We are working on replacing the first link (human, by default) with the species selected in the filter (ticket:52), which is what the user expected.
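To make the grouping and the planned filter behaviour concrete, here is a minimal Python sketch. It is not the portal's actual code: the function names, the tuple layout and the sample entries (placeholder accessions and entry names) are all assumptions for illustration. It groups entries by the protein part of the entry name, shows human by default, and lets a selected species take precedence when one is given.

```python
from collections import defaultdict

# Hypothetical sample entries: (accession, entry name, species).
# The accessions and names are placeholders, not real UniProt records.
SAMPLE = [
    ("X00001", "ENZA_HUMAN", "Human"),
    ("X00002", "ENZA_ARATH", "Mouse-ear cress"),
    ("X00003", "ENZB_ARATH", "Mouse-ear cress"),
    ("X00004", "ENZB_YEAST", "Baker's yeast"),
]


def group_by_mnemonic(entries):
    """Group entries by the protein part of the entry name (text before '_')."""
    groups = defaultdict(list)
    for entry in entries:
        protein, _, _ = entry[1].partition("_")
        groups[protein].append(entry)
    return dict(groups)


def pick_representative(group, selected_species=None, default_species="Human"):
    """Return the entry to show first for one group of orthologs.

    Preference order: the species selected in the filter (if any),
    then the default species, then whatever entry comes first.
    """
    for wanted in (selected_species, default_species):
        if wanted is None:
            continue
        for entry in group:
            if entry[2] == wanted:
                return entry
    return group[0]


groups = group_by_mnemonic(SAMPLE)
default_hit = pick_representative(groups["ENZA"])
filtered_hit = pick_representative(groups["ENZA"], selected_species="Mouse-ear cress")
```

With no filter the human entry leads the group; with the species filter set, the matching entry is shown first instead of staying hidden under the "Species:" list.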

rafael-alcantara commented 12 years ago

Author: ralcantara P19456: the search returns just one entry from UniProt (PMA2_ARATH). The problem comes when trying to get the orthologs from the UniProt web service: querying for PMA2_* also returns PNMA2_* results (the same happens on the UniProt website). I have asked the UniProt team for help with this.

rafael-alcantara commented 12 years ago

Author: ralcantara "EC 3.6.3.6": actually there are 33 results from UniProt, but these are grouped as orthologs. The Arabidopsis accession appears there, but it is 'hidden' as previously explained (you should look for the 'Mouse-ear cress' link).

rafael-alcantara commented 12 years ago

Author: ralcantara "Q13423": this text search returns four UniProt entries: NNTM_HUMAN (the actual Q13423), plus O96789_STRPU, Q9KM25_VIBCH and Q82QU2_STRAW, which include a related 3D structure (HSSP built from PDB template 1DJL based on UniProtKB Q13423).

As this is a pure text search and we do not discriminate between fields (the accession, in this case), we can run into unwanted results like these unless we make the search much smarter.
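One possible way to make the search field-aware is to classify the query before sending it on. The sketch below is only an illustration in Python, not what the portal does: `classify_query` and the returned labels are hypothetical names, the accession pattern is the one UniProt documents for its accession numbers, and the EC pattern is a simple assumption (four dot-separated parts, each digits or `-`).

```python
import re

# Pattern published by UniProt for accession numbers.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Assumed EC number shape: optional "EC" prefix, then four dot-separated
# parts, where parts after the first may be "-" for incomplete numbers.
EC_RE = re.compile(r"(?:EC\s*)?(\d+(?:\.(?:\d+|-)){3})", re.IGNORECASE)


def classify_query(query):
    """Return (kind, normalised_value) where kind is 'accession', 'ec' or 'text'."""
    text = query.strip()
    if ACCESSION_RE.fullmatch(text.upper()):
        return ("accession", text.upper())
    ec_match = EC_RE.fullmatch(text)
    if ec_match:
        return ("ec", ec_match.group(1))
    return ("text", text)


# Queries taken from this ticket.
for q in ("Q13423", "EC 3.6.3.6", "modafinil", "REACT_1400.4"):
    print(q, "->", classify_query(q))
```

A query classified as an accession could then be restricted to the accession field, so that entries which merely mention Q13423 in a cross-reference would no longer be returned.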

rafael-alcantara commented 12 years ago

Author: ralcantara "P19456": searching UniProt for "PMA2" also returns PNMA2 because these PNMA2_* entries were initially PMA2_* (see [http://www.uniprot.org/uniprot/Q9UL42?version=* PNMA2_HUMAN history] for example). The mnemonics in the history are included in the search, as reported by E. Gasteiger, which inevitably produces these extra results.
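Since the extra hits come from historical mnemonics, one client-side workaround would be to keep only results whose current entry name carries the requested protein mnemonic. This is a hedged Python sketch of that idea, not the fix that was actually applied; the function names are hypothetical and the input is assumed to be a plain list of current entry names returned by the ortholog query.

```python
def matches_current_mnemonic(entry_name, protein_mnemonic):
    """True if the part of the entry name before '_' equals the mnemonic."""
    prefix, _, _species = entry_name.partition("_")
    return prefix == protein_mnemonic


def drop_history_matches(entry_names, protein_mnemonic):
    """Keep only entries whose *current* name starts with '<mnemonic>_'."""
    return [
        name for name in entry_names
        if matches_current_mnemonic(name, protein_mnemonic)
    ]


# Entry names mentioned in this ticket.
returned = ["PMA2_ARATH", "PNMA2_HUMAN"]
orthologs = drop_history_matches(returned, "PMA2")
```

Here `orthologs` keeps PMA2_ARATH and drops PNMA2_HUMAN, which only matched through its former PMA2_* name.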

rafael-alcantara commented 12 years ago

Author: ralcantara ChEMBL is now searched, so 'modafinil' is found and we get five results.

Now we should add ChEMBL compounds to the small molecules tab, as modafinil is not shown there (only compounds mentioned in UniProt are). However, the number of ChEMBL compounds cross-referenced for some entries (e.g. O00519) is very high (827). How should we handle these cases?

rafael-alcantara commented 12 years ago

Author: ralcantara Comment by Jules: Do you know why, when searching for the term 'beta-lactamase' and then filtering by species for 'human', I get CD48 antigen and YTH domain family protein 1, neither of which has catalytic activity?

rafael-alcantara commented 12 years ago

Author: ralcantara Works for me:

Fixed:

Won't fix: