Open ChristineChichester opened 10 years ago
Both http://identifiers.org/ensembl/ENSG00000265969 and http://identifiers.org/ensembl/ENSG00000186318 map to a HIGH number of UniProt URIs.
MapURL does work with the genes
It seems that http://identifiers.org/ensembl/ENSG00000186318 is mapped to http://www.uniprot.org/uniprot/P56817 which is mapped to Chembl. But with the ensembl URI you don't get a Chembl URI. Should we see this transitive mapping with mapURI? Or are the mappings not calculated when there are too many links to uniprot?
The mapping between http://identifiers.org/ensembl/ENSG00000186318 and http://www.uniprot.org/uniprot/P56817 comes from the Human base ensembl mapping set with the 1 to VERY many issue.
In a (failed) attempt to solve the one-to-many issue, I appear to have given that mapping set a different justification, which prevented the transitives.
http://identifiers.org/ensembl/ENSG00000186318 is not in the small replacement mapping set we have, so at best it will soon only be available in the all lens.
http://identifiers.org/ensembl/ENSG00000186318 is still missing on develop with the new IMS.
The mapping is there, but target pharmacology does not return any data. The corresponding protein (http://www.uniprot.org/uniprot/P56817) does.
It seems the mapping from ensembl is missing chembl, which shows up if you ask for uniprot.
@danidi I see from @stain's example above that the gene in question maps to multiple proteins. I suspect that @Christian-B put some logic into the IMS to ignore either linksets with multiple targets or did not name trembl as a suitable intermediary for transitives.
As I left the project: There is no logic to ignore links based on the number of targets/mappings.
The intermediaries for transitives are lens dependent. According to the latest version listed at http://openphacts.cs.man.ac.uk, trembl is an intermediary. See: http://openphacts.cs.man.ac.uk:9095/QueryExpander/Lens
Replace http://openphacts.cs.man.ac.uk:9095 with the URL of the IMS you are using.
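A minimal sketch of that substitution, in case it helps: the helper below builds a QueryExpander page URL from an IMS base URL. The function name is hypothetical; only the /QueryExpander/Lens path and the two base URLs are taken from this thread.

```python
def query_expander_url(ims_base: str, page: str = "Lens") -> str:
    """Build a QueryExpander page URL for the given IMS base URL.

    `ims_base` is the scheme, host and optional port of the IMS in use.
    `page` defaults to "Lens", the page linked in this thread.
    """
    # Strip a trailing slash so the result never contains "//QueryExpander".
    return f"{ims_base.rstrip('/')}/QueryExpander/{page}"


if __name__ == "__main__":
    # The two IMS instances mentioned in this thread.
    for base in ("http://openphacts.cs.man.ac.uk:9095", "http://ops2.few.vu.nl"):
        print(query_expander_url(base))
```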
According to http://ops2.few.vu.nl/QueryExpander/Lens, Ensembl and Uniprot are both allowed middle sources in the default lens.
Just to summarize the issue (tested with http://ops2.few.vu.nl/QueryExpander/BridgeDb): http://identifiers.org/ensembl/ENSG00000186318 finds http://www.uniprot.org/uniprot/P56817 (and several other uniprot URIs) in http://ops2.few.vu.nl/QueryExpander/mappingSet/81 (justification SIO_000985 protein coding gene, predicate INFERRED_FROM_TRANSLATION), but no mapping to ChEMBL.
http://www.uniprot.org/uniprot/P56817 finds CHEMBL_TC_3139 in http://ops2.few.vu.nl/QueryExpander/mappingSet/4 (justification SIO_010043 protein, predicate exactMatch), which in turn finds CHEMBL4822 in http://ops2.few.vu.nl/QueryExpander/mappingSet/2 (justification SIO_010043 protein, predicate exactMatch).
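To make the chain above concrete, here is a toy sketch in Python (not IMS code). It stores only the three hops reported above and walks them. The `same_predicate_only` switch is an assumption of this sketch: it models one possible explanation, namely that hops are only chained when their predicates match. It also simplifies by keeping a single target per source, although the gene maps to several UniProt URIs.

```python
from typing import Dict, List, Optional, Tuple

# Hops reported in this thread: source -> (target, mapping set number, predicate).
DIRECT: Dict[str, Tuple[str, int, str]] = {
    "ENSG00000186318": ("P56817", 81, "INFERRED_FROM_TRANSLATION"),
    "P56817": ("CHEMBL_TC_3139", 4, "exactMatch"),
    "CHEMBL_TC_3139": ("CHEMBL4822", 2, "exactMatch"),
}


def follow(start: str, same_predicate_only: bool) -> List[str]:
    """Walk the chain from `start` and return every identifier reached.

    With `same_predicate_only` set (an assumption of this sketch), the walk
    stops at the first hop whose predicate differs from the previous hop.
    """
    path = [start]
    current = start
    previous_predicate: Optional[str] = None
    while current in DIRECT:
        target, _set_id, predicate = DIRECT[current]
        if same_predicate_only and previous_predicate not in (None, predicate):
            break
        path.append(target)
        previous_predicate = predicate
        current = target
    return path


if __name__ == "__main__":
    print(follow("ENSG00000186318", same_predicate_only=True))
    print(follow("P56817", same_predicate_only=True))
    print(follow("ENSG00000186318", same_predicate_only=False))
```

Under that assumption the walk from the gene stops at the UniProt URI, while the walk from the protein reaches CHEMBL4822, which matches the behaviour described above.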
Going the other way round (starting with http://linkedchemistry.info/chembl/target/tCHEMBL4822), we can find URIs from Ensembl (e.g. ENST00000313005 via http://ops2.few.vu.nl/QueryExpander/mappingSet/11), but this is the old Ensembl linkset, not the one provided by @JonathanMELIUS.
Is there something else necessary to allow Jonathan's linksets as middle sources? Is the predicate important for the transitives? Or is it a problem that we now have two linksets between ensembl and uniprot at the same time?
Predicate could be important if they are different as the system would need to work out the new predicate.
As I left it this was done by https://github.com/bridgedb/BridgeDb/blob/master/org.bridgedb.uri.sql/src/org/bridgedb/sql/predicate/LoosePredicateMaker.java
Justification is also important. Again as I left it the combiner was: https://github.com/bridgedb/BridgeDb/blob/master/org.bridgedb.uri.sql/src/org/bridgedb/sql/justification/OpsJustificationMaker.java
I also notice that not all the chembl target linksets are in the default lens!
Compare http://ops2.few.vu.nl/QueryExpander/SourceTargetInfos?sourceCode=Chembl16TargetComponent&lensUri=All with http://ops2.few.vu.nl/QueryExpander/SourceTargetInfos?sourceCode=Chembl16TargetComponent
Is this intentional? Remember that linkset presence in lens depends on the justifications. See: ops2.few.vu.nl/QueryExpander/Lens
Yes, this is intentional. The ChEMBL linkset was split up to have only the single protein mappings in the default lens. The others are complexes or target groups. We might actually need a dedicated lens for those.
LoosePredicateMaker.java is the current predicate maker.
http://ops2.few.vu.nl/QueryExpander/mappingSet/81, which has predicate http://rdf.ebi.ac.uk/terms/ensembl/INFERRED_FROM_TRANSLATION, will not be allowed in any transitive except with other linksets that have the same predicate.
This is because the IMS has not been told what predicate to use when it finds one link with http://rdf.ebi.ac.uk/terms/ensembl/INFERRED_FROM_TRANSLATION and one with, for example, http://www.w3.org/2004/02/skos/core#exactMatch.
The fix is to expand LoosePredicateMaker.java.
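As a rough illustration of the rule described here (a toy model in Python, not the actual LoosePredicateMaker.java logic): two predicates combine if they are equal, or if an explicit rule names the result; otherwise no transitive is produced. The `combine` function, the `Rules` type and the `HYPOTHETICAL_COMBINED` value are all assumptions of this sketch. In particular, this thread does not say which predicate the combination should yield, so a placeholder URI is used.

```python
from typing import Dict, FrozenSet, Optional

EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
INFERRED_FROM_TRANSLATION = (
    "http://rdf.ebi.ac.uk/terms/ensembl/INFERRED_FROM_TRANSLATION"
)
# Placeholder only: the real combined predicate is not stated in this thread.
HYPOTHETICAL_COMBINED = "http://example.org/hypotheticalCombinedPredicate"

# A rule maps an unordered pair of predicates to the predicate of the transitive.
Rules = Dict[FrozenSet[str], str]


def combine(left: str, right: str, rules: Optional[Rules] = None) -> Optional[str]:
    """Return the predicate for a transitive link, or None if it is not allowed."""
    if left == right:
        # Linksets sharing a predicate can always be chained.
        return left
    if rules is None:
        return None
    # Differing predicates need an explicit rule; the pair is order-independent.
    return rules.get(frozenset((left, right)))


if __name__ == "__main__":
    # Without a rule, mapping set 81 cannot be chained with an exactMatch linkset.
    print(combine(INFERRED_FROM_TRANSLATION, EXACT_MATCH))
    # "Expanding" the maker corresponds to adding a rule for the pair.
    extra: Rules = {
        frozenset((INFERRED_FROM_TRANSLATION, EXACT_MATCH)): HYPOTHETICAL_COMBINED
    }
    print(combine(INFERRED_FROM_TRANSLATION, EXACT_MATCH, extra))
```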
From MapURL
"primaryTopic": {
  "_about": "http://www.uniprot.org/uniprot/P56817",
  "exactMatch": [
    "http://identifiers.org/ensembl/ENSG00000265969",
    "http://identifiers.org/ensembl/ENST00000428381",
    "http://identifiers.org/ensembl/ENST00000313005",
    "http://identifiers.org/ensembl/ENST00000445823",
    "http://identifiers.org/ensembl/ENST00000513780",
    "http://identifiers.org/ensembl/ENSG00000186318"
  ]

http://identifiers.org/ensembl/ENSG00000265969 gives 404 from Target Pharm
http://identifiers.org/ensembl/ENST00000428381 works
http://identifiers.org/ensembl/ENST00000313005 works
http://identifiers.org/ensembl/ENST00000445823 works
http://identifiers.org/ensembl/ENST00000513780 works
http://identifiers.org/ensembl/ENSG00000186318 gives 404
Ensembl Gene IDs do not seem to work in Target Pharmacology although they appear to be mapped to the protein.
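The pattern in the results listed above can be checked with a small sketch. It groups the six tested identifiers by Ensembl ID type (the ENSG prefix denotes genes, ENST transcripts) and collects the ones that returned 404. The outcomes are copied from the list above; the function names are hypothetical.

```python
from typing import Dict, List

# Target Pharmacology outcome per identifier, as reported above
# (True = works, False = 404).
RESULTS: Dict[str, bool] = {
    "ENSG00000265969": False,
    "ENST00000428381": True,
    "ENST00000313005": True,
    "ENST00000445823": True,
    "ENST00000513780": True,
    "ENSG00000186318": False,
}


def ensembl_kind(identifier: str) -> str:
    """Classify an Ensembl stable ID by its prefix."""
    if identifier.startswith("ENSG"):
        return "gene"
    if identifier.startswith("ENST"):
        return "transcript"
    return "other"


def failures_by_kind(results: Dict[str, bool]) -> Dict[str, List[str]]:
    """Return, per ID kind seen in `results`, the sorted identifiers that failed."""
    failed: Dict[str, List[str]] = {}
    for identifier, ok in results.items():
        bucket = failed.setdefault(ensembl_kind(identifier), [])
        if not ok:
            bucket.append(identifier)
    return {kind: sorted(ids) for kind, ids in failed.items()}


if __name__ == "__main__":
    print(failures_by_kind(RESULTS))
```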