openphacts / GLOBAL

Global project issues [private for now. owner lee harland]

Explorer: TSV Export - Target Organism contains target uri #237

Closed danidi closed 9 years ago

danidi commented 9 years ago

In the export of compound pharmacology (tested for Aspirin), the target organism column (10th column) contains an array with the target organism and the uri of the target itself. e.g. [Homo sapiens, https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL390]. It should include the target organism only.

ianwdunlop commented 9 years ago

Yes. I think this was because at one point the API claimed there could be multiple organisms (maybe)...

ianwdunlop commented 9 years ago

The 1.4 API docs state that the hasTarget block can be a single value or an array, so what would be the best way to display multiple targets in the TSV download?

ianwdunlop commented 9 years ago

If there is more than one target organism, they will be separated by commas; otherwise the column will contain just the single organism.
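A minimal sketch of that behaviour, in Python rather than the Explorer's own code: it accepts a hasTarget value that is either a single block or an array, and emits only the organism names, comma-separated. The key names `_about` and `targetOrganismName` and the second organism are assumptions made for illustration, not taken from the 1.4 API docs.

```python
from typing import Any, List


def as_list(value: Any) -> List[Any]:
    """Wrap a single value in a list, pass lists through, map None to []."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def target_organisms(has_target: Any) -> str:
    """Return the organism names of a hasTarget value, comma-separated.

    Only the organism name is kept; the target URI is deliberately left out.
    """
    names: List[str] = []
    for target in as_list(has_target):
        # "targetOrganismName" is a hypothetical key used for this sketch.
        name = target.get("targetOrganismName") if isinstance(target, dict) else None
        if name and name not in names:  # skip missing and duplicate names
            names.append(name)
    return ", ".join(names)


# Single hasTarget block (key names are assumed).
single = {
    "_about": "https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL390",
    "targetOrganismName": "Homo sapiens",
}

# hasTarget given as an array; the second entry is made up for illustration.
several = [single, {"targetOrganismName": "Rattus norvegicus"}]

print(target_organisms(single))   # Homo sapiens
print(target_organisms(several))  # Homo sapiens, Rattus norvegicus
```

Normalising to a list first means the single-value and array cases share one code path, so the TSV cell is always a plain string and never a serialised array.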

danidi commented 9 years ago

I think you can have several target components for some targets. Does the export include both the main target and the target components? Maybe in these cases, each component is connected to an organism? Otherwise, I would find it strange to have several organisms returned for a single target.

ianwdunlop commented 9 years ago

I don't think the code is doing anything with the Target Components; it only uses the target and organism name. Could do with an example which has target components. Trying to reverse engineer the response format from the API docs is proving to be a bit tricky. Maybe we should open another issue so we don't forget this.

danidi commented 9 years ago

You could try target info and pharmacology with http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2093866. This one is a protein family entry with two components. It would probably be worthwhile to show the target type information as well.

Chris-Evelo commented 9 years ago

If the target is a single species-specific protein, then yes, you should not find multiple organisms for a single target (whether targeted by multiple organisms or not). If the target is something more general, like a protein class or even an organ, then it would make sense that you can find that same target in multiple organisms. Even if we currently only look for single, species-specific proteins, there is no need to limit it. It will just make the system more future-proof, I think.

danidi commented 9 years ago

I agree, and we even show activities against target families from ChEMBL already. But still, for the pharmacology data, would you expect several organisms for a specific activity value? There is a distinction between assay and target organism already. So could you still have several (different) target organisms? Maybe @agaulton knows if we could have such cases in ChEMBL?

ianwdunlop commented 9 years ago

Opened https://github.com/openphacts/GLOBAL/issues/239