openphacts / GLOBAL

Global project issues [private for now. owner lee harland]

Chemspider/Ops-rsc provenance #98

Open danidi opened 10 years ago

danidi commented 10 years ago

From https://openphacts2011.atlassian.net/browse/CS-95: Currently the molecular properties which are provided by Chemspider have ops-rsc as provenance. Is this actually correct?

Antony John Williams added a comment - 05/Feb/14 8:33 PM Assuming that you mean PREDICTED properties from ACD/Labs, then I believe that ChemSpider is the correct provenance, as the predicted properties only exist on ChemSpider as exposed public properties. I agree that the same tool might have been applied to the same molecule, either in an organization or even on the web, if the same version, settings, etc. were in place. I also agree that in theory we could give the provenance to ACD/Labs PhysChem batch version XX.XX, but in reality the properties as predicted don't exist in the product... they are not lookups... they are algorithms.

Daniela Digles added a comment - 05/Mar/14 5:41 PM My issue here is that previously we said the provenance for the properties was ChemSpider, and when you linked out to the ChemSpider molecule you could see the values there. Now ops-rsc is given as provenance, but if you follow the link to a given compound you don't find the original information there.

Colin Batchelor added a comment - 06/Mar/14 10:48 AM Hello Daniela,

Could you indicate in which file you saw this so we can better understand the problem?

Best wishes, Colin.

Daniela Digles added a comment - 06/Mar/14 11:15 AM For example in the API results for Compound Information (see below for the results of http://ops.rsc.org/OPS568812). I don't know from which file this information came originally.

"primaryTopic": {
    "_about": "http://ops.rsc.org/OPS568812",
    "inDataset": "http://ops.rsc.org",
    "hba": 8,
    "hbd": 1,
    "inchi": "InChI=1S/C27H31N7OS/c1-5-6-11-24-28-18(2)23(16-25(36)33(3)4)27(35)34(24)17-19-12-14-20(15-13-19)21-9-7-8-10-22(21)26-29-31-32-30-26/h7-10,12-15H,5-6,11,16-17H2,1-4H3,(H,29,30,31,32)",
    "inchikey": "AMEROGPZOLAFBN-UHFFFAOYSA-N",
    "logp": 4.518,
    "molformula": "C27H31N7OS",
    "molweight": 501.646,
    "psa": 122.46,
    "ro5_violations": 1,
    "rtb": 9,
    "smiles": "CCCCC1=NC(=C(C(=O)N1CC2=CC=C(C=C2)C3=CC=CC=C3C4=NN=NN4)CC(=S)N(C)C)C",

Colin Batchelor added a comment - 06/Mar/14 11:33 AM Hi Daniela,

That's much clearer, thank you! I'm not sure how that output from the API is actually generated, though. Is this the Amsterdam people?

Colin.

Daniela Digles added a comment - 06/Mar/14 11:40 AM Hi, yes. Maybe either [~pgroth] or [~antonis] can comment on it.

Cheers, Daniela

Paul Groth added a comment - 06/Mar/14 11:51 AM We list the data source (not the individual file) where we got the data directly in this call.

We are looking at updating the Data Sources API method to map from the graph name to its corresponding VoID descriptor.

So the path should be: inDataset -> http://ops.rsc.org -> call Data Sources -> look up VoID descriptor.

I'm changing the issue to be on us.
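
For readers following the thread, the lookup path described above (inDataset -> Data Sources -> VoID descriptor) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `DATA_SOURCES` mapping, the function names, and the descriptor URL are hypothetical placeholders and not the actual Open PHACTS API; only the `primaryTopic` / `inDataset` keys and the `http://ops.rsc.org` value come from the result quoted earlier in the thread.

```python
from typing import Optional

# Hypothetical stand-in for the Data Sources API method: maps a graph
# (dataset) name to the URL of its VoID descriptor. The descriptor URL
# is a made-up placeholder, not a real endpoint.
DATA_SOURCES = {
    "http://ops.rsc.org": "http://example.org/void/ops-rsc.ttl",
}


def dataset_of(api_result: dict) -> Optional[str]:
    """Return the inDataset value of a Compound Information result, if any."""
    return api_result.get("primaryTopic", {}).get("inDataset")


def void_descriptor_for(
    api_result: dict, data_sources: dict = DATA_SOURCES
) -> Optional[str]:
    """Follow inDataset -> Data Sources -> VoID descriptor.

    Returns None when the result has no inDataset or the dataset is unknown.
    """
    dataset = dataset_of(api_result)
    if dataset is None:
        return None
    return data_sources.get(dataset)


# Trimmed-down version of the result quoted earlier in the thread.
result = {
    "primaryTopic": {
        "_about": "http://ops.rsc.org/OPS568812",
        "inDataset": "http://ops.rsc.org",
        "logp": 4.518,
    }
}

print(void_descriptor_for(result))
```

The point of the indirection is that the API result only names the data source; the finer-grained provenance (for example, which tool produced the predicted properties) would have to be read from the descriptor that the data source maps to.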

karapetk commented 10 years ago

perhaps RSC label for this issue should be removed then? @valt