Knowledge-Graph-Hub / kg-covid-19

An instance of KG Hub to produce a knowledge graph for COVID-19 response.
https://github.com/Knowledge-Graph-Hub/kg-covid-19/wiki
BSD 3-Clause "New" or "Revised" License

ingest SARS-CoV-2 structural data table #186

Open realmarcin opened 4 years ago

realmarcin commented 4 years ago

From: https://covid19.bioreproducibility.org/

Some key pieces would be:

lpalbou commented 4 years ago

IMO, the easiest reusable dataset for that study would be the official drug-pdb mapping: https://www.rcsb.org/pdb/ligand/drugMapping.do

It details the structures of targets and targets + drug.
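As a rough illustration of how such a drug-to-structure mapping could be turned into graph rows, here is a minimal sketch. It assumes a tab-separated export with the hypothetical columns `drugbank_id`, `drug_name`, and `pdb_id`; the actual file layout, the placeholder identifiers, the `DRUGBANK:`/`PDB:` prefixes, and the choice of category and predicate are all assumptions, not something the mapping page or this project confirms.

```python
import csv
import io

# Placeholder rows, NOT real data. The column names are assumptions about
# what a drug-to-PDB mapping export might look like.
SAMPLE_TSV = (
    "drugbank_id\tdrug_name\tpdb_id\n"
    "DRUG_A\texample drug A\tSTRUCT_1\n"
    "DRUG_A\texample drug A\tSTRUCT_2\n"
    "DRUG_B\texample drug B\tSTRUCT_1\n"
)


def mapping_to_graph(handle):
    """Convert drug-to-structure rows into node and edge dictionaries."""
    nodes = {}  # keyed by node id so each drug/structure appears once
    edges = []
    for row in csv.DictReader(handle, delimiter="\t"):
        drug_id = f"DRUGBANK:{row['drugbank_id']}"
        struct_id = f"PDB:{row['pdb_id'].upper()}"
        nodes.setdefault(
            drug_id,
            {"id": drug_id, "name": row["drug_name"], "category": "biolink:Drug"},
        )
        # The category for structures is a generic placeholder (assumption).
        nodes.setdefault(
            struct_id,
            {"id": struct_id, "name": row["pdb_id"].upper(),
             "category": "biolink:NamedThing"},
        )
        # A deliberately generic predicate; a real ingest would pick a more
        # specific one.
        edges.append(
            {"subject": drug_id, "predicate": "biolink:related_to",
             "object": struct_id}
        )
    return list(nodes.values()), edges


nodes, edges = mapping_to_graph(io.StringIO(SAMPLE_TSV))
```

Keying nodes by identifier keeps a drug that maps to several structures from being emitted more than once, while every mapping row still yields its own edge.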

Most structures are only fragments and a number are not human proteins. In terms of quality, the general criteria are:

We could check whether the side chains are entirely resolved, but I don't recommend going down that route unless there is a reason.

If we want more human proteins (and longer fragments/coverage), I recommend using SWISS-MODEL (17k human proteins modeled): https://swissmodel.expasy.org/repository. I also used to work with them, so we have a few contacts there.

If you want more details on structures, I still have my code somewhere (https://academic.oup.com/nar/article/39/1/30/2409207) that details all entity types and all types of interactions in each structure.

All metadata shown on the PDB site (including the mapping to UniProt) is also available through their Data API: https://www.rcsb.org/pages/webservices. I suppose that data is also available on their FTP site, or we could contact them.

justaddcoffee commented 4 years ago

This ticket might be a duplicate of https://github.com/Knowledge-Graph-Hub/kg-covid-19/issues/188

lpalbou commented 4 years ago

Other interesting databases:

I also always use SIDER and DrugBank in conjunction with these.

justaddcoffee commented 4 years ago

TTD (Therapeutic Target Database)

FWIW we have TTD ingested already here; the others look interesting.

STITCH is possibly lower priority since we have STRING already