SWISS-MODEL / covid-19-Annotations-on-Structures

Mapping sequence data onto structures for the Covid-19 Biohackathon April 2020
https://github.com/virtual-biohackathons/covid-19-bh20/wiki/Annotations-on-Structures
MIT License

Extract annotations from literature #12

Open gtauriello opened 4 years ago

gtauriello commented 4 years ago

The idea is to go beyond the UniProt annotations that we already display. For example, for PL-PRO of SARS-CoV we found annotations in the literature (see Fig. 1 of Báez-Santos et al. 2015) that go well beyond what the UniProt annotations provide.

lnblum commented 4 years ago

Here is the collaborative Google Sheet for adding new sources and tracking progress in our search for new annotations to upload to SWISS-MODEL: https://docs.google.com/spreadsheets/d/1MORY19ZXZceyS0xJYey_k-8g-tH2ceaCmxpPyceSUHs/edit?usp=sharing

gtauriello commented 4 years ago

@all-contributors please add @lnblum for content

allcontributors[bot] commented 4 years ago

@gtauriello

I've put up a pull request to add @lnblum! :tada:

KarelBerka commented 4 years ago

Would it be interesting to gather SIFTS from PDBe-KB? https://www.ebi.ac.uk/pdbe/covid-19 e.g. from https://www.ebi.ac.uk/pdbe/pdbe-kb/covid19/P0DTD1 ?

gtauriello commented 4 years ago

@KarelBerka yes, this could make sense for interaction interfaces, mainly so we can see information from one structure mapped onto another one. It probably fits better in #8 though.

We already use SIFTS in SWISS-MODEL to map PDB structures to the UniProt sequences, but we don't extract any features/annotations from PDBe-KB (yet). We are in touch with the PDBe-KB team for other collaborations, and I also contacted them last week so we can fix the mappings for the polyproteins. The problem there is that P0DTC1 is almost fully included in P0DTD1 (except for a small peptide called nsp11), and hence any annotation for P0DTC1 should also be mapped onto P0DTD1.
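To illustrate the polyprotein point, here is a minimal sketch of how annotations on the shorter entry could be carried over to the longer one. It assumes that the two sequences share an identical N-terminal stretch with the same residue numbering, and that only annotations lying entirely inside that shared stretch are safe to copy. The `Annotation` class, the function names, and the toy sequences are hypothetical and for illustration only; this is not how SWISS-MODEL or SIFTS implement the mapping.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record with 1-based, inclusive residue range."""
    start: int
    end: int
    label: str


def shared_prefix_length(seq_a: str, seq_b: str) -> int:
    """Number of leading residues that are identical in both sequences."""
    n = 0
    for a, b in zip(seq_a, seq_b):
        if a != b:
            break
        n += 1
    return n


def transfer_annotations(
    annotations: List[Annotation], seq_short: str, seq_long: str
) -> Tuple[List[Annotation], List[Annotation]]:
    """Split annotations into those that can be copied unchanged onto
    seq_long (fully inside the shared prefix) and those that cannot."""
    limit = shared_prefix_length(seq_short, seq_long)
    transferred, skipped = [], []
    for ann in annotations:
        if 1 <= ann.start <= ann.end <= limit:
            transferred.append(ann)  # same numbering applies in seq_long
        else:
            skipped.append(ann)      # touches the region that differs
    return transferred, skipped


if __name__ == "__main__":
    # Toy sequences (NOT the real polyproteins): identical for 10 residues,
    # then the shorter one ends in a peptide absent from the longer one.
    short_toy = "MKTAYIAKQRSADAQ"
    long_toy = "MKTAYIAKQRLNRVCGVSAA"
    anns = [Annotation(2, 5, "toy site"), Annotation(9, 13, "toy peptide")]
    ok, not_ok = transfer_annotations(anns, short_toy, long_toy)
    print("transferred:", ok)
    print("skipped:", not_ok)
```

With real data, the two sequences would come from the P0DTC1 and P0DTD1 UniProt entries, and the skipped annotations would be the ones that need manual review.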

gtauriello commented 4 years ago

The interaction between the spike protein and TMPRSS just came up on my radar. It seems relevant for virus entry into the host cell (see also here) and looks like something where structural data can help, but we could use help scanning the literature for it. Maybe IntAct will have more in future releases; otherwise, these two papers might be good entry points: https://www.biorxiv.org/content/10.1101/2020.02.08.926006v3.full and https://doi.org/10.1016/j.cell.2020.02.052.