mims-harvard / TDC

Therapeutics Commons (TDC-2): Multimodal Foundation for Therapeutic Science
https://tdcommons.ai
MIT License

New Dataset: String - PPI #115

Open kexinhuang12345 opened 3 years ago

kexinhuang12345 commented 3 years ago

Describe the problem

The STRING database contains a large number of known protein-protein interactions (PPIs). We currently have HuRI, but would love to add STRING to our PPI task as well. This would be a relatively simple task for a new user to get acquainted with TDC. We would need the protein ID and the protein sequence for each protein, which may require some mapping.

The dataset can be found at https://string-db.org/.
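A minimal sketch of the ID-to-sequence mapping mentioned above, using only the standard library. It assumes a STRING-style links file (space-separated, with a `protein1 protein2 combined_score` header) and a FASTA file keyed by the same protein IDs. The function names, the `min_score` cutoff of 700, and the output column names (`Protein1_ID`, `Protein1`, `Protein2_ID`, `Protein2`, `Y`) are illustrative assumptions, not confirmed TDC or STRING conventions. The inline sample data is made up for the demonstration.

```python
import csv
import io


def parse_fasta(handle):
    """Return {protein_id: sequence} from a FASTA text stream."""
    sequences, current_id, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the previous record before starting a new one.
            if current_id is not None:
                sequences[current_id] = "".join(chunks)
            current_id = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if current_id is not None:
        sequences[current_id] = "".join(chunks)
    return sequences


def build_ppi_rows(links_handle, fasta_handle, min_score=700):
    """Join interaction pairs with sequences; drop pairs lacking a sequence."""
    seqs = parse_fasta(fasta_handle)
    reader = csv.DictReader(links_handle, delimiter=" ")
    rows = []
    for rec in reader:
        # Hypothetical confidence cutoff; adjust or remove as needed.
        if int(rec["combined_score"]) < min_score:
            continue
        p1, p2 = rec["protein1"], rec["protein2"]
        if p1 not in seqs or p2 not in seqs:
            continue
        rows.append({
            "Protein1_ID": p1, "Protein1": seqs[p1],
            "Protein2_ID": p2, "Protein2": seqs[p2],
            "Y": 1,
        })
    return rows


# Made-up sample data in the assumed file layouts.
LINKS = (
    "protein1 protein2 combined_score\n"
    "9606.ENSP0000A 9606.ENSP0000B 900\n"
    "9606.ENSP0000A 9606.ENSP0000C 150\n"
    "9606.ENSP0000B 9606.ENSP0000D 950\n"
)
FASTA = ">9606.ENSP0000A\nMKT\nAYI\n>9606.ENSP0000B\nMSDE\n>9606.ENSP0000C\nMAAA\n"

rows = build_ppi_rows(io.StringIO(LINKS), io.StringIO(FASTA))
for row in rows:
    print(row)
```

With real downloads, the same functions could be given handles from `gzip.open(path, "rt")` instead of `io.StringIO`. Pairs whose IDs have no sequence are skipped rather than filled in, so any gaps in the mapping show up as a lower row count.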

Describe the solution you'd like

from tdc.multi_pred import PPI

data = PPI(name = 'String', path = './data')

Additional context

N/A

abearab commented 1 month ago

Related Data Suggestion:

"Computing the Human Interactome" – https://www.biorxiv.org/content/10.1101/2024.10.01.615885v1.full

Download data: http://prodata.swmed.edu/humanPPI/bulk_download

cc @kexinhuang12345