RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

request to ETL EBI STRING into KG2 #31

Open saramsey opened 4 years ago

saramsey commented 4 years ago

From Will Byrd, Team Unsecret Agent:

Hi Steve!

I'm getting lots of requests for STRING from biologists. Do you think it would be possible to bring in STRING? I get the feeling this would be extremely useful. Unless there is another KP that already provides STRING.

Thanks!

--Will

saramsey commented 4 years ago

STRING has a download page: https://string-db.org/cgi/download.pl?sessionId=Ueyi1zDD8Zf0

ecwood commented 3 years ago

Based on the schema (https://stringdb-static.org/download/database.schema.v11.0.pdf), it appears that STRING has lots of edge publications and data scores, so this is likely worth ETLing. @saramsey Would you like me to reach out to UAB and ask which data they are most interested in? Also, there are species-specific dumps, which would make the downloads quite a bit smaller if we only care about human data.

saramsey commented 3 years ago

Yes, please do, and if you could report back in the issue here, that would be great. Thank you for investigating this.

ecwood commented 3 years ago

Here was the response from Michael Patton:

Hi there -- definitely the human stringDB as priority! But there are obvious benefits to the mouse, rat, zebrafish, pig, and worm proteomic/stringDB data sources, as they are model systems in many scientific experiments.
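
For reference, here is a rough sketch of how the species priorities above could drive filtering during an ETL. It is not the KG2 implementation. It assumes a STRING "protein links"-style file: space-delimited, with a header row of `protein1 protein2 combined_score`, and protein identifiers prefixed by the NCBI Taxonomy ID. The names `PRIORITY_TAXA` and `parse_links` are hypothetical, the sample rows are illustrative rather than real STRING data, and "worm" is assumed to mean C. elegans.

```python
import csv
import io

# NCBI Taxonomy IDs for the species named in the response above.
# Assumption: "worm" refers to C. elegans.
PRIORITY_TAXA = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "zebrafish": 7955,
    "pig": 9823,
    "worm": 6239,
}


def parse_links(handle, min_score=0):
    """Yield (taxon_id, protein1, protein2, combined_score) tuples.

    Assumes a space-delimited file with a header row containing the
    columns protein1, protein2, and combined_score, and identifiers of
    the form "<taxon_id>.<protein_id>".
    """
    reader = csv.DictReader(handle, delimiter=" ")
    for row in reader:
        score = int(row["combined_score"])
        if score < min_score:
            continue  # drop low-confidence edges
        taxon_id = int(row["protein1"].split(".", 1)[0])
        yield taxon_id, row["protein1"], row["protein2"], score


# Illustrative rows only; not taken from an actual STRING download.
SAMPLE = """protein1 protein2 combined_score
9606.ENSP00000000233 9606.ENSP00000272298 490
9606.ENSP00000000233 9606.ENSP00000253401 198
10090.ENSMUSP00000000001 10090.ENSMUSP00000000153 900
"""

edges = list(parse_links(io.StringIO(SAMPLE), min_score=400))
human_edges = [e for e in edges if e[0] == PRIORITY_TAXA["human"]]
```

With a real species-specific dump, the same function could be given a handle from `gzip.open(path, "rt")`, and the taxon filter would mostly serve as a sanity check that the file contains the expected species.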