rdocking / fusebench

A workbench for aggregation and interpretation of RNA-Seq gene fusions

Document annotation source: COSMIC #29

rdocking closed 7 years ago

rdocking commented 7 years ago

Please answer the following questions about the COSMIC data source:

In general, what kinds of information are in this data source? The answers should be uploaded to the wiki in an appropriate spot.

wilcas commented 7 years ago

I'll try to answer at least one of these questions in around 10 minutes, but anyone should feel free to take it over when I have to step out for my test.

wilcas commented 7 years ago

Online Location

COSMIC Cell Lines Database

API

It looks like you can submit HTTP GET requests to COSMIC of the form:

http://cancer.sanger.ac.uk/api/ga4gh/beacon?format=json&ref=?&dataset=cosmic&allele=?&pos=?&chrom=?

You specify a response format, the reference genome by build number (for example, GRCh38 would be specified with ref=38), the dataset (i.e. dataset=cosmic), the allele, the position, and the chromosome. Essentially, just replace the question marks above. More information about querying the COSMIC database can be found at their BEACON Project page.

Given that you can request the output as JSON, this should be pretty straightforward to work with in Python using the requests and json modules. Note that json is in the standard library, while requests is a third-party package that has to be installed separately.
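Here is a rough sketch of what building one of these queries could look like. It only uses standard-library modules (urllib and json), so there is nothing to install. The function names build_beacon_url and query_beacon are made up for illustration, the example chromosome, position, and allele are placeholders rather than a real variant, and I haven't confirmed that the endpoint still responds in this form, so treat it as a starting point rather than tested client code.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URL pattern quoted above.
BEACON_URL = "http://cancer.sanger.ac.uk/api/ga4gh/beacon"


def build_beacon_url(chrom, pos, allele, ref=38, dataset="cosmic", fmt="json"):
    """Fill in the question marks of the query pattern (hypothetical helper)."""
    params = {
        "format": fmt,
        "ref": ref,          # genome build number, e.g. 38 for GRCh38
        "dataset": dataset,
        "allele": allele,
        "pos": pos,
        "chrom": chrom,
    }
    return BEACON_URL + "?" + urlencode(params)


def query_beacon(chrom, pos, allele, **kwargs):
    """Send the GET request and parse the JSON body (hypothetical helper).

    Assumes the service answers with a JSON document; the shape of that
    document is not described here, so it is returned unmodified.
    """
    url = build_beacon_url(chrom, pos, allele, **kwargs)
    with urlopen(url, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values, only to show the resulting URL.
    print(build_beacon_url(chrom="1", pos=12345, allele="A"))
```

Keeping the URL construction separate from the network call means the query string can be checked without hitting the COSMIC server.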

Bulk Download

See Complete Fusion Export

rdocking commented 7 years ago

I've summarized these comments on the 'Data Sets' page of the wiki - thanks @wilcas !