UCLouvain-CBIO / depmap

Cancer Dependency Map package
https://uclouvain-cbio.github.io/depmap/

Consider adding "essential" gene designations data #75

Closed: j-andrews7 closed this issue 2 years ago

j-andrews7 commented 2 years ago

The CRISPR_common_essentials.csv list is very helpful for CRISPR dependency screens to remove pan-lethal hits. Inclusion of this list in future releases would be very welcome and a relatively simple addition.
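As a rough illustration of the filtering step described above, here is a minimal base-R sketch. The data frame, its column names, the two placeholder gene symbols, and the helper `remove_common_essentials()` are all hypothetical; the real CRISPR_common_essentials.csv may format its gene identifiers differently and would need to be read in and matched accordingly.

```r
# Hypothetical screen results: one row per gene with a dependency score.
# Column names and values are made up for illustration only.
screen_hits <- data.frame(
  gene = c("RPL3", "KRAS", "POLR2A", "TP53"),
  score = c(-2.1, -1.4, -2.5, 0.3),
  stringsAsFactors = FALSE
)

# Placeholder vector standing in for the gene symbols that would be read
# from CRISPR_common_essentials.csv (assumed contents, not the real list).
common_essentials <- c("RPL3", "POLR2A")

# Hypothetical helper: drop rows whose gene appears in the essentials list,
# so that only context-specific hits remain.
remove_common_essentials <- function(hits, essentials, gene_col = "gene") {
  hits[!hits[[gene_col]] %in% essentials, , drop = FALSE]
}

filtered_hits <- remove_common_essentials(screen_hits, common_essentials)
print(filtered_hits)
```

With the toy data above, only the rows for KRAS and TP53 are kept.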

lgatto commented 2 years ago

Thank you for the suggestion!

j-andrews7 commented 2 years ago

A closer look revealed that the essentiality data shown on the site is not actually available from the download page. It is only available through the hidden API (https://depmap.org/portal/api/download/gene_dep_summary).

This is pretty frustrating, as I have not been able to determine how to pull it down for specific releases, only for the most recent one. I don't necessarily expect the package to pull/deal with this, but it's useful info to have listed somewhere.

tfkillian commented 2 years ago

Thank you for the suggestion, @j-andrews7. Do you think we should add this dataset as its own file, or append the data it contains to each omics dataset (crispr, tpm, etc.)?

j-andrews7 commented 2 years ago

I feel that just providing it on its own and letting end-users determine how to filter the other files as necessary is likely sufficient. The API link above only pulls data for the latest release; more details can be found in this depmap forum thread.

tfkillian commented 2 years ago

A new essential gene dataset has been made available. It has the designation gene_summary_22Q1 and the EH number EH7529. It can also be downloaded using its own accessor function, depmap_gene_summary().
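A minimal sketch of the two access routes mentioned here, assuming the Bioconductor packages ExperimentHub and depmap are installed and the hub is reachable over the network. The wrapper `load_gene_summary()` and its `via` argument are hypothetical names used only for illustration; the accessor and the EH number are the ones given in this comment.

```r
# Hypothetical wrapper around the two ways of loading gene_summary_22Q1.
# Assumes ExperimentHub and depmap are installed and the hub is online.
load_gene_summary <- function(via = c("accessor", "hub_id")) {
  via <- match.arg(via)
  if (via == "accessor") {
    # Accessor function named in this thread.
    depmap::depmap_gene_summary()
  } else {
    # Direct lookup by the EH number named in this thread.
    eh <- ExperimentHub::ExperimentHub()
    eh[["EH7529"]]
  }
}

# Only attempt the download in an interactive session with depmap present,
# since the call needs network access and a writable hub cache.
if (interactive() && requireNamespace("depmap", quietly = TRUE)) {
  gene_summary <- load_gene_summary("accessor")
  print(head(gene_summary))
}
```

Calling `load_gene_summary("hub_id")` would take the ExperimentHub route instead; both are expected to return the same dataset.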

j-andrews7 commented 2 years ago

This function has the same issue as #79 did.

3: download failed
  hub path: ‘https://experimenthub.bioconductor.org/fetch/7579’
  cache resource: ‘EH7529 : 7579’

tfkillian commented 2 years ago

@j-andrews7 I will sort this out with the EH team.

tfkillian commented 2 years ago

@j-andrews7 I am able to access gene_summary_22Q1 successfully, both by selecting the dataset by its EH number (7529) and by using the depmap_gene_summary() function. What versions of R and Bioconductor are you using?

j-andrews7 commented 2 years ago

Tried again and it worked. 🤷‍♂️

Thanks for taking a look.