Hi,
Interval files are public and can be found at gs://ccleparams/references/PureCN_intervals. The reference genome we use is hg38.
Best, Simone
@5im1z Hi Simone, sorry, but I still cannot find this link. Could you please post a functional link for these reference files?
Best, Mingkee
Hi Mingkee,
If you are trying to pull it in your browser, this is the link: https://console.cloud.google.com/storage/browser/ccleparams/references/PureCN_intervals;tab=objects?prefix=&forceOnObjectsSortingFiltering=false. You might have to log in with your Google credentials. Let me know if it doesn't work!
Simone
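For anyone else landing here: the gs:// path from the first reply and the browser link above point at the same bucket folder. A minimal Python sketch of that mapping, assuming the console URL pattern visible in the link above holds for other folders too (the helper name gs_to_console_url is hypothetical, not part of any Google library):

```python
from urllib.parse import quote

# Base of the Cloud Console storage browser, taken from the link in this thread.
CONSOLE_BASE = "https://console.cloud.google.com/storage/browser"


def gs_to_console_url(gs_uri: str) -> str:
    """Turn gs://bucket/prefix into the console browser URL for that folder."""
    if not gs_uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket, _, prefix = gs_uri[len("gs://"):].partition("/")
    if not bucket:
        raise ValueError("missing bucket name")
    prefix = prefix.strip("/")
    url = f"{CONSOLE_BASE}/{bucket}"
    if prefix:
        # quote() keeps "/" intact and escapes characters that are unsafe in a URL path.
        url += "/" + quote(prefix)
    return url


print(gs_to_console_url("gs://ccleparams/references/PureCN_intervals"))
```

The console may still ask for a Google login, as Simone notes.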
Hi Simone,
It works; thanks for the link!
Best, Mingkee
Hi Simone,
Is the reference /Data/VCFs/Liftover/hg38.fa also publicly available? I would like to get this .fa file as well, and I can't seem to find it in the Google buckets. I tried using our own hg38 reference along with the ccleparams files such as agilent_hg38_lifted_chrXY.no_header.bed and agilent_hg38_intervals.txt, but I get errors saying the files cannot be parsed.
Thanks. Mingkee
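The thread does not show the actual error message, so the cause is not known. One quick check when pairing an interval file with a different reference is whether the contig names in the BED file also appear in the reference's .fai index. The sketch below is an assumption about what might be worth checking, not a diagnosis; the function names and the toy data are hypothetical.

```python
def bed_contigs(lines):
    """Collect contig names (first column) from BED lines, skipping headers."""
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def fai_contigs(lines):
    """Collect contig names (first column) from a FASTA .fai index."""
    return {line.split("\t")[0] for line in lines if line.strip()}


def missing_contigs(bed_lines, fai_lines):
    """Return BED contig names that the reference index does not contain."""
    return sorted(bed_contigs(bed_lines) - fai_contigs(fai_lines))


# Toy in-memory data (made up for illustration): the BED uses "chr"-prefixed
# names while the index does not, so every BED contig is reported as missing.
toy_bed = ["chr1\t100\t200", "chrX\t5\t50"]
toy_fai = ["1\t1000\t3\t60\t61", "X\t500\t1020\t60\t61"]
print(missing_contigs(toy_bed, toy_fai))
```

With real files, the same functions can be fed open file handles, for example `missing_contigs(open(bed_path), open(fai_path))`, where both paths are whatever you have locally.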
Hi Mingkee,
We use gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta as our hg38 reference FASTA. If you need to pull it from the browser, you can access it through this folder, which is hosted by GATK.
Thanks, Simone
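If a browser or command-line client is not convenient, objects in a publicly readable bucket can generally also be fetched over plain HTTPS at storage.googleapis.com. A minimal sketch of building that URL from the gs:// path above, assuming the object allows anonymous reads (the helper name gs_to_https_url is hypothetical):

```python
from urllib.parse import quote


def gs_to_https_url(gs_uri: str) -> str:
    """Turn gs://bucket/object into its storage.googleapis.com HTTPS URL."""
    if not gs_uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket, _, obj = gs_uri[len("gs://"):].partition("/")
    if not bucket or not obj:
        raise ValueError("expected gs://<bucket>/<object>")
    # quote() keeps "/" intact and escapes characters that are unsafe in a URL path.
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"


print(gs_to_https_url(
    "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta"
))
```

The resulting URL can be handed to any HTTP download tool. The reference FASTA is a multi-gigabyte file, so expect the download to take a while.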
Hi Simone,
I read in one of the announcements for mutation pipeline updates, https://forum.depmap.org/t/announcing-the-22q4-release/2125, that bait sets are no longer needed to run Mutect2. In that case, is it all right to run Mutect2 on CCLE exome data without a bait set? If not, where can I find the files agilent_hg38_lifted_chrXY.no_header.bed and agilent_hg38_lifted_chrXX.no_header.bed? At the moment, I can only see the files for Agilent hg19 and ICE hg19.
Thanks for all of your responses!
Best, Mingkee
Hi Mingkee,
It is correct that we no longer use interval sets for exome mutation calls. If you need the interval files, they are stored in our public bucket gs://ccleparams, where we share all of our reference files. Browser-friendly link here: https://console.cloud.google.com/storage/browser/ccleparams/references/intervals. And in case you are not aware, if you are interested in Mutect2 calls for CCLE lines, they (among other things) can be found in our public workspace.
Simone
Hi,
I would like to get the reference and bait files used for exome data in the PureCN pipeline described here: https://github.com/broadinstitute/depmap_omics/blob/8e1a8b553b65b2f40ed3a8396f1a4c4275932e07/WGS_pipeline/PureCN_pipeline/README.md?plain=1#L11
How can I get access to this data? Or is it publicly available somewhere? Thank you