broadinstitute / depmap_omics

What you need to process the Quarterly DepMap-Omics releases from Terra
https://depmap.org/portal/

What is the pseudo normal used to call mutations in cell lines? #24

Closed ytakemon closed 3 years ago

ytakemon commented 3 years ago

Hello,

I'm trying to process WGS data from a cell line in a way similar to your pipeline at DepMap. I noticed that your workflow doc has a note under the mutations slide saying that "this pipeline requires a matched normal, so we use a pseudo normal for all cell lines samples". Could you explain what this pseudo normal is, and would it be possible for you to share this data with me?

Thank you

javadnoorb commented 3 years ago

Hi,

For our Agilent WES samples we use: gs://firecloud-tcga-open-access/tutorial/bams/C835.HCC1143_BL.4.bam I think this is publicly available.

For our ICE WES samples we use a germline blood sample from the CCLF project: gs://fc-38a1a377-72c6-4e90-917f-e4bb709b8f2c/CCLF_RCRF1009-Normal-SM-F3R8L/seq_data_v2/CCLF_RCRF1009GL.bam I'm not sure if this is publicly available, but give it a shot and let me know.

For WGS we use this GTEx sample: GTEX-111FC-0001-SM-6WBTJ I think you'd need dbGaP access for this one and get it through the GTEx project.
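For anyone following along, the three pseudo normals listed above can be kept in a small lookup table. This is only a sketch: the assay keys and function names are hypothetical labels made up for illustration, and the only external command it refers to is `gsutil ls -l`, which lists an object if your account can read it. The GTEx entry is a sample ID rather than a bucket path, so no command is produced for it.

```python
# Pseudo normals named in this thread, keyed by assay type.
# The keys ("agilent_wes", "ice_wes", "wgs") are hypothetical labels for this sketch.
PSEUDO_NORMALS = {
    "agilent_wes": "gs://firecloud-tcga-open-access/tutorial/bams/C835.HCC1143_BL.4.bam",
    "ice_wes": (
        "gs://fc-38a1a377-72c6-4e90-917f-e4bb709b8f2c/"
        "CCLF_RCRF1009-Normal-SM-F3R8L/seq_data_v2/CCLF_RCRF1009GL.bam"
    ),
    # GTEx sample ID, not a bucket path; access goes through dbGaP / GTEx.
    "wgs": "GTEX-111FC-0001-SM-6WBTJ",
}


def pseudo_normal_for(assay):
    """Return the pseudo normal (bucket path or sample ID) for an assay label."""
    return PSEUDO_NORMALS[assay]


def access_check_command(assay):
    """Return a gsutil command that lists the object, or None if the
    entry is not a gs:// URL and so cannot be checked this way."""
    location = pseudo_normal_for(assay)
    if not location.startswith("gs://"):
        return None
    return ["gsutil", "ls", "-l", location]


if __name__ == "__main__":
    for assay in PSEUDO_NORMALS:
        print(assay, access_check_command(assay))
```

Running the printed `gsutil` command by hand is a quick way to find out whether a given BAM is readable from your account before trying to copy it.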

@jkobject feel free to chime in if I'm missing some information

ytakemon commented 3 years ago

Thanks for the quick response!

It looks like I am not able to access the CCLF data, but that's okay since I'm only interested in the WGS data at the moment. While trying to figure it out on my own I actually duplicated my question on the DepMap community forum https://forum.depmap.org/t/panel-of-normal-data-for-variant-calling/576. Sorry about this.

Now, having re-read the Ghandi et al (2019) paper, I see that a pseudo normal (i.e. a panel of normals, PoN) was created from 8,000 TCGA normal samples. Is that the same as the GTEx GTEX-111FC-0001-SM-6WBTJ you are referring to?

javadnoorb commented 3 years ago

Yes. I would recommend posting in the forum for future questions.

The pseudonormal is a single sample that we use as a 'normal', because our cell lines don't have a paired normal but our mutation caller needs one.

The panel of normals (PoN) is what you're referring to in Ghandi et al. That's made from a larger set of normals. These require dbGaP access, so unfortunately we wouldn't be able to share them.
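To make the distinction concrete, here is a rough sketch of where each input would go in a tumor/normal call. It assumes a GATK4 Mutect2-style command line, which this thread does not confirm is the caller used in the DepMap pipeline, and every file name and sample name in the usage example is a hypothetical placeholder. The pseudo normal is passed as a second BAM and named as the normal sample, while the PoN is a separate VCF.

```python
def mutect2_command(reference, tumor_bam, pseudo_normal_bam,
                    pseudo_normal_sample, output_vcf, panel_of_normals=None):
    """Build (but do not run) a GATK4 Mutect2 argument list.

    Assumption: a Mutect2-style caller; the actual DepMap caller may differ.
    """
    cmd = [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,                  # the cell line
        "-I", pseudo_normal_bam,          # single sample standing in for a matched normal
        "-normal", pseudo_normal_sample,  # sample name inside the pseudo normal BAM
    ]
    if panel_of_normals is not None:
        # The PoN is a VCF built from many normals, not a BAM.
        cmd += ["--panel-of-normals", panel_of_normals]
    cmd += ["-O", output_vcf]
    return cmd


if __name__ == "__main__":
    # All paths and names below are placeholders for illustration.
    print(" ".join(mutect2_command(
        reference="ref.fasta",
        tumor_bam="cell_line.bam",
        pseudo_normal_bam="pseudo_normal.bam",
        pseudo_normal_sample="PSEUDO_NORMAL",
        output_vcf="cell_line.vcf.gz",
        panel_of_normals="pon.vcf.gz",
    )))
```

The PoN argument is optional in this sketch because the two inputs are independent: the pseudo normal satisfies the caller's need for a normal sample, and the PoN is an additional filter.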


ytakemon commented 3 years ago

I see. Thank you for clarifying what the pseudo-normal is and where the PoNs originated. For the PoN, are you able to share the sample IDs and the criteria used to select them (besides that they were normal tissue)? I will see if I can request them from the GTEx biobank.

javadnoorb commented 3 years ago

closing this as it can be followed here.