broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

shotgun sequencing Data compatibility #15

Open Pawar2018 opened 6 years ago

Pawar2018 commented 6 years ago

Hi, We have Ion Torrent data from low-pass shotgun sequencing. Is shotgun sequencing data compatible with ichorCNA for estimating tumor fraction and identifying copy number alterations? If yes, can you please guide me on how to create a baseline, and how many normal (healthy) samples are required for it?

Regards Krunal

gavinha commented 6 years ago

Hi Krunal,

You should be able to use ichorCNA for your data. You do not need matched normals to analyze your samples. If you wish to use a panel of normals to help a bit with normalization, you can refer to the wiki: https://github.com/broadinstitute/ichorCNA/wiki/Create-Panel-of-Normals
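As a rough illustration of what a panel of normals contributes to normalization, the sketch below reduces several normal samples to a per-bin median and subtracts it from a test sample's log ratios. This shows the general idea only and is not taken from ichorCNA's code; the function names and the plain-list input format are assumptions made for the example.

```python
from statistics import median


def panel_median(normal_log_ratios):
    """Return the per-bin median log ratio across normal samples.

    normal_log_ratios: list of lists, one inner list per normal sample,
    all covering the same bins in the same order (assumed input format).
    """
    if not normal_log_ratios:
        raise ValueError("at least one normal sample is required")
    n_bins = len(normal_log_ratios[0])
    if any(len(sample) != n_bins for sample in normal_log_ratios):
        raise ValueError("all normal samples must cover the same bins")
    return [
        median(sample[i] for sample in normal_log_ratios)
        for i in range(n_bins)
    ]


def normalize_with_panel(sample_log_ratios, panel):
    """Subtract the panel median from a sample's log ratios, bin by bin."""
    if len(sample_log_ratios) != len(panel):
        raise ValueError("sample and panel must cover the same bins")
    return [x - p for x, p in zip(sample_log_ratios, panel)]
```

With a median, a single unusual normal sample has limited influence on any bin, which is one reason adding more normals tends to give diminishing returns.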

Best, Gavin

Pawar2018 commented 6 years ago

Dear Gavin, Thanks for the reply. I followed the wiki to generate the PoN and WIG files. We have the following queries regarding the output files:

  1. Are 15 normal samples sufficient, or do we need to increase the normal sample count?
  2. The output file (.cna.seg) contains multiple CNAs. How can we filter them to keep only true positives?

Regards Krunal

gavinha commented 6 years ago

Hi Krunal,

  1. 15 normal samples is a good start. The way that the PoN is used isn't sophisticated at the moment. More samples could help but it may not make a dramatic difference.

  2. Are you wondering whether the predicted CNAs are false positives and how you should filter them? It's a little difficult to answer this question because I am not sure which calls are FPs in your data. Do you have an example (e.g. plots)?
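For readers who want a starting point before deciding what counts as a false positive, here is a minimal post-filtering sketch. Nothing in this thread recommends these criteria: the idea of keeping only long segments with a clear log-ratio shift is a generic heuristic, the default thresholds are placeholders, and the dictionary keys (`chr`, `start`, `end`, `log_ratio`) are hypothetical rather than the actual `.cna.seg` column names.

```python
def filter_segments(segments, min_length_bp=5_000_000, min_abs_log_ratio=0.1):
    """Keep segments that are long enough and shifted far enough from zero.

    segments: iterable of dicts with hypothetical keys
    'chr', 'start', 'end' (1-based, inclusive) and 'log_ratio'.
    Both thresholds are placeholder values, not recommendations.
    """
    kept = []
    for seg in segments:
        length = seg["end"] - seg["start"] + 1
        if length >= min_length_bp and abs(seg["log_ratio"]) >= min_abs_log_ratio:
            kept.append(seg)
    return kept
```

Any thresholds would need to be tuned against samples with known status, since short or low-amplitude events can also be real.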

Pawar2018 commented 6 years ago

Hi Gavin, Thanks. Is there any way to check whether our baseline (healthy samples) data is properly normalized? Is there a graph or an additional output file that shows the normalization status?

Please check the link below, as requested, regarding the CNA FPs: Sample_File_IchorCNA.pdf

Regards Krunal

Pawar2018 commented 6 years ago

Hi Gavin,

Is there any way to check whether our baseline (healthy samples) data is properly normalized? Is there a graph or an additional output file that shows the normalization status?

Please check the attached PDF, as requested, regarding the CNA FPs.

Waiting for your reply.

Regards Krunal



-- Regards Krunal Pawar Bioinformatics Department Datar Genetics Limited

gavinha commented 6 years ago

Hi Krunal,

Sorry for the delay.

Looking at the profile in the PDF that you attached, there are clearly copy number alterations. Two things I can bring up: the sample may be female, because chromosome X is at the same level as the other chromosomes, and it may have undergone genome doubling (the ploidy should be higher). In this case, I'm not sure that the technical data normalization is the problem. I think it may have to do with the actual signals in your sample. Is this sample definitely from a healthy donor or a matched normal sample?

Regarding your question about proper normalization: the best way to check is to run ichorCNA on more samples from your data that you expect to be healthy/normal.
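One way to organize such a check is to collect the tumor fraction that ichorCNA estimates for each healthy sample and see how many fall above a cutoff you choose. The sketch below assumes those estimates have already been gathered into a dictionary by hand or by your own parsing; the function name, the input format, and the cutoff are all assumptions for illustration.

```python
from statistics import median


def summarize_healthy_runs(tumor_fractions, cutoff):
    """Summarize estimated tumor fractions from presumed-healthy samples.

    tumor_fractions: dict mapping sample name -> estimated tumor fraction
    (assumed to have been collected from the ichorCNA results beforehand).
    cutoff: user-chosen fraction above which a healthy sample is flagged.
    """
    if not tumor_fractions:
        raise ValueError("no samples provided")
    values = list(tumor_fractions.values())
    flagged = sorted(
        name for name, tf in tumor_fractions.items() if tf > cutoff
    )
    return {
        "n": len(values),
        "median": median(values),
        "max": max(values),
        "flagged": flagged,
    }
```

If many healthy samples are flagged, normalization or sample quality deserves a closer look. If only one is flagged, the sample itself may be the explanation.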

Hope this helps, Gavin