broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

Confusion about light green or dark green lines/dots #43

Closed peneder closed 5 years ago

peneder commented 5 years ago

Hi! In some of the plots that I generated using ichorCNA, some regions are marked as "3 copies" (brown dots) or "1 copy" (dark (!) green dots) and have a bright green line. I assume this means that the subclonal fraction of the tumor is predicted to have 3 or 1 copies of this segment, respectively, whereas the rest of the tumor has 2 copies.

My confusion comes from the following: sometimes I have observed regions that have bright green dots with a bright green line. What does that mean? Sometimes the different sub-solutions show the same segment either with dark green dots and a bright green line, or with bright green dots and a bright green line. Similarly, sometimes they show a segment either with dark green dots and a dark green line, or with bright green dots and a bright green line. In some cases these differences seem to heavily influence the predicted tumor fraction.

Could you please explain why that is, and maybe add the explanation to the wiki? Thank you!

[Three plot images attached]

gavinha commented 5 years ago

Hi @peneder

Thank you for your interest in ichorCNA.

The lines represent segment medians. If a line is the same color as the dots, then the segment is predicted to be clonal. If the line is light green, then the segment is predicted to be subclonal. This is made a bit clearer in the wiki (https://github.com/broadinstitute/ichorCNA/wiki/Output).

There are 2 potential issues with the results you have attached.

  1. If you are using large bin sizes, I would recommend that you use the argument --includeHOMD False. The reason is that large homozygous deletions are typically unlikely (unless you have reason to believe otherwise), and excluding them will help with selecting the appropriate ploidy solution.
  2. This should be a male sample, as can be seen for chrX. For chrX, ichorCNA should re-normalize such that a log ratio of zero corresponds to no change in copy number (i.e. it has a baseline of 1 copy). chrX not being re-normalized here is not the expected behavior of ichorCNA. There could be several reasons:
     a. chrY is not included in the read counts (wig) file, so ichorCNA cannot determine sex/gender.
     b. The --fracReadsInChrYForMale threshold is not set appropriately for your data. If the fraction of coverage in chrY is higher than this value, then the sample will be called "male". You can look at the params.txt file to check this fraction for each sample, then set this value such that your samples meet the threshold.
     c. You are using a matched normal sample that is female.
     d. The sex/gender of the samples in the panel of normals is also called incorrectly. You should take a look at the chrX values in the PoN to make sure that they have a log ratio close to 0.
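
To make points (a) and (b) concrete, here is a minimal Python sketch of the kind of check described: sum the read counts per chromosome from a fixedStep wig file and compare the chrY fraction with a threshold. It is an illustration only, not ichorCNA's own code. The function names, the example counts, and the threshold value are hypothetical, and ichorCNA may compute the fraction differently (for example after filtering bins), so the value reported in params.txt remains the one to trust.

```python
from collections import defaultdict

# Hypothetical example in fixedStep wig layout (one count per bin).
EXAMPLE_WIG = """\
fixedStep chrom=chr1 start=1 step=500000 span=500000
600
390
fixedStep chrom=chrX start=1 step=500000 span=500000
9
fixedStep chrom=chrY start=1 step=500000 span=500000
1
"""

def reads_per_chrom(wig_lines):
    """Sum the per-bin counts for each chromosome in a fixedStep wig."""
    counts = defaultdict(float)
    chrom = None
    for raw in wig_lines:
        line = raw.strip()
        if not line or line.startswith("track"):
            continue
        if line.startswith("fixedStep"):
            # Declaration line: key=value fields after the keyword.
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
        elif chrom is not None:
            counts[chrom] += float(line)
    return dict(counts)

def chr_y_fraction(counts, y_names=("chrY", "Y")):
    """Fraction of all counted reads that fall on chrY (0.0 if chrY is absent)."""
    total = sum(counts.values())
    y_reads = sum(counts.get(name, 0.0) for name in y_names)
    return y_reads / total if total else 0.0

def call_sex(fraction, threshold):
    """Call 'male' when the chrY fraction is above the threshold."""
    return "male" if fraction > threshold else "female"

counts = reads_per_chrom(EXAMPLE_WIG.splitlines())
fraction = chr_y_fraction(counts)
# 0.0005 is a made-up threshold for this example, not a recommended value.
print(fraction, call_sex(fraction, threshold=0.0005))
```

If chrY is missing from the wig file, the fraction is 0 and the sample can never be called male, which is case (a). If the fraction is non-zero but below the threshold, that is case (b).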
peneder commented 5 years ago

Hi! Thanks for the answer and sorry for the long silence on my side. It would be great if you could help me with these two further questions.

  1. You said that light green lines represent subclonality. But what do light green dots mean? They are not mentioned in the wiki. Is it possible that they represent homozygous deletions?

  2. I cannot figure out why the X chromosome is not placed at a log ratio of zero. The read counts contain the X and Y chromosomes. Interestingly, when I check the params.txt files for the male samples, they are all recognized as male. I am using the panel of normals that comes with ichorCNA (/inst/extdata/HD_ULP_PoN_hg38_500kb_median_normAutosome_median.rds). Is this panel not compatible with male samples? I also tried using no normalization at all, but that did not give adequate results, and neither did using a single negative control sample. Thanks for your help!

gavinha commented 5 years ago

Hi @peneder

  1. "You said that light green lines represent subclonality. But what do light green dots mean? They are not mentioned in the wiki. Is it possible that they represent homozygous deletions?"
     Yes, you are correct. Bright green dots = homozygous deletions.

  2. "I cannot figure out why the X chromosome is not placed at a log ratio of zero. The read counts contain the X and Y chromosomes. Interestingly, when I check the params.txt files for the male samples, they are all recognized as male. I am using the panel of normals that comes with ichorCNA (/inst/extdata/HD_ULP_PoN_hg38_500kb_median_normAutosome_median.rds). Is this panel not compatible with male samples? I also tried using no normalization at all, but that did not give adequate results, and neither did using a single negative control sample."

Again, this is not the expected behavior of ichorCNA, especially if it is indicating male in the params.txt file. That panel should work for both female and male samples. Are you sure that the argument --normalizeMaleX is set to TRUE?
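
To summarise the colour conventions discussed in this thread, here is a small Python lookup sketch. It encodes only what is stated above (brown dots = 3 copies, dark green dots = 1 copy, bright green dots = homozygous deletion, line the same colour as the dots = clonal, light green line = subclonal). It assumes that the "bright green" and "light green" mentioned in the thread are the same shade. The names are hypothetical and none of this is taken from ichorCNA's plotting code; the wiki Output page remains the reference for the full legend.

```python
# Assumption: "light green" and "bright green" in the thread are the same shade.
LIGHT_GREEN = "light green"

# Dot colours explicitly mentioned in this thread.
DOT_COLOUR_MEANING = {
    "brown": "3 copies",
    "dark green": "1 copy",
    LIGHT_GREEN: "homozygous deletion",
}

def describe_segment(dot_colour, line_colour):
    """Return (copy-number state, clonality) for a plotted segment."""
    state = DOT_COLOUR_MEANING.get(dot_colour, "not covered here (see the wiki)")
    if dot_colour == LIGHT_GREEN and line_colour == LIGHT_GREEN:
        # The line matches the dots AND has the subclonal colour,
        # so the plot alone cannot distinguish the two cases.
        clonality = "ambiguous from colour alone"
    elif line_colour == dot_colour:
        clonality = "clonal"
    elif line_colour == LIGHT_GREEN:
        clonality = "subclonal"
    else:
        clonality = "not covered here (see the wiki)"
    return state, clonality

print(describe_segment("dark green", LIGHT_GREEN))   # 1 copy, subclonal
print(describe_segment("dark green", "dark green"))  # 1 copy, clonal
print(describe_segment(LIGHT_GREEN, LIGHT_GREEN))    # homozygous deletion, ambiguous
```

The third case is the one behind the original question: bright green dots with a bright green line cannot be read as clonal or subclonal from the plot colours alone.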

CuriusScientist commented 4 years ago

I posted a query in the ichorCNA Google Group that also covers the problem mentioned by @peneder:

https://groups.google.com/a/broadinstitute.org/d/topic/ichorcna/wYddw8Nwegs/discussion