McGranahanLab / CONIPHER-wrapper


Question about the input file #1

Open Yunuuuu opened 1 year ago

Yunuuuu commented 1 year ago

Hi, I read your three research papers. These works will surely advance our understanding of tumor evolution.

We are also interested in tumor clonal progression and want to use the methods provided in these three works. The code for the first work is deposited here, and the code for the second is deposited at https://bitbucket.org/nmcgranahan/clonalneoantigenanalysispipeline/src/master/. I compared the input files of both repos.

In this repo, you use integer copy numbers (the COPY_NUMBER_A and COPY_NUMBER_B columns in https://github.com/McGranahanLab/CONIPHER-wrapper/blob/main/data/input_tsv.tsv) as input to the function create.subclonal.copy.number (https://github.com/McGranahanLab/CONIPHER-wrapper/blob/b58235d1cb42d5c7fd54122dc6b9f5e6c4110a75/src/TRACERxHelperFunctions.R#L1),

whereas the deposited code for "Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade" (the same function: https://bitbucket.org/nmcgranahan/clonalneoantigenanalysispipeline/src/296734efc3c4a05df0f8da0b8ea472213ea26216/clonalDissectionFunctions.R#lines-553) uses raw copy numbers computed by ASCAT (the example input file is in https://bitbucket.org/nmcgranahan/clonalneoantigenanalysispipeline/downloads/ExampleFiles.zip).

Some questions:

  1. Does the format of the copy number (integer or raw) matter for this clustering analysis?
  2. If integer copy numbers can be used, can we use the integer copy numbers computed by sequenza for this analysis?
  3. ASCAT returns two segment results. If we use the non-integer copy numbers (segments_raw: https://github.com/VanLoo-lab/ascat/blob/8e16a1ff78e3c3210ab79bd8c020904905474ca5/ASCAT/R/ascat.runAscat.R#L28), there are some failed arrays. Should we remove them? If so, is there a method to do this? I find that segments (integer copy numbers, without failed arrays) and segments_raw differ substantially, but the failedarrays result has length zero, so I don't know how to remove these arrays from the segments_raw result.

Yunuuuu commented 1 year ago

Using integer copy numbers here seems strange. In this situation, nMaj1 and nMaj2 will both always equal COPY_NUMBER_A, and fracMaj1 will always equal zero:

seg.out$nMaj1     <- floor(as.numeric(seg.out$COPY_NUMBER_A))
seg.out$nMaj2     <- ceiling(as.numeric(seg.out$COPY_NUMBER_A))
seg.out$fracMaj1  <- as.numeric(seg.out$nMaj2) - as.numeric(seg.out$COPY_NUMBER_A)

In this line, https://github.com/McGranahanLab/CONIPHER-wrapper/blob/b58235d1cb42d5c7fd54122dc6b9f5e6c4110a75/src/TRACERxHelperFunctions.R#L15, the comparison will always be TRUE for a positive min.subclonal.
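
To make the point concrete, here is a minimal sketch that applies the three lines above to a toy table. The values are made up for illustration and are not taken from the repository's example input; `is.int` is a helper name introduced only for this example.

```r
# Toy segments (hypothetical values): two integer and two raw copy numbers
seg.out <- data.frame(COPY_NUMBER_A = c(2, 3, 2.4, 1.7))

# The three lines quoted above, applied unchanged
seg.out$nMaj1    <- floor(as.numeric(seg.out$COPY_NUMBER_A))
seg.out$nMaj2    <- ceiling(as.numeric(seg.out$COPY_NUMBER_A))
seg.out$fracMaj1 <- as.numeric(seg.out$nMaj2) - as.numeric(seg.out$COPY_NUMBER_A)

# Flag the rows whose input copy number is already an integer
is.int <- seg.out$COPY_NUMBER_A == round(seg.out$COPY_NUMBER_A)

# For integer input, floor and ceiling coincide and fracMaj1 is 0;
# only the non-integer rows get nMaj1 != nMaj2 and a positive fracMaj1
print(seg.out)
```

With integer input, the floor/ceiling split therefore carries no information, which is why I am asking whether integer copy numbers are really the intended input.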

Yunuuuu commented 1 year ago

Another question concerns https://github.com/McGranahanLab/CONIPHER-wrapper/blob/b58235d1cb42d5c7fd54122dc6b9f5e6c4110a75/src/TRACERxHelperFunctions.R#L706. calculate_phylo_ccf is run for every region (that is, for every sample). Since f.function returns the minimum value across all mutations in the sample (see below), will all mutation positions in the same sample get the same expected VAF? Would pmin be more suitable? The expected VAF should be calculated for every mutation in a sample, but in the code above f.function is evaluated once per region (defined by SampleID) rather than row-wise. f.function is defined here:

  f.function <- function (c,purity,local.copy.number,normal.copy.number)
  {

    return(min(1,c((purity*c) / (normal.copy.number*(1-purity) + purity*local.copy.number))))

  }
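
A small sketch of what I mean, assuming f.function is called with a vector of local copy numbers (one per mutation in the region). The input values are made up, and `f.rowwise` is a hypothetical pmin-based variant I wrote for comparison; it is not part of the repository.

```r
# f.function as quoted above (whitespace tidied only)
f.function <- function(c, purity, local.copy.number, normal.copy.number) {
  return(min(1, c((purity * c) / (normal.copy.number * (1 - purity) + purity * local.copy.number))))
}

# Hypothetical row-wise variant: pmin caps each element at 1 separately
f.rowwise <- function(c, purity, local.copy.number, normal.copy.number) {
  pmin(1, (purity * c) / (normal.copy.number * (1 - purity) + purity * local.copy.number))
}

# Made-up inputs: three mutations in one region with different local copy numbers
local.cn <- c(2, 4, 1)

collapsed <- f.function(c = 1, purity = 0.5, local.copy.number = local.cn, normal.copy.number = 2)
rowwise   <- f.rowwise(c = 1, purity = 0.5, local.copy.number = local.cn, normal.copy.number = 2)

print(collapsed)  # a single value: the minimum over all three mutations
print(rowwise)    # one value per mutation
```

If the real call site passes vectors like this, min() collapses the result to one number that is then recycled for every mutation in the region, whereas pmin() keeps one expected VAF per mutation.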