mskilab-org / JaBbA

MIP based joint inference of copy number and rearrangement state in cancer whole genome sequence data.
MIT License

Generate coverage txt file using GRIDSS-PURPLE-LINX #37

Closed by alhafidzhamdan 3 years ago

alhafidzhamdan commented 3 years ago

Hi there, I've had some technical issues using fragCounter, and hence DryClean. I want to use the purity/ploidy-adjusted, GC-corrected copy number calls from GRIDSS-PURPLE-LINX instead, as I think they should be similar and I have already run that pipeline. Looking at your coverage.txt file, what is your definition of "ratio", i.e. the last column? I presume it's just log2(absolute tumour copy number / absolute normal copy number)? I appreciate your help! A

alhafidzhamdan commented 3 years ago

@mskilab @xtYao anyone?

xtYao commented 3 years ago

Hi,

Sorry for the late reply. Yes, we have used their pipeline with JaBbA too. Their whole-genome coverage is generated by a tool called COBALT, and with that we have been able to run JaBbA. The "field" argument is the name of the metadata column that holds the coverage value for each bin, and that value is NOT in log space. For example, if you have coverage for both tumor and normal, it should just be their ratio, taken before any purity/ploidy transformation. Just provide the purity and ploidy through their respective arguments and the ratio will be transformed internally.
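To make the "not in log space" point concrete, here is a minimal sketch in Python. The function names are hypothetical and are not part of JaBbA or COBALT; the sketch only assumes that each bin has a tumor and a normal coverage value. It shows the plain ratio, and how a log2 ratio would be converted back to linear space if that is all you have.

```python
import math


def linear_ratio(tumor_cov, normal_cov):
    """Plain tumor/normal coverage ratio for one bin (hypothetical helper).

    Returns None when the normal coverage is not positive, since the
    ratio is undefined there.
    """
    if normal_cov <= 0:
        return None
    return tumor_cov / normal_cov


def from_log2(log2_ratio):
    """Convert a log2 ratio back to a linear ratio."""
    return 2.0 ** log2_ratio


# A bin with 30x tumor and 20x normal coverage gives a ratio of 1.5,
# and a log2 ratio of log2(1.5) converts back to the same value.
ratio = linear_ratio(30.0, 20.0)
roundtrip = from_log2(math.log2(ratio))
```

No purity or ploidy correction is applied here, in line with the comment above that the transformation happens internally.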

Hope this helps! (I will work through the fragCounter issue as soon as I can) Xiaotong

alhafidzhamdan commented 3 years ago

Hi @xtYao, thank you, and sorry to pester. I appreciate your time.

I had a look at COBALT docs at https://github.com/hartwigmedical/hmftools/blob/master/count-bam-lines/README.md

TUMOR.cobalt.ratio.tsv contains the counts and ratios of the reference and tumor:

Chromosome | Position | ReferenceReadCount | TumorReadCount | ReferenceGCRatio | TumorGCRatio | ReferenceGCDiploidRatio

ratio = TumorReadCount / ReferenceReadCount, or
ratio = TumorGCRatio / ReferenceGCRatio, or even
ratio = TumorGCRatio / ReferenceGCDiploidRatio (to account for megabase-scale GC biases)?

Which of these columns do you recommend to use to calculate the ratio?
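For concreteness, the three candidates can be computed side by side with a short Python sketch. The column names come from the header quoted above; the sample rows are made up for illustration, and the helper names and the rule of returning None for negative numerators or non-positive denominators are assumptions of this sketch, not documented behaviour of COBALT or JaBbA.

```python
import csv
import io

# Made-up rows for illustration only; column names follow the quoted header.
SAMPLE_TSV = (
    "Chromosome\tPosition\tReferenceReadCount\tTumorReadCount\t"
    "ReferenceGCRatio\tTumorGCRatio\tReferenceGCDiploidRatio\n"
    "1\t1001\t200\t300\t1.0\t1.5\t1.0\n"
    "1\t2001\t0\t50\t-1\t-1\t-1\n"
)


def safe_div(num, den):
    """Return num/den, or None when either value cannot give a valid ratio."""
    if num < 0 or den <= 0:
        return None
    return num / den


def candidate_ratios(handle):
    """Compute the three candidate ratios for every bin in a COBALT-style TSV."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        tumor_gc = float(rec["TumorGCRatio"])
        rows.append({
            "chrom": rec["Chromosome"],
            "pos": int(rec["Position"]),
            # Candidate 1: raw read counts
            "count_ratio": safe_div(float(rec["TumorReadCount"]),
                                    float(rec["ReferenceReadCount"])),
            # Candidate 2: GC-normalised ratios
            "gc_ratio": safe_div(tumor_gc, float(rec["ReferenceGCRatio"])),
            # Candidate 3: GC ratio against the diploid-normalised reference
            "gc_diploid_ratio": safe_div(tumor_gc,
                                         float(rec["ReferenceGCDiploidRatio"])),
        })
    return rows


rows = candidate_ratios(io.StringIO(SAMPLE_TSV))
```

To run this on a real file, replace the `io.StringIO(SAMPLE_TSV)` handle with an opened TUMOR.cobalt.ratio.tsv. All three values stay in linear space, so none of them is log-transformed before being used as the coverage column.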

I look forward to eventually being able to use fragCounter. Thank you.

A

xtYao commented 3 years ago

I think you can directly use their "ReferenceGCDiploidRatio" as your "field".