Closed mlegarreta00 closed 1 month ago
chr, start, end, gene: should be self-explanatory.
x1, x2: the single genome-wide coordinate superFreq uses internally, running from 1 to ~3B across all chromosomes.
M, width: log fold change and uncertainty of the read count with respect to the reference normals.
df: degrees of freedom of the t distribution used to model the log fold change and error above (from limma-voom).
var, cov, Nsnps: across all heterozygous germline variants in the gene, the number of minor (as in lowest VAF) allele counts (var), total read depth (cov) and number of variants (Nsnps).
pHet, pAlt, odsHet: p value for the null hypothesis of balanced alleles (pHet), for the allelic balance var/cov (pAlt), and the odds between the two hypotheses (odsHet).
There are some pretty involved stats going on for both the read depth and the BAFs. I believe those should be somewhat covered in the manual, and otherwise in the paper.
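If it helps to see the columns in use, here is a minimal Python sketch that reads a CNAbyGene-style table and derives two quantities from the columns described above: a rough M ± width range and the minor allele fraction var/cov. The rows are made up purely for illustration, `summarise_cna_by_gene` is a hypothetical helper name, and only a subset of the columns is included. Treating M ± width as an interval is an assumption on my part; the manual and paper describe how superFreq actually uses width and df.

```python
import csv
import io

# Made-up rows for illustration only; a real CNAbyGene file has one row per
# gene and more columns (chr, start, end, x1, x2, df, pHet, pAlt, odsHet).
EXAMPLE_TSV = (
    "gene\tM\twidth\tvar\tcov\tNsnps\n"
    "GENE_A\t-0.5\t0.1\t50\t200\t3\n"
    "GENE_B\t0.02\t0.15\t0\t0\t0\n"
)


def summarise_cna_by_gene(tsv_text):
    """Return one dict per gene with a rough M range and var/cov."""
    summaries = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for record in reader:
        m = float(record["M"])
        width = float(record["width"])
        var = int(record["var"])
        cov = int(record["cov"])
        # Genes without heterozygous germline variants have no read depth to
        # divide by, so the allele fraction is left undefined.
        minor_fraction = var / cov if cov > 0 else None
        summaries.append(
            {
                "gene": record["gene"],
                # Assumption: M +/- width as a rough uncertainty range.
                "M_low": m - width,
                "M_high": m + width,
                "minor_fraction": minor_fraction,
            }
        )
    return summaries


for row in summarise_cna_by_gene(EXAMPLE_TSV):
    print(row)
```

For a real file, you would pass the file contents instead of `EXAMPLE_TSV`. Columns are looked up by name, so their order in the file does not matter, but missing values (for example `NA`) would need extra handling that this sketch does not include.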
Good morning, I was wondering if someone knows what the different columns mean in the CNAbyGene_{sample}.tsv (the M column, the width column, etc.). Thank you in advance.