Columns of CNAbyGene.tsv file

chr, start, end, gene: should be self-explanatory.

x1, x2: the single genome-wide coordinate that superFreq uses internally, running from 1 to ~3B across all chromosomes.

M, width: log fold change of the read count with respect to the reference normals (M) and its uncertainty (width).

df: degrees of freedom of the t distribution used to model the log fold change and error above (from limma-voom).

var, cov, Nsnps: across all heterozygous germline variants in the gene, the read count of the minor (as in lowest VAF) allele (var), the total read depth (cov), and the number of variants (Nsnps).

pHet, pAlt, odsHet: p value for the null hypothesis of balanced alleles (pHet), p value for the hypothesis of an allelic balance of var/cov (pAlt), and the odds between the two hypotheses (odsHet).
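As a rough illustration of how these columns might be used downstream, here is a minimal Python sketch that reads a CNAbyGene-style table and derives a linear fold change and a minor allele fraction per gene. The sample rows, gene names and column order are made up for the example, and the sketch assumes that M is on a log2 scale (as limma-voom output usually is), so please check that against the manual before relying on it.

```python
import csv
import io

# Made-up rows for illustration only; a real table comes from superFreq.
# The column order here is an assumption; DictReader looks columns up by name.
SAMPLE_TSV = (
    "chr\tstart\tend\tgene\tx1\tx2\tM\twidth\tdf\tvar\tcov\tNsnps\tpHet\tpAlt\todsHet\n"
    "1\t1000\t5000\tGENE_A\t1000\t5000\t-1.0\t0.1\t20\t30\t100\t3\t0.001\t0.5\t0.01\n"
    "2\t2000\t8000\tGENE_B\t250000000\t250006000\t0.0\t0.2\t20\t0\t0\t0\t1\t1\t1\n"
)


def summarise(handle):
    """Return one dict per gene with a fold change and a minor allele fraction."""
    results = []
    for row in csv.DictReader(handle, delimiter="\t"):
        m = float(row["M"])
        var = float(row["var"])
        cov = float(row["cov"])
        results.append({
            "gene": row["gene"],
            # Assumption: M is a log2 fold change versus the reference normals.
            "fold_change": 2.0 ** m,
            # var/cov is undefined when the gene has no heterozygous SNP coverage.
            "minor_allele_fraction": var / cov if cov > 0 else None,
        })
    return results


summary = summarise(io.StringIO(SAMPLE_TSV))
for entry in summary:
    print(entry)
```

To run this on a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `summarise(open("CNAbyGene.tsv", newline=""))`, where the file name is whatever path your run produced.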

There are some pretty involved stats going on for both the read depth and the BAFs. I believe they are covered to some extent in the manual, and otherwise in the paper.

ChristofferFlensburg / superFreq
