raphael-group / hatchet

HATCHet (Holistic Allele-specific Tumor Copy-number Heterogeneity) is an algorithm that infers allele and clone-specific CNAs and WGDs jointly across multiple tumor samples from the same patient, and that leverages the relationships between clones in these samples.

Specifying size for WES samples in count-reads #199

Open bjlerman opened 1 year ago

bjlerman commented 1 year ago

Hi, the instructions for the WES demo say that 250 kb should be selected as the bin size in count-reads, but I don't see a bin size among the accepted arguments. When I include --size 250kb anyway, it throws an error. The demo uses -b for the bin size, but -b is currently used to specify the .1bed file. Is the size instruction from an older version of HATCHet? If so, are there new recommendations for how to optimize for WES data?

Thanks so much!

mmyers1 commented 9 months ago

You are correct, the WES demo is currently out of date. You can follow the instructions in the complete demo instead: https://raphael-group.github.io/hatchet/examples/demo-complete/demo-complete.html

The variable-width binning in recent HATCHet versions should adjust automatically to the coverage density in WES, so there is no fixed bin size to specify. If the signal looks noisy, you may want to tune the minimum number of SNP-covering reads (msr) and the minimum number of total reads (mtr) per bin. These are parameters to the combine_counts step in HATCHet, and they implicitly control the bin size: higher values produce larger bins on average, and lower values produce smaller bins.
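
To illustrate why raising msr or mtr yields larger bins, here is a minimal toy sketch in Python. It is not HATCHet's actual implementation: the function name `variable_width_bins`, the greedy merging strategy, and the input numbers are all assumptions made for illustration. It simply merges consecutive fixed-width intervals until a bin holds at least `msr` SNP-covering reads and at least `mtr` total reads.

```python
from typing import List, Tuple


def variable_width_bins(
    snp_reads: List[int],
    total_reads: List[int],
    msr: int,
    mtr: int,
) -> List[Tuple[int, int]]:
    """Greedily merge consecutive intervals into bins (hypothetical helper).

    A bin is closed as soon as it contains at least `msr` SNP-covering reads
    and at least `mtr` total reads. Bins are returned as half-open index
    ranges (start, end) over the input intervals.
    """
    if len(snp_reads) != len(total_reads):
        raise ValueError("snp_reads and total_reads must have the same length")

    bins: List[Tuple[int, int]] = []
    start = 0
    acc_snp = 0
    acc_total = 0
    for i, (s, t) in enumerate(zip(snp_reads, total_reads)):
        acc_snp += s
        acc_total += t
        if acc_snp >= msr and acc_total >= mtr:
            bins.append((start, i + 1))
            start = i + 1
            acc_snp = 0
            acc_total = 0

    # Any leftover intervals that never reached the thresholds are
    # folded into the last bin (or form a single bin if none exist).
    if start < len(snp_reads):
        if bins:
            bins[-1] = (bins[-1][0], len(snp_reads))
        else:
            bins.append((start, len(snp_reads)))
    return bins


# Made-up counts: 12 equal intervals, 10 SNP-covering and 50 total reads each.
snp = [10] * 12
total = [50] * 12

low = variable_width_bins(snp, total, msr=20, mtr=100)
high = variable_width_bins(snp, total, msr=40, mtr=200)

print(len(low), low)    # 6 bins, each spanning 2 intervals
print(len(high), high)  # 3 bins, each spanning 4 intervals
```

With sparse, uneven WES coverage the same logic would make bins wide where reads are scarce and narrow where targets are densely covered, which is the sense in which the thresholds replace an explicit bin size.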