raphael-group / hatchet

HATCHet (Holistic Allele-specific Tumor Copy-number Heterogeneity) is an algorithm that infers allele and clone-specific CNAs and WGDs jointly across multiple tumor samples from the same patient, and that leverages the relationships between clones in these samples.

Incomplete normal.1bed and tumor.1bed files in count-alleles #159

Open · mlewinsohn opened this issue 2 years ago

mlewinsohn commented 2 years ago

I am running HATCHet v1.1 (recently upgraded from v1.0) installed via bioconda, and count-alleles is producing incomplete output files. I am working within the hatchet run pipeline, but I have also tried calling the count-alleles step directly.

hatchet count-alleles -v -N ${normal} -T ${bams} -S ${samples} -r ${reference} \
                          -j ${processes} -L ${SNP} --mincov 20 --maxcov 500 \
                          -O ${BAF}normal.1bed -o ${BAF}tumor.1bed

My complete dataset has 48 samples with 1 matched normal, and all autosomal chromosomes are included.

I am noticing that when I set -j to 1, only variants from chromosome 22 are written to normal.1bed and tumor.1bed. The number of chromosomes written seems to match the number of processes: for example, if I set -j 8, then chromosomes 15-22 are included in the output files. If I set --processes=22 (the number of chromosomes), then all chromosomes are written to normal.1bed, but only one sample is written to tumor.1bed.
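
For reference, this is roughly how I tally how much of the output is present. It is a quick sketch, not part of HATCHet, and it assumes whitespace-delimited 1bed files whose first column is the chromosome and, for tumor.1bed, whose third column is the sample name; adjust the indices if the layout differs:

import sys

def summarize(path, sample_col=None):
    # Collect the distinct chromosomes (and optionally sample names) seen in a 1bed file.
    chroms, samples = set(), set()
    with open(path) as fh:
        for line in fh:
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.split()
            chroms.add(fields[0])
            if sample_col is not None:
                samples.add(fields[sample_col])
    return chroms, samples

if __name__ == "__main__":
    normal_chroms, _ = summarize(sys.argv[1])                # normal.1bed
    tumor_chroms, tumor_samples = summarize(sys.argv[2], 2)  # tumor.1bed
    print(f"normal.1bed: {len(normal_chroms)} chromosomes")
    print(f"tumor.1bed:  {len(tumor_chroms)} chromosomes, {len(tumor_samples)} samples")

With my full dataset I would expect all 22 autosomes in both files and every tumor sample in tumor.1bed, but I only see the counts described above.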

I think this was noted in a previous issue, but might be a separate problem.

Thank you!

vineetbansal commented 2 years ago

Hi @mlewinsohn - thanks for posting the issue, and sorry for the trouble. If this is happening in the count_alleles step, it is likely a bug introduced in v1.0.2 that did not get resolved in v1.1 as we had hoped. Fortunately, this behavior is fairly easy to reproduce on our end, and thus very fixable. Let me take a deep dive into HATCHet v1.1 and get back to you.
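
To give a sense of the kind of failure mode we are looking for (this is a purely hypothetical, self-contained simulation, not HATCHet's actual count_alleles code): if per-process results were assigned instead of accumulated somewhere in the worker pool, the output would track -j exactly as you describe, with one process keeping only chr22 and eight processes keeping only chr15-chr22:

from multiprocessing import Pool

CHROMOSOMES = [f"chr{i}" for i in range(1, 23)]  # the 22 autosomes

def count_chunk(chroms):
    # Stand-in for the per-process allele-counting work over a list of chromosomes.
    out = []
    for chrom in chroms:
        records = [f"{chrom}\t{pos}\t..." for pos in (1000, 2000)]
        out = records  # BUG: assignment instead of out.extend(records)
    return out

def run(n_processes):
    # One round-robin work list per process, mimicking a -j N worker pool.
    chunks = [CHROMOSOMES[i::n_processes] for i in range(n_processes)]
    with Pool(n_processes) as pool:
        per_process = pool.map(count_chunk, chunks)
    written = [rec for records in per_process for rec in records]
    return sorted({rec.split("\t")[0] for rec in written})

if __name__ == "__main__":
    print("-j 1:", run(1))  # only chr22 survives
    print("-j 8:", run(8))  # only chr15..chr22 survive

Again, that is only an illustration of the symptom, not a diagnosis; I'll confirm what is actually happening in the v1.1 code and report back.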

mlewinsohn commented 2 years ago

Hi @vineetbansal, thanks for the prompt reply and for looking into this!