tfwillems / HipSTR

Genotype and phase short tandem repeats using Illumina whole-genome sequencing data
GNU General Public License v2.0

Processing HipSTR with different BED file chunks and merging the output - Phasing process #91

Open think-o opened 1 year ago

Hello,

Since HipSTR does not support multi-threading, I reduced the compute time by splitting the regions in my BED file (about 3 million loci) into 100 chunks and running HipSTR on each chunk separately for one (.bam) file. The number of chunks was chosen purely from the number of CPU cores I have available. I then used "bcftools concat" to concatenate the per-chunk outputs.
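For reference, here is a minimal Python sketch of the chunk-and-concatenate workflow described above. It is an illustration and not my exact script: the helper names `split_bed_lines` and `concat_command` and the file names in the usage note are hypothetical. It assumes the BED file is already sorted by chromosome and position, so that contiguous chunks can be passed to "bcftools concat" in their original order.

```python
from typing import List


def split_bed_lines(lines: List[str], n_chunks: int) -> List[List[str]]:
    """Split BED records into contiguous, near-equal chunks (hypothetical helper)."""
    # Skip blank lines and BED header lines; keep the records in their original order.
    records = [
        ln for ln in lines
        if ln.strip() and not ln.startswith(("#", "track", "browser"))
    ]
    # Never create more chunks than there are records, and always at least one.
    n_chunks = max(1, min(n_chunks, len(records)))
    base, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional record.
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


def concat_command(chunk_vcfs: List[str], out_vcf: str) -> List[str]:
    """Build a bcftools concat call that writes a bgzip-compressed VCF."""
    # The per-chunk VCFs must be listed in the same order as the BED chunks.
    return ["bcftools", "concat", "-O", "z", "-o", out_vcf, *chunk_vcfs]
```

Each chunk would be written to its own BED file and given to a separate HipSTR process. The list returned by `concat_command(["chunk_000.vcf.gz", "chunk_001.vcf.gz"], "merged.vcf.gz")` can then be run with `subprocess.run(..., check=True)`.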

I would now like to know whether this chunking affects phasing. For example, could the reference haplotype be placed on one side of the phased genotype in the output of one chunk and on the other side in the output of another? A detailed explanation of how phasing works would be very helpful.

Since the manual suggests running HipSTR chromosome by chromosome to parallelize the work, I assume that chunking should not be a problem either. Please let me know if that is not the case.

Thanks, Mali