tfwillems / HipSTR

Genotype and phase short tandem repeats using Illumina whole-genome sequencing data
GNU General Public License v2.0

samples from different projects #75

Closed: zhangguy closed 4 years ago

zhangguy commented 4 years ago

Hi,

I have two cohorts and want to analyze them jointly. Each cohort comes from a different project, so the library and sequencing configurations differ. Should I run HipSTR separately on each cohort and merge the VCFs, or run both cohorts together? I'm planning to use the "de novo stutter estimation + STR calling with de novo allele generation" mode.

If the suggestion is to run HipSTR on the two cohorts together, is there a strategy for analyzing a large number of deeply sequenced WGS samples, say 1000 samples at 100x, without downloading all the BAM files locally?

Thanks!