bioinformatics-centre / BayesTyper

A method for variant graph genotyping based on exact alignment of k-mers

[feature request] Genotyping more than 500 samples #34

Open Sherry520 opened 3 years ago

Sherry520 commented 3 years ago

I need to genotype more than 500 samples. I hope support for this can be added to the software.

jonassibbesen commented 3 years ago

Hi, due to how the genotyping algorithm is designed, it will unfortunately not scale well to that many samples. What you could do instead is combine all predicted variants across all 500 samples and then run BayesTyper on each sample independently, or in batches, using this combined variant set. See here for more information: https://github.com/bioinformatics-centre/BayesTyper/wiki/Executing-BayesTyper-on-sample-batches
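As a rough illustration of the batching idea, the sketch below splits a list of sample names into fixed-size batches, each of which would then be genotyped against the same combined variant set. The function name `make_batches`, the sample names, and the batch size are all hypothetical; the wiki page linked above is the place to check for the batch size BayesTyper actually supports.

```python
from typing import List


def make_batches(samples: List[str], batch_size: int) -> List[List[str]]:
    """Split sample names into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        samples[start:start + batch_size]
        for start in range(0, len(samples), batch_size)
    ]


# Hypothetical sample names; in practice these come from your own sample sheet.
samples = [f"sample_{i:03d}" for i in range(1, 501)]

# The batch size is an assumption chosen for illustration only.
batches = make_batches(samples, batch_size=30)

for index, batch in enumerate(batches, start=1):
    # Every batch is genotyped against the same combined variant set.
    print(f"batch {index}: {len(batch)} samples")
```

Each batch then produces its own VCF file, and these are merged afterwards as described on the wiki page.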

Please let me know if you have any other questions.

Sherry520 commented 3 years ago

@jonassibbesen I followed the "Executing BayesTyper on sample batches" method to genotype my samples. When I combined the batch VCF files using bcftools merge, an error occurred: Failed to open bayestyper_rmdup_DH_00_unit_1/bayestyper-sk-b73.vcf.gz: not compressed with bgzip. It seems that BayesTyper uses gzip to compress the VCF files, but bcftools requires bgzip-compressed VCF files.

jonassibbesen commented 3 years ago

Thank you for mentioning this. I have updated the wiki with an additional bgzip compression step.
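For anyone who wants to check their batch output before merging, here is a minimal sketch that tells plain gzip apart from bgzip (BGZF) by inspecting the file header. It relies on the BGZF convention that each block is a gzip member with the FEXTRA flag set and an extra subfield tagged "BC". The function names `is_bgzf` and `file_is_bgzf` are made up for this example and are not part of BayesTyper or bcftools.

```python
import gzip
import struct


def is_bgzf(header: bytes) -> bool:
    """Return True if the bytes start with a BGZF (bgzip) block header."""
    # Need the gzip magic bytes, the deflate method, and the FEXTRA flag (0x04).
    if len(header) < 12 or header[:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
        return False
    # XLEN is a little-endian 16-bit length of the extra field at offset 10.
    xlen = struct.unpack_from("<H", header, 10)[0]
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for the "BC" tag with a 2-byte payload.
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False


def file_is_bgzf(path: str) -> bool:
    """Check the first block header of a file on disk."""
    with open(path, "rb") as handle:
        # 12 fixed header bytes plus the largest possible extra field.
        return is_bgzf(handle.read(12 + 0xFFFF))


# The 28-byte empty BGZF block that bgzip writes as an end-of-file marker.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)

print(is_bgzf(BGZF_EOF))                     # BGZF block
print(is_bgzf(gzip.compress(b"##fileformat=VCFv4.2\n")))  # plain gzip
```

A file that fails this check can typically be decompressed and recompressed with bgzip (for example by piping `gunzip -c` into `bgzip -c`) and indexed before it is passed to bcftools merge.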