peterdfields closed this issue 3 years ago
Hi @peterdfields, that does seem odd! That looks like an error from scikit-allel (which pixy uses under the hood), which is discussed here: http://alimanfoo.github.io/2017/06/14/read-vcf.html.
This section seems to address that error specifically:
If you get an error message like “RuntimeError: VCF file is missing mandatory header line (“#CHROM…”)” then check your tabix version and upgrade if necessary. If you have conda installed, a recent version of tabix can be installed via the following command: conda install -c bioconda htslib.
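Before upgrading anything, it can be worth ruling out that the header line is genuinely missing from the file. Here is a minimal sketch using only the Python standard library; the function name has_chrom_header is made up for illustration and is not part of pixy or scikit-allel. It assumes the file is either plain text or gzip/bgzip compressed with a .gz extension.

```python
import gzip


def has_chrom_header(vcf_path):
    """Return True if the VCF contains a '#CHROM' column-header line.

    Hypothetical helper for illustration only; not part of pixy or scikit-allel.
    """
    # bgzip output is gzip-compatible, so gzip.open can read it.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                return True
            if not line.startswith("#"):
                # First data record reached, so the header block is over.
                return False
    return False
```

If this returns True and the error still appears, the file itself is probably fine and the tabix version or index is the more likely culprit.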
Hope that helps, let me know if the error persists and we can try to troubleshoot it more.
Hi @ksamuk
Thank you for getting back to me. I updated htslib to 1.11 and restarted the pixy analysis. The analysis got further through the reference than before (though it has seemingly gotten further each time I've run the command), but the error arose again. I hadn't tabix-indexed the vcf.gz file, so I did that and restarted pixy, though I guess that probably isn't the issue, as the analysis was proceeding before. Anyway, let me know if you have any other suggestions I should try, or if additional info would be useful.
Hi @peterdfields, sorry to hear this still isn't working. Are you getting the same error message as before? If you'd be willing to send me your VCF (the whole thing, or in part), I can see if I can get it working on my end. You can post a link here, or reach me at ksamuk@gmail.com (sharing the VCF via dropbox or the like might be easiest).
Hi @ksamuk
After I indexed the .vcf.gz file with tabix, the command both completed and ran considerably faster. I may be missing instructions in the materials, and I realize it's probably obvious to a lot of users, but it might be worthwhile to state explicitly in the tutorial materials that the index is needed for working with the compressed VCF file. Thank you again for your help and for a great piece of software! I'll go ahead and close this issue now.
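For anyone who hits the same thing, here is a rough sketch of a pre-flight check for the index, assuming the VCF is bgzip-compressed and that tabix is on the PATH. The helper names find_tabix_index and tabix_index_command are hypothetical; only the tabix -p vcf invocation itself comes from htslib.

```python
from pathlib import Path


def find_tabix_index(vcf_gz_path):
    """Return the path of an existing .tbi or .csi index, or None.

    Hypothetical helper for illustration only; not part of pixy.
    """
    for suffix in (".tbi", ".csi"):
        candidate = Path(str(vcf_gz_path) + suffix)
        if candidate.exists():
            return candidate
    return None


def tabix_index_command(vcf_gz_path):
    """Build the argument list that asks tabix to index a bgzipped VCF."""
    return ["tabix", "-p", "vcf", str(vcf_gz_path)]
```

If find_tabix_index returns None, the list from tabix_index_command can be passed to subprocess.run(..., check=True) to create the index before starting pixy.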
Thanks for this response, @peterdfields! The next version of pixy (to be released very soon, a performance update) requires a compressed vcf and tabix, and we'll definitely make sure to emphasize that requirement in the docs.
Hi @ksamuk
I'm trying to use pixy to calculate pi on a relatively small dataset (8 diploid individuals). For an initial test run, I used the following command:
The VCF I used was generated using the instructions provided in the online tutorial for bcftools. The error I see is the following:
The curious issue is that if I re-run this command the error message can arise after processing a different number of contigs. So I'm not entirely sure what might be going wrong. Please let me know if any additional information would be helpful to troubleshoot this error.