ksamuk / pixy

Software for painlessly estimating average nucleotide diversity within and between populations
https://pixy.readthedocs.io/
MIT License
115 stars 14 forks

window_size 1 #56

Closed (fxfish closed this issue 2 years ago)

fxfish commented 2 years ago

Hi, thanks for the nice software, which can also analyze polyploid data. When I set --window_size 1, pixy outputs a π value for every genome position, so the output file is very large and the analysis is very slow. Is this expected, or have I set the parameter incorrectly?

All the best


ksamuk commented 2 years ago

Hi there,

Just a note that we don't officially support polyploid datasets, as we haven't had the chance to test all the complexities of dealing with polyploids. So, we don't currently recommend using pixy for polyploid organisms.

Re: the single-site output, the software is not optimized for single-site output (the summary statistics we focus on are meant to be calculated in windows), and the output file will always be large if every site is written out. Single-site mode is meant for 'zooming in' on specific regions, not for whole-genome output.

Cheers,

Kieran
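
To make the point about output size concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of pixy. It assumes the π output has one row per window per population, and the genome length and population count are hypothetical values chosen only for illustration.

```python
def estimate_output_rows(genome_length_bp: int, window_size_bp: int, n_populations: int) -> int:
    """Rough row count for a per-window pi table.

    Assumption: one output row per window per population.
    """
    if genome_length_bp < 0 or window_size_bp < 1 or n_populations < 0:
        raise ValueError("lengths must be non-negative and window size at least 1")
    # Integer ceiling division: a partial final window still produces a row.
    n_windows = -(-genome_length_bp // window_size_bp)
    return n_windows * n_populations


if __name__ == "__main__":
    genome_length = 500_000_000  # hypothetical 500 Mb genome
    n_populations = 4            # hypothetical number of populations
    for window_size in (1, 1_000, 10_000):
        rows = estimate_output_rows(genome_length, window_size, n_populations)
        print(f"window_size={window_size:>6}: about {rows:,} rows")
```

Under these assumptions, a window size of 1 yields roughly as many rows as there are sites times populations, which is why a whole-genome single-site run is both large and slow. For zooming in on a region, the pixy documentation describes options for restricting a run (for example `--chromosomes`, `--interval_start`, and `--interval_end`); check the documentation for the version you have installed.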