ksamuk / pixy

Software for painlessly estimating average nucleotide diversity within and between populations
https://pixy.readthedocs.io/
MIT License

0 count of count_diffs and count_comparisons and all in count_missing #104

Open hrluo93 opened 4 months ago

hrluo93 commented 4 months ago

Hi,

To generate an all-sites VCF, I ran: bcftools mpileup -f reference.fa -b bamlist.txt -r chr10 | bcftools call -m -Oz -f GQ -o chr10-bcf.vcf.gz

Then I ran pixy: pixy --stats pi dxy fst --populations poplist2.txt --window_size 50000 --vcf chr10-bcf.vcf.gz --n_cores 2 --output_prefix chr10gipixy > chr10pixy.log

Every result was NA: count_diffs and count_comparisons were both 0, and all sites were counted in count_missing.

I have also uploaded a small chr18 VCF, which shows the same result.

chr10pixy.log chr18-bcf.filter.vcf.gz

poplist2.txt

Am I missing a step?

Best!
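Before concluding that the tool is at fault, it can help to confirm that the VCF itself contains called genotypes. The following is a minimal sketch in plain Python, not part of pixy or bcftools; the function name `tally_genotypes` and the counter keys are hypothetical, and it assumes a standard tab-separated VCF with a `GT` entry in the FORMAT column and `.` as the ALT value at invariant sites.

```python
import gzip
from collections import Counter


def tally_genotypes(vcf_path):
    """Tally invariant/variant sites and called/missing genotypes in a VCF.

    Hypothetical helper for illustration; assumes tab-separated VCF records
    and that invariant sites carry "." in the ALT column.
    """
    counts = Counter()
    # bgzip output is readable with the standard gzip module.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 10:
                continue  # record has no sample columns
            counts["invariant_sites" if fields[4] == "." else "variant_sites"] += 1
            format_keys = fields[8].split(":")
            if "GT" not in format_keys:
                counts["sites_without_gt"] += 1
                continue
            gt_index = format_keys.index("GT")
            for sample in fields[9:]:
                genotype = sample.split(":")[gt_index]
                alleles = genotype.replace("|", "/").split("/")
                if all(allele == "." for allele in alleles):
                    counts["missing_genotypes"] += 1
                else:
                    counts["called_genotypes"] += 1
    return counts
```

Calling `tally_genotypes("chr18-bcf.filter.vcf.gz")` on the attached file would show whether genotypes are present in the input. If most genotypes are called but pixy still reports everything under count_missing, the input is unlikely to be the cause.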

smallfishcui commented 1 month ago

Same here. Did you figure out the reason?

thanks, Cui

JeffWeinell commented 1 month ago

I also have this problem!

ksamuk commented 1 month ago

This is a bug, and the current solution is to roll back to pixy 1.2.5 (all calculations are unchanged relative to the current version).

hrluo93 commented 1 month ago

> Same here. Did you figure out the reason?
>
> thanks, Cui

Hi, I ended up using ANGSD to calculate Dxy.

hrluo93 commented 1 month ago

> This is a bug, and the current solution is to roll back to pixy 1.2.5 (all calculations are unchanged relative to the current version).

Thanks for your reply! I am looking forward to the pipeline for dealing with hemizygous sex-chromosome regions.

hrluo93 commented 1 month ago

> I also have this problem!

It seems like a bug.

JeffWeinell commented 1 month ago

Downgrading to pixy 1.2.5 solved it for me, thanks!

smallfishcui commented 1 month ago

Thanks for the answers! Downgrading to pixy 1.2.5 worked for me as well!

Cui

smallfishcui commented 1 month ago

>> This is a bug, and the current solution is to roll back to pixy 1.2.5 (all calculations are unchanged relative to the current version).
>
> Thanks for your reply! I am looking forward to the pipeline for dealing with hemizygous sex-chromosome regions.

Would ploidy be a problem with ANGSD?

hrluo93 commented 1 month ago

>>> This is a bug, and the current solution is to roll back to pixy 1.2.5 (all calculations are unchanged relative to the current version).
>>
>> Thanks for your reply! I am looking forward to the pipeline for dealing with hemizygous sex-chromosome regions.
>
> Would ploidy be a problem with ANGSD?

Hi!

I guess that hemizygous regions would be a problem for both pixy and ANGSD. If your species is diploid, or is tetraploid but treated as a diploid with two subgenomes, it would not be a problem.

For ANGSD, I guess the problem is that all non-SNP sites are considered invariant sites. That is not a problem for my species because of its low proportion (~10%) of repeat sequences, but it would be for a target species with a high proportion of complex repeats. See also this opinion: https://x.com/jrossibarra/status/1753102333331042622

smallfishcui commented 1 month ago

>>>> This is a bug, and the current solution is to roll back to pixy 1.2.5 (all calculations are unchanged relative to the current version).
>>>
>>> Thanks for your reply! I am looking forward to the pipeline for dealing with hemizygous sex-chromosome regions.
>>
>> Would ploidy be a problem with ANGSD?
>
> Hi!
>
> I guess that hemizygous regions would be a problem for both pixy and ANGSD. If your species is diploid, or is tetraploid but treated as a diploid with two subgenomes, it would not be a problem.
>
> For ANGSD, I guess the problem is that all non-SNP sites are considered invariant sites. That is not a problem for my species because of its low proportion (~10%) of repeat sequences, but it would be for a target species with a high proportion of complex repeats. See also this opinion: https://x.com/jrossibarra/status/1753102333331042622

Thanks for sharing! It seems that pixy is much more realistic in how it treats monomorphic sites, then. As far as I know, ANGSD uses the site frequency spectrum to calculate pi and theta. I wonder how much its estimates differ from pixy's, especially for polyploid data. And is it okay to compare pi between tetraploids and diploids? If someone can help, it would be very much appreciated.

thanks, Cui
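For readers following the discussion of count_diffs, count_comparisons and ploidy, here is a minimal sketch of a ratio-of-sums estimate of pi computed from per-site allele counts. It is an illustration of the general approach, not pixy's actual code; the function names `site_counts` and `window_pi` are hypothetical, and the input is assumed to be the number of called copies of each allele at each site, with missing copies simply left out.

```python
from math import comb


def site_counts(allele_counts):
    """Return (count_diffs, count_comparisons) for one site.

    allele_counts lists the number of called copies of each allele,
    e.g. [6, 2] for 6 reference and 2 alternate copies.
    """
    n_called = sum(allele_counts)
    comparisons = comb(n_called, 2)  # all pairs of called copies
    identical = sum(comb(count, 2) for count in allele_counts)
    return comparisons - identical, comparisons


def window_pi(sites):
    """Sum differences and comparisons over sites, then take the ratio.

    Returns None when no comparisons are possible, which corresponds to
    an NA result when every genotype in the window is missing.
    """
    total_diffs = 0
    total_comparisons = 0
    for allele_counts in sites:
        diffs, comparisons = site_counts(allele_counts)
        total_diffs += diffs
        total_comparisons += comparisons
    if total_comparisons == 0:
        return None
    return total_diffs / total_comparisons
```

In this formulation the unit of comparison is a pair of chromosome copies, so four diploids and two tetraploids with the same allele counts (for example `[6, 2]`) give the same per-site value. Whether genotype calls are equally reliable across ploidy levels is a separate question that this sketch does not address.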