Hi,
I noticed an issue similar to a previously closed one (#80), which I'm experiencing with the most recent version (2.5.1) of snp-sites.
It appears that sequences longer than 2,147,483,647 bases produce the message "Warning: No SNPs were detected so there is nothing to output." 2,147,483,647 is the maximum value of a 32-bit signed integer.
I've spent a bit of time looking into this, and here's what I did to confirm it.
I took two sequences from an alignment, one of which was the outgroup, so as to maximise the number of SNPs.
Each sequence was 2,423,158,460 bases in length:
$ cat sample1.fasta Outgroup.fasta > test.fasta
$ snp-sites -V
snp-sites 2.5.1
$ snp-sites -c -o test_snps.fasta test.fasta
Warning: No SNPs were detected so there is nothing to output.
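As a sanity check on the input, the per-sequence lengths can be confirmed independently of snp-sites. This is a minimal sketch, assuming awk is available; `fasta_lengths` is a hypothetical helper name and not part of snp-sites. Some awk implementations may struggle with multi-gigabyte single-line records, so treat it as illustrative.

```shell
# Print "<name><TAB><length>" for every record in a FASTA file.
# Lengths are summed across wrapped lines, so multi-line records are handled.
# %.0f is used so that large lengths are never printed in exponent notation.
fasta_lengths() {
    awk '
        /^>/ {
            if (seen) printf "%s\t%.0f\n", name, len
            seen = 1
            name = substr($0, 2)   # header text without the leading ">"
            len = 0
            next
        }
        { len += length($0) }      # accumulate bases on sequence lines
        END { if (seen) printf "%s\t%.0f\n", name, len }
    ' "$1"
}
```

It would be called as `fasta_lengths test.fasta`, and each reported length can then be compared against 2147483647.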
I then cut the sequences down to 2,147,483,648 bases, one base longer than the 32-bit signed integer maximum value:
$ cut -c 1-2147483648 test.fasta > test1.fasta
$ snp-sites -c -o test1_snps.fasta test1.fasta
Warning: No SNPs were detected so there is nothing to output.
I then cut the sequences down to 2,147,483,647 bases, the 32-bit signed integer maximum value:
I then cut the sequences down to 2,147,483,646 bases, one base less than the 32-bit signed integer maximum value:
This time snp-sites ran successfully and identified 28,880,245 variant sites.
So it seems that a sequence length exactly at the 32-bit signed integer maximum causes a segmentation fault, while any length over that limit causes snp-sites to report that there are no SNPs.
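To illustrate why these particular lengths are suspicious, here is a small sketch of what the values would become if they were stored in a 32-bit signed integer. This is only an assumption about the cause, not something confirmed from the snp-sites source, and `wrap_int32` is a hypothetical helper. It needs a shell with 64-bit arithmetic, such as bash.

```shell
# Reinterpret a non-negative integer as a 32-bit two's-complement signed value.
wrap_int32() {
    # Keep the low 32 bits, then sign-extend bit 31 (xor with 2^31, subtract 2^31).
    echo $(( (($1 & 4294967295) ^ 2147483648) - 2147483648 ))
}

wrap_int32 2423158460   # original length        -> -1871808836
wrap_int32 2147483648   # one base over the max  -> -2147483648
wrap_int32 2147483647   # exactly the max        -> 2147483647
```

Under this assumption, a length over the limit turns negative, which could make a loop over alignment columns exit before it inspects any site. A length exactly at the limit still fits, but any value computed from it as length + 1 would wrap.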
Graham