tseemann / snp-dists

Pairwise SNP distance matrix from a FASTA sequence alignment
GNU General Public License v3.0

Add max diff option (-x) #54

Open boasvdp opened 6 months ago

boasvdp commented 6 months ago

Adds a `-x` option that stops counting SNP differences for a pair of sequences above a given threshold (default: 9_999_999). The only added work is comparing `diff` to `maxdiff` (https://github.com/boasvdp/snp-dists/blob/master/main.c#L28), which costs about 1% in the benchmark below.
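To illustrate the idea, here is a minimal sketch of an early-exit distance count in C. It is not the code from this PR: the function name `capped_distance`, its signature, and the choice to stop as soon as the count reaches `maxdiff` are assumptions made for illustration, and it skips any filtering of gap or ambiguous characters the real tool may apply.

```c
#include <stddef.h>

/* Hypothetical sketch, not the PR's implementation.
   Counts mismatching columns between two aligned sequences of length len.
   The loop stops as soon as diff reaches maxdiff, so the result is
   capped at maxdiff and the rest of the alignment is never scanned. */
size_t capped_distance(const char *a, const char *b, size_t len, size_t maxdiff)
{
    size_t diff = 0;
    for (size_t i = 0; i < len && diff < maxdiff; i++) {
        if (a[i] != b[i]) {
            diff++;
        }
    }
    return diff;
}
```

With this shape, a pair that exceeds the threshold is reported as exactly `maxdiff`, so downstream code should read that value as "at least this many differences". The saving grows as the threshold shrinks relative to the typical pairwise distance, because more pairs exit early.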

Comparison on an alignment of 500 simulated (non-masked) TB genomes that differ by ~2000 SNPs on average, measured with `/usr/bin/time -v` using 8 threads:

| Method | Max diff | Elapsed (wall clock) time (m:ss) |
| --- | --- | --- |
| original snp-dists | N/A | 2:15.59 |
| this PR | 9_999_999 | 2:16.95 |
| this PR | 1000 | 1:09.18 |
| this PR | 200 | 0:18.94 |

Inspired by the `-x` option in cgmlst-dists.