zjshi / gt-pro

add second input to sckmer builddb (monospecific SNPs whitelists) #14

Closed boris-dimitrov closed 5 years ago

boris-dimitrov commented 5 years ago

the input 'DDDDDD.sckmer_allowed.tsv' is presumed to contain a subset of the SNPs known to occur in only one species

example DB build

    time /path/to/sckmerdb_build *.sckmer_allowed.tsv > ../db.bin

where

    ls | sort | head
    100002.sckmer_allowed.tsv
    100002.sckmer_profiles.tsv
    100003.sckmer_allowed.tsv
    100003.sckmer_profiles.tsv
    ...

note that the 'DDDDDD.sckmer_profiles.tsv' input is still required; it is located automatically from the matching 'DDDDDD.sckmer_allowed.tsv' given on the command line
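
Because the profiles file is found implicitly, a missing one may only show up once the build is running. A pre-flight check along these lines could catch it earlier. This is only a sketch: `profiles_for` and `missing_profiles` are hypothetical helpers, not part of gt-pro, and the lookup rule (same directory, same 'DDDDDD' prefix) is an assumption based on the listing above rather than on how sckmerdb_build actually resolves the path.

```shell
# Hypothetical helper (not part of gt-pro): map an allowed-list path to the
# profiles path assumed to sit next to it with the same species prefix.
profiles_for() {
    printf '%s\n' "${1%.sckmer_allowed.tsv}.sckmer_profiles.tsv"
}

# Print the expected profiles path for every *.sckmer_allowed.tsv in the
# given directory (default: current directory) that has no such file.
missing_profiles() {
    dir=${1:-.}
    for allowed in "$dir"/*.sckmer_allowed.tsv; do
        [ -e "$allowed" ] || continue          # glob matched nothing
        profiles=$(profiles_for "$allowed")
        [ -e "$profiles" ] || printf '%s\n' "$profiles"
    done
}
```

Running `missing_profiles .` in the input directory before the build prints nothing when every allowed list has its profiles file.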