dnbaker / dashing

Fast and accurate genomic distances using HyperLogLog
GNU General Public License v3.0

example files #36

Closed maojn closed 4 years ago

maojn commented 4 years ago

Hi,

Are there example files somewhere I can download to test dashing? Thanks. Jean

dnbaker commented 4 years ago

Hi Jean,

You might already have some example files in dashing/bonsai/test/, named GCF*.gz (such as GCF_000302455.1_ASM30245v1_genomic.fna.gz). (You will if you cloned recursively, but not if you installed through conda.)

If not, you can download them from this folder here, or, alternatively, acquire some genome assemblies from RefSeq. If you want to download them programmatically, our download_genomes.py script might be handy.

Let me know if you have any more questions.

maojn commented 4 years ago

Thank you for your prompt reply. Your README file gives this command:

dashing dist -k31 -p13 -Odistance_matrix.txt -osize_estimates.txt genome1.fna.gz genome2.fna genome3.fasta <...>

Could you provide all the required input files and the exact command to use, so that I can do a quick test run? I am installing dashing for users in our community and want to make sure it works before I let them know. I don't work in the field and don't have the required files handy. Thank you very much.

Regards,

dnbaker commented 4 years ago

Why don't you try running these lines from the Travis CI configuration?

https://github.com/dnbaker/dashing/blob/master/.travis.yml#L17-L18

You can do this by:

git clone --recursive https://github.com/dnbaker/dashing/ && cd dashing && make dashing
./dashing dist bonsai/test/GCF_00*z

It should emit output similar to:

Dashing version: v0.4.2-8-ga744
#Path   Size (est.)
bonsai/test/GCF_001723155.1_ASM172315v1_genomic.fna.gz  4829255
bonsai/test/GCF_000302455.1_ASM30245v1_genomic.fna.gz   2718859
bonsai/test/GCF_000953115.1_DSM1535_genomic.fna.gz  2433839
bonsai/test/GCF_000762265.1_ASM76226v1_genomic.fna.gz   2368528
##Names bonsai/test/GCF_001723155.1_ASM172315v1_genomic.fna.gz  bonsai/test/GCF_000302455.1_ASM30245v1_genomic.fna.gz   bonsai/test/GCF_000953115.1_DSM1535_genomic.fna.gz  bonsai/test/GCF_000762265.1_ASM76226v1_genomic.fna.gz
bonsai/test/GCF_001723155.1_ASM172315v1_genomic.fna.gz  -   0   0   0
bonsai/test/GCF_000302455.1_ASM30245v1_genomic.fna.gz   -   -   0   0
bonsai/test/GCF_000953115.1_DSM1535_genomic.fna.gz  -   -   -   0.550403
bonsai/test/GCF_000762265.1_ASM76226v1_genomic.fna.gz   -   -   -   -
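
For an install check, it may be enough to confirm that the size table lists one row per input genome. The following is a minimal sketch, assuming the output was redirected to a file and has the layout shown above (a "#Path" header, one row per genome, then a "##Names" line). The names count_size_rows and dashing_out.txt are hypothetical and not part of dashing.

```shell
#!/bin/sh
# Hypothetical helper (not part of dashing): count the rows of the
# "#Path / Size (est.)" table in a saved copy of the output.
# Assumes the layout shown above: rows start after the "#Path" header
# and end at the "##Names" line that opens the pairwise matrix.
count_size_rows() {
  awk '
    /^#Path/   { in_table = 1; next }    # header line: table starts on the next line
    /^##Names/ { in_table = 0 }          # start of the matrix: table is over
    in_table && NF >= 2 { n++ }          # a row has a path and a size estimate
    END { print n + 0 }                  # print 0 if no rows were seen
  ' "$1"
}
```

For example, after `./dashing dist bonsai/test/GCF_00*z > dashing_out.txt`, running `count_size_rows dashing_out.txt` should print 4 for the four test genomes if the format matches the sample above.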

Alternatively, if you don't want to use git, you can download the files directly and run dashing on them:

wget https://github.com/dnbaker/bonsai/raw/master/test/GCF_001723155.1_ASM172315v1_genomic.fna.gz
wget https://github.com/dnbaker/bonsai/raw/master/test/GCF_000302455.1_ASM30245v1_genomic.fna.gz
wget https://github.com/dnbaker/bonsai/raw/master/test/GCF_000762265.1_ASM76226v1_genomic.fna.gz
wget https://github.com/dnbaker/bonsai/raw/master/test/GCF_000953115.1_DSM1535_genomic.fna.gz

dashing dist GCF_001723155.1_ASM172315v1_genomic.fna.gz GCF_000302455.1_ASM30245v1_genomic.fna.gz GCF_000762265.1_ASM76226v1_genomic.fna.gz GCF_000953115.1_DSM1535_genomic.fna.gz

maojn commented 4 years ago

It worked! Thank you very much!

dnbaker commented 4 years ago

Great! Let me know if you have any further questions.