ParBLiSS / FastANI

Fast Whole-Genome Similarity (ANI) Estimation
Apache License 2.0

few basic questions on running fastANI #71

Closed limin321 closed 4 years ago

limin321 commented 4 years ago

Hi,

I am able to run fastANI with the default settings. However, I want to set the k-mer size to 51 with the -k flag, using the following command:

fastANI --ql fnames.txt --rl fnames.txt -k 51 --matrix -o ../ANIoptK51/optK51.txt

There is no error message, but the output file is empty, and the run took less than 5 minutes. With the default settings, it took more than 11 hours to finish. My input includes more than 100 bacterial draft genomes. Can anyone please explain what might have gone wrong?

Also, both --rl and --ql ask for a file containing all the filenames. When creating the tree, the names in the tree still carry the file extension, such as .fna. Is there any way to remove the extension when running fastANI? This is not really an issue, because we can clean the output data in Python; I am just curious. Thanks.

Your help is greatly appreciated. Best, Limin

cjain7 commented 4 years ago

Please see the help page (fastANI -h). I think it specifies that the maximum supported k value is 16. This is because FastANI uses 32-bit integers to store k-mers. Since each base needs 2 bits, values of k above 16 are not supported.

I wonder why you would want to run FastANI with k = 51.
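To make the 32-bit limit concrete, here is a minimal Python sketch of 2-bit k-mer packing. It is only an illustration of the arithmetic, not FastANI's actual code; the names BASE_BITS, pack_kmer, and fits_in_word are hypothetical.

```python
# Illustrative only: hypothetical helpers, not taken from the FastANI source.
# Each nucleotide is mapped to a 2-bit code, so a k-mer needs 2 * k bits.
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_kmer(kmer: str) -> int:
    """Pack a DNA k-mer into an integer, 2 bits per base."""
    value = 0
    for base in kmer.upper():
        # Shift the bases seen so far left by 2 bits and append the new base.
        value = (value << 2) | BASE_BITS[base]
    return value


def fits_in_word(k: int, word_bits: int = 32) -> bool:
    """Return True if a k-mer of length k fits in a word of word_bits bits."""
    return 2 * k <= word_bits


if __name__ == "__main__":
    print(fits_in_word(16))  # 16 bases * 2 bits = 32 bits, fits
    print(fits_in_word(51))  # 51 bases * 2 bits = 102 bits, does not fit
```

With k = 16 the packed value uses exactly 32 bits, which is why 16 is the largest k that a 32-bit integer can hold under this encoding.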

cjain7 commented 4 years ago

--ql is the set of query genomes, --rl is the set of target genomes. In your particular case, if you wish to do all-vs-all, then you'd need to specify the same file twice. FastANI doesn't do any file name manipulation internally, so I recommend you post-process your output.
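For the post-processing step, a small Python sketch along these lines could work. It assumes the default tab-delimited output, where the first two columns are the query and reference genome paths; the --matrix file has a different layout and would need separate handling. The function names and the suffix list are hypothetical and should be adjusted to your own files.

```python
import os

# Assumed suffixes to strip; extend this tuple to match your genome files.
KNOWN_SUFFIXES = (".gz", ".fna", ".fa", ".fasta")


def clean_name(path: str) -> str:
    """Drop the directory part and any trailing known suffixes from a path."""
    name = os.path.basename(path)
    changed = True
    while changed:
        changed = False
        for suffix in KNOWN_SUFFIXES:
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                changed = True
    return name


def clean_line(line: str) -> str:
    """Clean the query and reference columns of one tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        # Not a pairwise result line; return it unchanged.
        return line.rstrip("\n")
    fields[0] = clean_name(fields[0])
    fields[1] = clean_name(fields[1])
    return "\t".join(fields)


def clean_file(in_path: str, out_path: str) -> None:
    """Write a copy of the output file with cleaned genome names."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(clean_line(line) + "\n")
```

For example, clean_file("optK16.txt", "optK16.clean.txt") would rewrite a results file so that a name like genomes/strainA.fna appears as strainA (both filenames here are placeholders).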

limin321 commented 4 years ago

@cjain7 Thank you so much for all the explanations, that is very helpful.

Greatly appreciate that.

Best, Limin