muellan / metacache

memory efficient, fast & precise taxonomic classification system for metagenomic read mapping
GNU General Public License v3.0

metacache-build-refseq and download-ncbi-genomes not downloading genomes #6

Closed: sturne29 closed this issue 6 years ago

sturne29 commented 6 years ago

I just tried to build the database for MetaCache, but neither of these commands actually downloads the genomes. I tried metacache-build-refseq first; when it didn't download any, I tried download-ncbi-genomes, thinking I might need to download them 'manually', as it were. Neither command gives an error per se; both download the assembly_summary.txt file and, when that finishes, print the following line and return to the prompt:

cat: ftpdirpaths: input file is output file

Any ideas what I'm doing wrong? I was really intrigued by the paper and wanted to see how the tool performed on my data.
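
For readers unfamiliar with the message quoted above: GNU cat prints "input file is output file" when one of the files it is asked to read is also the file its output is being written to. The thread does not say which line of the download scripts triggered it or what the fix was, so the following is only a minimal Python sketch of the same kind of check. The function names (is_same_file, concatenate) and any file names are hypothetical and are not taken from the MetaCache scripts.

```python
import os


def is_same_file(input_path, output_path):
    """Return True when both paths refer to the same existing file."""
    # os.path.samefile raises if a path does not exist, so check first.
    if not os.path.exists(output_path):
        return False
    return os.path.samefile(input_path, output_path)


def concatenate(input_paths, output_path):
    """Concatenate text files into output_path.

    Refuses to run when an input is also the output, mirroring the
    situation that cat reports as "input file is output file".
    """
    # Validate every input before the output is opened (and truncated).
    for path in input_paths:
        if is_same_file(path, output_path):
            raise ValueError(f"{path}: input file is output file")

    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as src:
                out.write(src.read())
```

Because the check runs before the output is opened for writing, a rejected call leaves the existing output file untouched. A common way to avoid the situation altogether is to write to a separate temporary file and rename it once writing succeeds.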

muellan commented 6 years ago

Maybe the NCBI changed something on their servers, or I accidentally broke the download script in one of the last commits. I'll check it out.

muellan commented 6 years ago

Fixed it. The scripts should work again.

muellan commented 6 years ago

By the way, note that the NCBI RefSeq has grown quite a bit since the paper was published. The default database is now around 28 GB on disk and around 37 GB in memory, which also impacts the query speed.

If you want to process the mapping data with some other tool, you should also have a look at the output formatting options (briefly explained in the README and in "docs/query.txt").

sturne29 commented 6 years ago

Thank you, I appreciate it!