soedinglab / metaeuk

MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
GNU General Public License v3.0

Fasta entry: 0 is invalid on RefSeq Fasta #7

Closed: openpaul closed this issue 4 years ago

openpaul commented 4 years ago

Expected Behavior

The pipeline runs to completion.

Current Behavior

The run terminates with the following error:

Converting sequences
Fasta entry: 0 is invalid. 
Error: targets createdb died

Steps to Reproduce (for bugs)

Using the metaeuk-linux-sse41 version:

wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/709/125/GCF_000709125.1_Exop_aqua_CBS_119918_V1/GCF_000709125.1_Exop_aqua_CBS_119918_V1_genomic.fna.gz
gunzip GCF_000709125.1_Exop_aqua_CBS_119918_V1_genomic.fna.gz

./metaeuk/bin/metaeuk easy-predict GCF_000709125.1_Exop_aqua_CBS_119918_V1_genomic.fna \
    2020_TAX_DB \
    results.fasta \
    tmp \
    --threads 1 \
    --slice-search

MetaEuk Output (for bugs)

https://gist.github.com/openpaul/40c649b99a20f40be25cd412eb7ff987#file-gistfile1-txt

Context

I just wanted to test metaeuk on a genome to see how well it performs. I assumed this is a valid use case, as all contigs are by definition (hopefully) eukaryotic.

Your Environment

I am using CentOS with kernel 3.10.0-693.5.2.el7.x86_64 and the precompiled binary built for the sse41 instruction set, together with the provided profile database.

openpaul commented 4 years ago

My mistake: I did not verify the downloaded database, and it was corrupted.
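
For anyone who lands here with the same message, a minimal sketch of the checks that might have caught this before running the pipeline. The helper names `verify_md5` and `fasta_looks_valid` are hypothetical and not part of metaeuk, and the file names and checksum in the usage comments are placeholders. It assumes the provider publishes an MD5 checksum and that `md5sum` is available, as it is on most Linux systems. Passing these checks does not guarantee that a file is usable, only that it is not obviously truncated or altered.

```shell
# Hypothetical helper: succeed only if the file's MD5 digest matches the expected value.
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Hypothetical helper: succeed only if the file is non-empty and starts with
# a FASTA header character. This is a cheap sanity check, not a full parser.
fasta_looks_valid() {
    [ -s "$1" ] && [ "$(head -c 1 "$1")" = ">" ]
}

# Example usage (placeholder names and checksum):
#   gzip -t download.fna.gz || echo "archive is damaged"
#   verify_md5 download.fna.gz "<md5 published by the provider>" || echo "checksum mismatch"
#   fasta_looks_valid download.fna || echo "not a FASTA file"
```

`gzip -t` tests the integrity of a compressed archive without unpacking it, so it is a quick first step for any `.gz` download.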