GDKO / AvP

Automatic evaluation of HGTs
GNU General Public License v3.0

Error: invalid taxonomic rank: #20

coles5 closed this issue 5 months ago

coles5 commented 5 months ago

I am currently having an issue generating a database with Diamond. I downloaded UniRef90 and the taxdump from NCBI's website and ran through the setup by copying the commands from the installation tutorial.

When I attempt to make the database using Diamond, I get an "invalid taxonomic rank" error. I have re-created the error here with a smaller database that contains only one protein sequence but produces the same error.

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
MAX_SHAPE_LEN=19 SEQ_MASK STRICT_BAND
Database input file: uniref90-small.fasta
Opening the database file... [0.001s]
Loading sequences... Sequences = 1, letters = 540, average length = 540 [0.002s]
Masking sequences... [0.002s]
Writing sequences... [0s]
Writing accessions... [0s]
Hashing sequences... [0s]
Loading sequences... [0s]
Writing trailer... [0s]
Accession parsing rules triggered for database seqids (use --no-parse-seqids to disable):
UniRef prefix 0
gi|xxx| prefix 0
xxx| prefix 1
|xxx suffix 0
.xxx suffix 0
:PDB= suffix 0
Loading taxonomy names... [0.743s]
Loaded taxonomy names for 2575973 taxon ids.
Loading taxonomy mapping file... [116.034s]
Joining accession mapping... [26.592s]
Writing taxon id list... [0.002s]

Accession parsing rules triggered for mapping file seqids (use --no-parse-seqids to disable):
UniRef prefix 0
gi|xxx| prefix 0
xxx| prefix 0
|xxx suffix 0
.xxx suffix 0
:PDB= suffix 0
Error: Invalid taxonomic rank:

I am unsure whether the problem is with the taxdump files, the un.taxids file, or something else. Any help in this matter would be most wonderful.
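
One way to narrow this down might be to list the rank values that actually appear in the taxdump's nodes.dmp, and compare them with the ranks the installed Diamond version accepts. The sketch below is only that: it assumes the error is raised while Diamond reads the rank column, and that nodes.dmp follows the standard NCBI layout (fields separated by a tab, a pipe, and a tab, with the rank in the third field). The function name `count_ranks` and the shortened sample rows are made up for illustration; real rows have more columns.

```python
import io
from collections import Counter


def count_ranks(lines):
    """Count how often each rank value occurs in nodes.dmp-style lines."""
    counts = Counter()
    for line in lines:
        # Split on the pipe and trim the surrounding tabs and newline.
        fields = [field.strip() for field in line.split("|")]
        if len(fields) > 2:
            # Third field is the rank; an empty rank is counted under "".
            counts[fields[2]] += 1
    return counts


# Hypothetical, truncated sample rows (tax_id, parent tax_id, rank only).
SAMPLE = (
    "1\t|\t1\t|\tno rank\t|\n"
    "2\t|\t131567\t|\tsuperkingdom\t|\n"
    "131567\t|\t1\t|\tno rank\t|\n"
)

ranks = count_ranks(io.StringIO(SAMPLE))
for rank, n in sorted(ranks.items()):
    print(repr(rank), n)
```

To run it against the real file, pass an open file handle instead of the sample, for example `count_ranks(open("nodes.dmp"))`. An empty or unfamiliar rank in the output would point at the taxdump rather than at the un.taxids file.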