Open hans-vg opened 5 months ago
I tried to reproduce this using swissprot but it works fine, will try full nr now.
I could not reproduce the error using the nr database either. Would it be possible to send me your query file?
I never could get nr to build with the provided command. I did follow the process outlined here (https://blobtoolkit.genomehubs.org/install/) to set up uniprot, which worked fine. I'm using nt and uniprot data now for loading hits.
It's possible my nr file didn't fully download. I'll try redownloading and reindexing with diamond.
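One quick way to rule out a truncated download before rebuilding is to test the archive itself. This is only a minimal sketch, assuming the nr FASTA was fetched as a gzip-compressed file; the helper name check_gzip and the file name nr.gz are hypothetical and not taken from this thread.

```shell
# Hypothetical helper (not part of diamond): report whether a gzip file is intact.
# `gzip -t` reads through the whole archive and exits non-zero if it is
# truncated or corrupt, without writing any decompressed output.
check_gzip() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK"
    else
        echo "CORRUPT"
    fi
}

# Example usage (assumed file name):
#   check_gzip nr.gz
```

If this reports CORRUPT, redownloading before running makedb again is probably worthwhile. An OK result only shows the archive is complete, not that the resulting database or its taxonomy mapping is valid.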
Version: diamond v2.1.8.162, installed from bioconda
Database: NR, downloaded from NCBI on 2023/07/28. The database was built using the makedb command and providing taxon files.
$diamond makedb --threads 32 --in $db --db /common_references/ncbi_nr/nr_20230728/ --taxonmap $taxid/prot.accession2taxid.FULL --taxonnodes $taxid/nodes.dmp --taxonnames $taxid/names.dmp
Diamond blastx command:
diamond blastx -p 64 -d /data/gpfs/assoc/inbre/projects/common_references/ncbi_nr/nr_20230728/nr -q ../trinity_out.Trinity.95.fasta -o nr_matches_test.diamond.tsv --outfmt 6 qseqid qlen sseqid slen evalue bitscore stitle qtitle sscinames sskingdoms skingdoms sphylums staxids
Resources: 64 threads / 220 GB of RAM
Tail Error Log:
For other (smaller) databases, diamond works fine. Is this an issue with the taxon information I tried to load into the nr database? I first tried the prebuilt NR database from NCBI using the prepdb command, but getting taxon information from it did not work, so I followed the instructions to make the database myself.