LCarioti closed this issue 6 years ago
How did you install virmet? It seems that prinseq is not found in your PATH.
conda create -n virmet1 python=3.5 virmet -c bioconda
Ubuntu 16.04 LTS 24 Intel(R) Xeon(R) RAM 50 GB
What does cat ./prinseq.err show?
$ cat prinseq.err
xargs: prinseq: No such file or directory
I tried to fix it by commenting out the import at line 37 of wolfpak and setting the executable name by hand:
# from virmet.common import prinseq_exe
prinseq_exe = 'prinseq-lite.pl'
# prinseq_exe = 'prinseq'
and then it ran.
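Instead of hard-coding one name, the executable could be looked up in PATH. The following is only a sketch, not VirMet's actual code: the function name `find_prinseq` is hypothetical, and the candidate list simply contains the two names that appear in this thread.

```python
import shutil

# The two executable names mentioned in this thread; which one is installed
# depends on the prinseq package (assumption: no other names are in use).
CANDIDATES = ("prinseq", "prinseq-lite.pl")


def find_prinseq(candidates=CANDIDATES, path=None):
    """Return the first candidate name found on PATH, or None.

    `path` is an optional search path (os.pathsep-separated string);
    when it is None, shutil.which falls back to the PATH variable.
    """
    for name in candidates:
        if shutil.which(name, path=path) is not None:
            return name
    return None
```

A caller could then set `prinseq_exe = find_prinseq()` and stop with a clear error message when the result is `None`, rather than failing later inside xargs.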
When it finishes:
$ wc -l unique.tsv
667498 unique.tsv
$ wc -l orgs_list.tsv
1 orgs_list.tsv
$ head -n 2 unique.tsv
qseqid sseqid sscinames stitle pident qcovs score length mismatch gapopen qstart qend sstart send staxids
M00611 gb|KY555145.1| N/A Caulobacter phage Ccr29, complete genome 91.667 39 44 60 3 2 38 96 150536 150478 1959737
I worked around it with a simple lookup between unique.tsv and viral_seqs_info.tsv.
Is that correct?
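A lookup of this kind might look like the sketch below. It is not the poster's actual query: the column names `accn` and `organism` for viral_seqs_info.tsv are assumptions (check the real header of that file), and the helper names are hypothetical. The only things taken from the thread are the `sseqid` format (`gb|KY555145.1|`) and the `N/A` placeholder in `sscinames`.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file with a header line into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def accession(sseqid):
    """Extract 'KY555145.1' from a BLAST subject id such as 'gb|KY555145.1|'."""
    parts = sseqid.split("|")
    return parts[1] if len(parts) > 1 else sseqid


def fill_sscinames(hits, info, accn_col="accn", name_col="organism"):
    """Return copies of `hits` with 'N/A' sscinames replaced where possible.

    `accn_col` and `name_col` are ASSUMED column names of the info table.
    Hits whose accession is not in the info table keep 'N/A'.
    """
    names = {row[accn_col]: row[name_col] for row in info}
    filled = []
    for hit in hits:
        hit = dict(hit)  # copy, so the caller's rows are left untouched
        if hit["sscinames"] == "N/A":
            hit["sscinames"] = names.get(accession(hit["sseqid"]), "N/A")
        filled.append(hit)
    return filled
```

With the real files this would be called as `fill_sscinames(read_tsv("unique.tsv"), read_tsv("viral_seqs_info.tsv"))`, after adjusting the two column names to whatever the info file actually uses.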
If I run blastn
blastn -task megablast -query input_file.fa -db /data/virmet_databases/viral_nuccore/viral_db -out final.contigs.txt -outfmt "6 qseqid sseqid sscinames stitle pident qcovs score length mismatch gapopen qstart qend sstart send staxids"
Warning: [blastn] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz
What sample are you trying to analyze?
If you want to run blast by hand, you need to set the environment variable BLASTDB to the directory where blast can find the taxdb files (they should be in /data/virmet_databases/). This is the equivalent of the Python line os.environ['BLASTDB'] = DB_DIR.
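As a rough illustration of that advice, the sketch below runs the same blastn command from Python with BLASTDB set only for the child process. The paths and the blastn options are copied from this thread; the function names are hypothetical, and this is not how VirMet itself is implemented.

```python
import os
import subprocess

DB_DIR = "/data/virmet_databases"  # directory named in this thread
VIRAL_DB = DB_DIR + "/viral_nuccore/viral_db"
OUTFMT = ("6 qseqid sseqid sscinames stitle pident qcovs score length "
          "mismatch gapopen qstart qend sstart send staxids")


def blast_env(db_dir=DB_DIR, base_env=None):
    """Copy the environment and point BLASTDB at the taxdb directory."""
    env = dict(os.environ if base_env is None else base_env)
    env["BLASTDB"] = db_dir
    return env


def blast_command(query, out, db=VIRAL_DB):
    """Build the blastn call used in this thread as an argument list."""
    return ["blastn", "-task", "megablast", "-query", query,
            "-db", db, "-out", out, "-outfmt", OUTFMT]


def run_blast(query, out):
    """Run blastn with BLASTDB set; raises CalledProcessError on failure."""
    subprocess.run(blast_command(query, out), env=blast_env(), check=True)
```

In a shell, the same effect is obtained by exporting BLASTDB before calling blastn. Passing `env=` keeps the change local to the blastn process instead of modifying the environment of the calling script.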
I used megahit (Li, D Bioinformatics 2015) to build a de novo assembly from the file viral_reads.fastq, and then I assigned the taxonomy with blast using the viral DB of virmet (/data/virmet_databases/).
I don't know why, but I can't assign sscinames. It is the same issue as in unique.tsv:
$ head unique.tsv
qseqid sseqid sscinames stitle pident qcovs score length mismatch gapopen qstart qend sstart send staxids
M00611:16081:N:0:1 gb|KY555145.1| N/A Caulobacter phage Ccr29, complete genome 91.667 39 44 60 3 2 38 96 150536 150478 1959737
M00611:21291:N:0:1 gb|KY094066.1| N/A BeAn 58058 virus, complete genome 85.841 75 65 113 16 0 7 119 8478 8590 67082
M00611:21301:N:0:1 gb|EF380009.1| N/A Enterobacteria phage phiX174 isolate AP100, complete genome 99.020 68 99 102 1 0 1 102 1638 1537 10847
M00611:21401:N:0:1 gb|EF380025.1| N/A Enterobacteria phage phiX174 isolate 10D90, complete genome 92.481 89 103 133 10 0 1 133 401 533 10847
M00611:21401:N:0:1 gb|EF380009.1| N/A Enterobacteria phage phiX174 isolate AP100, complete genome 91.667 95 107 144 10 2 1 143 4437 4295 10847
M00611:21521:N:0:1 gb|AY037928.1| N/A Human endogenous retrovirus K113 complete genome 87.770 92 88 139 17 0 1 139 2343 2481 166122
M00611:21671:N:0:1 gb|J02482.1| N/A Coliphage phi-X174, complete genome 96.000 100 132 150 6 0 1 150 413 264 10847
M00611:21801:N:0:1 gb|J02482.1| N/A Coliphage phi-X174, complete genome 92.715 100 118 151 11 0 1 151 1874 1724 10847
M00611:21811:N:0:1 gb|J02482.1| N/A Coliphage phi-X174, complete genome 95.364 99 129 151 5 2 1 150 4584 4435 10847
Do you have taxdb.btd and taxdb.bti in /data/virmet_databases?
host:~ l -1 /data/virmet_databases
total 107M
-rw-r--r-- 1 ozagordi ngs 11M Apr 10 2017 taxdb.bti
-rw-r--r-- 1 ozagordi ngs 96M Apr 10 2017 taxdb.btd
drwxr-xr-x 4 ozagordi ngs 4.0K Apr 18 2017 human/
drwxr-xr-x 4 ozagordi ngs 4.0K Apr 18 2017 bacteria/
drwxr-xr-x 4 ozagordi ngs 4.0K Apr 18 2017 fungi/
drwxr-xr-x 4 ozagordi ngs 4.0K Apr 18 2017 bovine/
drwxr-xr-x 3 ozagordi ngs 4.0K Apr 18 2017 viral_nuccore/
No
/data/virmet_databases/viral_nuccore$ ls -1trhc
ncbi_search
viral_seqs_info.tsv
viral_accn_taxid.dmp
viral_database.fasta
blast.perf
blast.log
viral_db.nsq
viral_db.nsi
viral_db.nsd
viral_db.nog
viral_db.nin
viral_db.nhr
viral_db.nhi
viral_db.nhd
I built my DB with
virmet fetch --viral n
virmet index --viral n
I will investigate. In the meantime, can you try to download them from NCBI, set the environment variable and run blast again?
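One way to follow this suggestion is sketched below: check whether the two taxdb files are present and, if not, fetch the archive named in the blastn warning and unpack it. The URL and file names come from this thread; the function names are hypothetical, and the sketch assumes the archive still contains taxdb.btd and taxdb.bti at its top level.

```python
import os
import tarfile
import urllib.request

TAXDB_URL = "ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz"  # from the warning
TAXDB_FILES = ("taxdb.btd", "taxdb.bti")


def missing_taxdb_files(db_dir):
    """Return the taxdb files that are not present in db_dir."""
    return [name for name in TAXDB_FILES
            if not os.path.isfile(os.path.join(db_dir, name))]


def extract_taxdb(archive, db_dir):
    """Unpack only the two taxdb files from a local archive into db_dir."""
    with tarfile.open(archive) as tar:
        wanted = [member for member in tar.getmembers()
                  if os.path.basename(member.name) in TAXDB_FILES]
        tar.extractall(db_dir, members=wanted)


def fetch_taxdb(db_dir, url=TAXDB_URL):
    """Download the archive into db_dir and unpack it if files are missing."""
    if not missing_taxdb_files(db_dir):
        return
    archive = os.path.join(db_dir, "taxdb.tar.gz")
    urllib.request.urlretrieve(url, archive)  # network access required
    extract_taxdb(archive, db_dir)
```

After `fetch_taxdb("/data/virmet_databases")`, BLASTDB would still need to point at that directory before blastn can resolve sscinames.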
In reply to staltor's message of Dec 12, 2017, 18:29.
thanks
I will try it
thanks
everything runs well
Hi,