Closed NailouZhang closed 2 years ago
Hi @NailouZhang ! Thanks for your interest in the pipeline.
It seems that the Python script filter_contigs_len.py, which is located in the bin folder of the cloned repository, cannot be found when you execute the pipeline (https://github.com/EBI-Metagenomics/emg-viral-pipeline/blob/master/bin/filter_contigs_len.py). The bin folder should be located here:
~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline
Can you please try the following: just install the pipeline code directly via Nextflow:
# pull pipeline code
nextflow pull EBI-Metagenomics/emg-viral-pipeline
# test run w/ latest release
nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 --help
# execute your data using the latest release version
nextflow run -resume \
EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 \
--fasta "/home/stone/20T/SraDownload/Genome/TBEV/NC_001672.1_sequence.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases ~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/DATABASES \
--cachedir ~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/SINGULARITY \
-profile local,singularity
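As a quick sanity check on the manually cloned repository, something like the following sketch could confirm whether filter_contigs_len.py is present in the bin folder, carries the executable bit, and resolves on the current PATH. The function name check_helper_script is made up for illustration and is not part of the pipeline, and the check says nothing about what is visible inside a Singularity or Docker container.

```python
import os
import shutil
from pathlib import Path


def check_helper_script(repo_dir, script_name="filter_contigs_len.py"):
    """Report where a pipeline helper script can (or cannot) be found.

    check_helper_script is a hypothetical helper, not part of the pipeline.
    """
    script = Path(repo_dir).expanduser() / "bin" / script_name
    return {
        "path": str(script),
        "exists": script.is_file(),
        # Without the executable bit the shell cannot run the script by name.
        "executable": script.is_file() and os.access(script, os.X_OK),
        # shutil.which searches the current PATH, as the shell would.
        "on_path": shutil.which(script_name) is not None,
    }


if __name__ == "__main__":
    # Path taken from the thread; adjust it to your own clone.
    repo = "~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline"
    for key, value in check_helper_script(repo).items():
        print(f"{key}: {value}")
```

If "exists" is False, the clone is incomplete or in a different location; if "exists" is True but the process still reports "command not found", the bin folder is probably not visible to the process that runs the command.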
Thanks! But it failed as follows:
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run -resume \
/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/virify.nf \
--fasta "/home/stone/20T/SraDownload/Genome/TBEV/NC_001672.1_sequence.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/DATABASES \
--cachedir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/SINGULARITY \
-profile local,docker
Error executing process > 'preprocess:rename (1)'
Caused by:
Process preprocess:rename (1) terminated with an error exit status (125)
Command executed:
if [[ NC_001672.1_sequence.fasta =~ .gz$ ]]; then
  zcat NC_001672.1_sequence.fasta > tmp.fasta
else
  cp NC_001672.1_sequence.fasta tmp.fasta
fi
rename_fasta.py -i tmp.fasta -m NC_001672_map.tsv -o NC_001672_renamed.fasta rename
Command exit status: 125
Command output: (empty)
Command error:
Unable to find image 'microbiomeinformatics/emg-viral-pipeline-python3:v1' locally
docker: Error response from daemon: Head https://registry-1.docker.io/v2/microbiomeinformatics/emg-viral-pipeline-python3/manifests/v1: dial tcp: lookup registry-1.docker.io on 127.0.1.1:53: read udp 127.0.0.1:46714->127.0.1.1:53: i/o timeout.
See 'docker run --help'.
Work dir: /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work/b1/e3cb5d7670f844cb08409030003257
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
But it works well with https://github.com/hoelzer/virify. I installed and ran it as follows:
git clone --recursive https://github.com/hoelzer/virify.git
cd virify
docker build -t mhoelzer/prodigal_viral:0.1 -f docker/prodigal/Dockerfile .
docker build -t mhoelzer/hmmscan:0.1 -f docker/hmmscan/Dockerfile .
cp bin/ docker/annotation/
cd docker/annotation/
docker build -t mhoelzer/annotation_viral_contigs:0.1 -f Dockerfile .
cd ..
cp bin/ docker/assign/
cd docker/assign/
docker build -t mhoelzer/assign_taxonomy:0.1 -f Dockerfile .
cd ..
cp -R ../emg-viral-pipeline/docker/krona docker/
cd docker/krona
docker build -t nanozoo/krona:2.7.1--658845d -f Dockerfile .
cd ..
cp -R ../emg-viral-pipeline/docker/bioruby docker/
cd docker/bioruby
docker build -t nanozoo/bioruby:2.0.1--1f8a188 -f Dockerfile .
cd ..
cd ~/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27
~/Softwares/Miniconda3/nextflow-20.04.1/nextflow run -resume \
~/20T/DataBase/SoftwaresEnsembel/MAG/virify \
--fasta "/home/stone/20T/SraDownload/Genome/TBEV/NC_001672.1_sequence.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/DATABASES \
--cachedir ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/SINGULARITY \
-profile standard
[skipped ] process > download_pprmeta:pprmetaGet [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_model_meta:metaGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_virsorter_db:virsorterGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_viphog_db:viphogGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_rvdb_db:rvdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_pvogs_db:pvogsGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vogdb_db:vogdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vpf_db:vpfGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_ncbi_db:ncbiGetDB [100%] 1 of 1, stored: 1 ✔
[75/59a1d0] process > download_imgvr_db:imgvrGetDB [100%] 1 of 1, failed: 1 ✘
[eb/369861] process > detect:rename (1) [100%] 1 of 1, cached: 1 ✔
[68/35b2d4] process > detect:length_filtering (1) [100%] 1 of 1, cached: 1 ✔
[e3/917400] process > detect:virsorter (1) [100%] 1 of 1, cached: 1 ✔
[e7/acbd25] process > detect:virfinder (1) [100%] 1 of 1, cached: 1 ✔
[81/471160] process > detect:pprmeta (1) [100%] 1 of 1, cached: 1 ✔
[8f/a392d7] process > detect:parse (1) [100%] 1 of 1, cached: 1 ✔
[47/da27cb] process > detect:restore (1) [100%] 1 of 1, cached: 1 ✔
[9c/37108a] process > annotate:prodigal (1) [100%] 1 of 1, cached: 1 ✔
[48/5bb7ed] process > annotate:hmmscan_viphogs (1) [100%] 1 of 1, cached: 1 ✔
[d5/0c94fe] process > annotate:hmm_postprocessing (1) [100%] 1 of 1, cached: 1 ✔
[40/dca0b6] process > annotate:ratio_evalue (1) [100%] 1 of 1, cached: 1 ✔
[89/db331a] process > annotate:annotation (1) [100%] 1 of 1, cached: 1 ✔
[08/54d4d4] process > annotate:plot_contig_map (1) [100%] 1 of 1 ✔
[de/cbe75b] process > annotate:assign (1) [100%] 1 of 1, cached: 1 ✔
[d0/6e23ba] process > plot:generate_krona_table (2) [100%] 2 of 2, cached: 2 ✔
[1e/876ef6] process > plot:krona (2) [100%] 2 of 2, cached: 2 ✔
[39/617798] process > plot:generate_sankey_table (1) [100%] 2 of 2 ✔
[6d/bc9086] process > plot:sankey (1) [100%] 2 of 2 ✔
Error executing process > 'download_imgvr_db:imgvrGetDB'
Caused by:
Process download_imgvr_db:imgvrGetDB terminated with an error exit status (4)
Command executed:
wget -nH ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/viral-pipeline/IMG_VR_2018-07-01_4.tar.gz && tar zxvf IMG_VR_2018-07-01_4.tar.gz
Command exit status: 4
Command output:
Hi @NailouZhang !
Okay, so for your first command that failed, it seems that you again used the manually cloned repository instead of the installation via Nextflow. Can you please try once more:
# pull pipeline code
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow pull EBI-Metagenomics/emg-viral-pipeline
# test run w/ latest release
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 --help
# execute your data using the latest release version
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run -resume \
EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 \
--fasta "/home/stone/20T/SraDownload/Genome/TBEV/NC_001672.1_sequence.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/DATABASES \
--cachedir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/SINGULARITY \
-profile local,docker
Does this work?
If not, the error you got
Command error:
Unable to find image 'microbiomeinformatics/emg-viral-pipeline-python3:v1' locally
docker: Error response from daemon: Head https://registry-1.docker.io/v2/microbiomeinformatics/emg-viral-pipeline-python3/manifests/v1: dial tcp: lookup registry-1.docker.io on 127.0.1.1:53: read udp 127.0.0.1:46714->127.0.1.1:53: i/o timeout.
See 'docker run --help'.
sounds like an issue with Docker: the daemon timed out while looking up registry-1.docker.io, so it could not pull the image. Does this work:
docker pull microbiomeinformatics/emg-viral-pipeline-python3:v1
Second, you then used a discontinued code repository from the early days of the Nextflow version of the pipeline (https://github.com/hoelzer/virify). I don't recommend using this code because it is not maintained anymore. However, it seems that you built all the necessary Docker images manually and then executed the pipeline, which works (and should also work for the EBI-Metagenomics/emg-viral-pipeline code). But this is actually not necessary, because Nextflow should take care of pulling the Docker images automatically when you use, for example, -profile docker,local.
So the best option would be to get the code from this repository running, using nextflow pull and a release version provided via -r, in combination with -profile docker,local, which should then download the necessary dependencies automatically if Docker is configured correctly.
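Since the Docker error above is a DNS lookup timeout for registry-1.docker.io, it may help to check name resolution on the host before retrying. This is only a rough sketch: can_resolve is a made-up helper, and it tests the resolver of the Python process on the host, which is not necessarily the resolver the Docker daemon uses.

```python
import socket


def can_resolve(hostname, port=443, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    can_resolve is a hypothetical helper; the resolver argument exists
    only so the function can be exercised without network access.
    """
    try:
        return len(resolver(hostname, port)) > 0
    except OSError:
        # socket.gaierror (raised on DNS failure) is a subclass of OSError.
        return False
```

Calling can_resolve("registry-1.docker.io") on the affected machine returns False when the name cannot be resolved there, which would point at the DNS configuration rather than at the pipeline.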
Hi @hoelzer ,
Thanks for your suggestion. Now it runs well with emg-viral-pipeline-0.4.0/tests/parse_viral_fixtures/base_fixtures/input.fasta, using:
/home/stone/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run -resume \
~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/virify.nf \
--fasta "tests/parse_viral_fixtures/base_fixtures/input.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/DATABASES \
--cachedir ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/SINGULARITY \
--virome \
--hmmextend \
--blastextend \
--length 1 \
-profile local,docker
I found that the microbiomeinformatics/r_chromomap:v0.1 image obtained via docker pull does not run properly; it fails with the error "no chromomap packages". So I made some modifications to docker/r_chromomap/Dockerfile:
FROM rocker/r-ver:3.5.0
LABEL base_image="rocker/verse:3.5.0"
LABEL version="1"
LABEL about.summary="r visualization packages"
LABEL about.license="SPDX:Apache-2.0"
LABEL about.tags="r, visualization"
LABEL about.home="https://cran.r-project.org/web/packages/chromoMap/, https://cran.r-project.org/web/packages/ggplot2/, https://cran.r-project.org/web/packages/plotly/"
LABEL software="r packages chromoMap, ggplot2, plotly"
LABEL software.version="3.15"
LABEL maintainer="MGnify team https://www.ebi.ac.uk/support/metagenomics"
RUN apt update && apt install libcurl4-openssl-dev libssl-dev -y
RUN Rscript -e "install.packages('httr', repos = 'http://cran.us.r-project.org')" && \
    Rscript -e "install.packages('curl', repos = 'http://cran.us.r-project.org')" && \
    Rscript -e "install.packages('chromoMap')" && \
    Rscript -e "install.packages('ggplot2', repos = 'http://cran.us.r-project.org')" && \
    Rscript -e "install.packages('plotly', repos = 'http://cran.us.r-project.org')" && \
    rm -rf /tmp/downloaded_packages/ /tmp/*.rds
docker build -t microbiomeinformatics/r_chromomap:v0.1 -f docker/r_chromomap/Dockerfile .
I ran it as follows:
cd ~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/
/home/stone/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run -resume \
~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline-0.4.0/virify.nf \
--fasta "/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/Flavivirus.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/DATABASES \
--cachedir ~/20T/DataBase/SoftwaresEnsembel/MAG/virify/SINGULARITY \
--virome \
--hmmextend \
--blastextend \
--length 1 \
-profile local,docker
[skipped ] process > download_pprmeta:pprmetaGet [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_virsorter_db:virsorterGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_virfinder_db:virfinderGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_model_meta:metaGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_viphog_db:viphogGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_rvdb_db:rvdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_pvogs_db:pvogsGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vogdb_db:vogdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vpf_db:vpfGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_ncbi_db:ncbiGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_imgvr_db:imgvrGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_checkv_db:checkvGetDB [100%] 1 of 1, stored: 1 ✔
[d3/0ade86] process > preprocess:rename (1) [100%] 1 of 1 ✔
[ac/b56dc6] process > preprocess:length_filtering (1) [100%] 1 of 1 ✔
[a8/c75806] process > detect:virsorter (1) [100%] 1 of 1 ✔
[92/9ea9f6] process > detect:virfinder (1) [100%] 1 of 1 ✔
[60/1ef0a9] process > detect:pprmeta (1) [100%] 1 of 1 ✔
[f0/82185a] process > detect:parse (1) [100%] 1 of 1 ✔
[f5/8abf7d] process > postprocess:restore (1) [100%] 1 of 1 ✔
[92/98ab09] process > annotate:prodigal (1) [100%] 1 of 1 ✔
[e2/713785] process > annotate:hmmscan_viphogs (1) [100%] 1 of 1 ✔
[6d/1e84b5] process > annotate:hmm_postprocessing (1) [100%] 1 of 1 ✔
[88/b20eca] process > annotate:ratio_evalue (1) [100%] 1 of 1 ✔
[6e/1a15cd] process > annotate:annotation (1) [100%] 1 of 1 ✔
[40/82ec13] process > annotate:plot_contig_map (1) [100%] 1 of 1 ✔
[81/e500aa] process > annotate:assign (1) [100%] 1 of 1 ✔
[48/d831c4] process > annotate:blast (1) [100%] 1 of 1 ✔
[4f/070cf9] process > annotate:blast_filter (1) [100%] 1 of 1 ✔
[74/24cf11] process > annotate:hmmscan_rvdb (1) [100%] 1 of 1 ✔
[d5/3c537e] process > annotate:hmmscan_pvogs (1) [100%] 1 of 1 ✔
[16/373266] process > annotate:hmmscan_vogdb (1) [100%] 1 of 1 ✔
[be/8eb74f] process > annotate:hmmscan_vpf (1) [100%] 1 of 1 ✔
[64/aad264] process > annotate:checkV (1) [100%] 1 of 1 ✔
[68/7f16bd] process > plot:generate_krona_table (1) [100%] 2 of 2 ✔
[70/3ccd1f] process > plot:krona (2) [100%] 2 of 2 ✔
[9f/f9cb59] process > plot:generate_sankey_table (2) [100%] 2 of 2 ✔
[37/f2526d] process > plot:sankey (2) [100%] 2 of 2 ✔
Completed at: 31-Dec-2021 19:31:33
Duration : 13m 37s
CPU hours : 1.6
Succeeded : 29
Link: https://pan.baidu.com/s/1xiJQ8c4nsRNtg3TVw2TSUg (extraction code: vabi)
Hi @NailouZhang
ok, glad it worked finally. But now you have different issues.
So basically you added
#I added
RUN apt update && apt install libcurl4-openssl-dev libssl-dev -y
to the Dockerfile, then rebuilt the image, and then it worked fine? If so, thanks for checking and reporting, so that we can update the Docker image for Chromomap accordingly, @mberacochea.
So basically you are certain that the input sequences are flaviviruses but you don't get any taxonomy assignments, right? This requires checking your input FASTAs in more detail. Are the contigs/sequences relatively short? If so, it's hard for VIRify to assign a taxonomy because the method relies on detectable ORFs. And if the sequence is short and only 1-2 informative ORFs can be found, it's difficult to assign a taxonomy with some certainty. In addition, we already saw that some RNA viruses are more difficult to assign in comparison to DNA viruses (shorter genomes, sometimes less well represented in our HMM model set, ...). Nevertheless, VIRify should find clear cases, which we would need to investigate more carefully.
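To check whether the input sequences are relatively short, a small script along these lines could list the length of each record in the FASTA file. It is a minimal sketch: sequence_lengths is a made-up helper, the parser assumes a plain (uncompressed) FASTA layout, and the example records are invented.

```python
def sequence_lengths(fasta_text):
    """Map each FASTA record ID to its sequence length in bases."""
    lengths = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            header = line[1:].split()
            current = header[0] if header else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Invented example records, sorted from shortest to longest.
example = ">contig_1 demo\nACGTACGT\nACG\n>contig_2\nAC\n"
for name, length in sorted(sequence_lengths(example).items(), key=lambda kv: kv[1]):
    print(name, length)
```

For a real file, the text can be read with open(path).read() and passed to sequence_lengths.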
Hi @hoelzer ,
Thank you for your reply. Happy New Year.
[skipped ] process > download_pprmeta:pprmetaGet [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_virsorter_db:virsorterGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_virfinder_db:virfinderGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_model_meta:metaGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_viphog_db:viphogGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_rvdb_db:rvdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_pvogs_db:pvogsGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vogdb_db:vogdbGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_vpf_db:vpfGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_ncbi_db:ncbiGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_imgvr_db:imgvrGetDB [100%] 1 of 1, stored: 1 ✔
[skipped ] process > download_checkv_db:checkvGetDB [100%] 1 of 1, stored: 1 ✔
[d3/0ade86] process > preprocess:rename (1) [100%] 1 of 1, cached: 1 ✔
[ac/b56dc6] process > preprocess:length_filtering (1) [100%] 1 of 1, cached: 1 ✔
[a8/c75806] process > detect:virsorter (1) [100%] 1 of 1, cached: 1 ✔
[92/9ea9f6] process > detect:virfinder (1) [100%] 1 of 1, cached: 1 ✔
[60/1ef0a9] process > detect:pprmeta (1) [100%] 1 of 1, cached: 1 ✔
[f0/82185a] process > detect:parse (1) [100%] 1 of 1, cached: 1 ✔
[f5/8abf7d] process > postprocess:restore (1) [100%] 1 of 1, cached: 1 ✔
[92/98ab09] process > annotate:prodigal (1) [100%] 1 of 1, cached: 1 ✔
[e2/713785] process > annotate:hmmscan_viphogs (1) [100%] 1 of 1, cached: 1 ✔
[6d/1e84b5] process > annotate:hmm_postprocessing (1) [100%] 1 of 1, cached: 1 ✔
[88/b20eca] process > annotate:ratio_evalue (1) [100%] 1 of 1, cached: 1 ✔
[6e/1a15cd] process > annotate:annotation (1) [100%] 1 of 1, cached: 1 ✔
[40/82ec13] process > annotate:plot_contig_map (1) [100%] 1 of 1, cached: 1 ✔
[81/e500aa] process > annotate:assign (1) [100%] 1 of 1, cached: 1 ✔
[48/d831c4] process > annotate:blast (1) [100%] 1 of 1, cached: 1 ✔
[4f/070cf9] process > annotate:blast_filter (1) [100%] 1 of 1, cached: 1 ✔
[74/24cf11] process > annotate:hmmscan_rvdb (1) [100%] 1 of 1, cached: 1 ✔
[d5/3c537e] process > annotate:hmmscan_pvogs (1) [100%] 1 of 1, cached: 1 ✔
[16/373266] process > annotate:hmmscan_vogdb (1) [100%] 1 of 1, cached: 1 ✔
[be/8eb74f] process > annotate:hmmscan_vpf (1) [100%] 1 of 1, cached: 1 ✔
[64/aad264] process > annotate:checkV (1) [100%] 1 of 1, cached: 1 ✔
[3c/b676ea] process > plot:generate_krona_table (2) [100%] 2 of 2, cached: 2 ✔
[70/3ccd1f] process > plot:krona (1) [100%] 2 of 2, cached: 2 ✔
[9f/f9cb59] process > plot:generate_sankey_table (1) [100%] 2 of 2, cached: 2 ✔
[37/f2526d] process > plot:sankey (2) [100%] 2 of 2, cached: 2 ✔
[49/af036d] process > plot:generate_chromomap_table (1) [100%] 2 of 2 ✔
[5c/20110c] process > plot:chromomap (1) [100%] 1 of 1, failed: 1
[skipping] Stored process > download_ncbi_db:ncbiGetDB
[skipping] Stored process > download_virfinder_db:virfinderGetDB
[skipping] Stored process > download_vogdb_db:vogdbGetDB
[skipping] Stored process > download_model_meta:metaGetDB
[skipping] Stored process > download_pvogs_db:pvogsGetDB
[skipping] Stored process > download_vpf_db:vpfGetDB
[skipping] Stored process > download_virsorter_db:virsorterGetDB
[skipping] Stored process > download_viphog_db:viphogGetDB
[skipping] Stored process > download_imgvr_db:imgvrGetDB
[skipping] Stored process > download_rvdb_db:rvdbGetDB
[skipping] Stored process > download_checkv_db:checkvGetDB
[skipping] Stored process > download_pprmeta:pprmetaGet
Error executing process > 'plot:chromomap (2)'
Caused by:
Process plot:chromomap (2) terminated with an error exit status (1)
Command executed:
library(chromoMap)
library(ggplot2)
library(plotly)
contigs <- list()
annos <- list()
contigs <- dir(pattern = ".contigs.txt")
annos <- dir(pattern = ".anno.txt")
for (k in 1:length(contigs)){
c = contigs[k]
a = annos[k]
# check if a file is empty
if (file.info(c)$size == 0 || file.info(a)$size == 0) {
next
}
# check how many categories we have
categories <- c("limegreen", "orange","grey")
df <- read.table(a, sep = "\t")
set <- unique(df$V5)
if ( length(set) == 2 ) {
if ( set[1] == 'High confidence' && set[2] == 'Low confidence') {
categories <- c("limegreen", "orange")
}
if ( set[1] == 'High confidence' && set[2] == 'No hit') {
categories <- c("limegreen", "grey")
}
if ( set[1] == 'Low confidence' && set[2] == 'No hit') {
categories <- c("orange", "grey")
}
}
if ( length(set) == 1 ) {
if ( set[1] == 'High confidence') {
categories <- c("limegreen")
}
if ( set[1] == 'Low confidence') {
categories <- c("orange")
}
if ( set[1] == 'No hit') {
categories <- c("grey")
}
}
p <- chromoMap(c, a,
data_based_color_map = T,
data_type = "categorical",
data_colors = list(categories),
legend = T, lg_y = 400, lg_x = 100, segment_annotation = T,
left_margin = 100, canvas_width = 1000, chr_length = 8, ch_gap = 6)
htmlwidgets::saveWidget(as_widget(p), paste("Flavivirus.chromomap-", k, ".html", sep=''))
}
Command exit status: 1
Command output: (empty)
Command error:
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
Error in chromoMap(c, a, data_based_color_map = T, data_type = "categorical", :
unused arguments (data_based_color_map = T, data_type = "categorical", data_colors = list(categories), legend = T, lg_y = 400, lg_x = 100, segment_annotation = T, left_margin = 100, canvas_width = 1000, chr_length = 8, ch_gap = 6)
Execution halted
Work dir: /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work/70/44b6c3ddeddc8aaa8f5dc533752571
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
I downloaded these sequences through the following link:
https://www.ncbi.nlm.nih.gov/nuccore?term=%28%22Flavivirus%22%5BOrganism%5D%20OR%20Flavivirus%5BAll%20Fields%5D%29%20AND%20%28viruses%5Bfilter%5D%20AND%20refseq%5Bfilter%5D%29&cmd=DetailsSearch
("Flavivirus"[Organism] OR ("Flavivirus"[Organism] OR Flavivirus[All Fields])) AND (viruses[filter] AND refseq[filter])
("Alphacoronavirus"[Organism] OR "Betacoronavirus"[Organism] OR "Gammacoronavirus"[Organism] OR coronavirus[All Fields]) AND (viruses[filter] AND biomol_genomic[PROP] AND refseq[filter])
Hi @NailouZhang , also a happy and healthy '22 to you!
Thanks for the detailed reporting.
1) So chromomap still fails? Or were you able to solve this by modifying the Docker container?
2) Flavivirus detection
There is only one ORF (encoding a polyprotein) in flaviviruses, so it may be hard to categorize.
Yes, this is unfortunately a current limitation of VIRify. It might be possible to tune some parameters to get these recognized, although this would also increase the false-positive detection rate. I think this might be doable via this parameter:
help="Minimum number of proteins with ViPhOG annotations at each taxonomic level, required for taxonomic assignment (default: 2)",
For your use-case, it might be reasonable to allow the user to easily adjust this parameter. I will open a separate issue about this topic because currently the pipeline does not provide a parameter to easily change this.
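To illustrate why such a threshold matters for single-ORF genomes, here is a deliberately simplified sketch. It is not VIRify's actual assignment code: assign_at_level and its inputs are invented for illustration, and the only detail taken from the help text above is the default of 2 proteins per taxonomic level.

```python
from collections import Counter


def assign_at_level(protein_taxa, min_proteins=2):
    """Return the best-supported taxon at one level, or None.

    Simplified illustration only; the real pipeline applies more rules.
    protein_taxa: one taxon label per annotated protein (None for no hit).
    """
    counts = Counter(t for t in protein_taxa if t is not None)
    if not counts:
        return None
    taxon, support = counts.most_common(1)[0]
    # Assign only if enough annotated proteins agree on this taxon.
    return taxon if support >= min_proteins else None


# A genome with a single annotated ORF, as described for flaviviruses.
single_orf = ["Flaviviridae"]
print(assign_at_level(single_orf))                  # default threshold of 2
print(assign_at_level(single_orf, min_proteins=1))  # relaxed threshold
```

With the default threshold the single-ORF example gets no assignment, while a threshold of 1 lets it through. The same relaxation would also let through contigs supported by a single spurious hit, which is the false-positive trade-off mentioned above.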
3) Coronavirus test
Great, nice to hear. We also had good experiences w/ contigs derived from Coronavirus reads.
Hi @hoelzer ,
I don't know what happened; chromomap is still not working ):
Hi @NailouZhang,
Thank you for the extensive error report. I'll try to look at this next week, I'm busy at the moment.
@hoelzer I'll update the docker container for Chromomap.
Cheers
It took long... but I finally added those lines to the container and pushed a new version of it.
TBH, I wasn't able to test it. I will
I ran the commands as follows:
cd ~/20T/DataBase/SoftwaresEnsembel/MAG
git clone --recursive https://github.com/EBI-Metagenomics/emg-viral-pipeline.git
export PATH=$PATH:~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/bin
cd /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run virify.nf --help
~/Softwares/Miniconda3/nextflow-21.03.0-edge/nextflow run -resume \
~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/virify.nf \
--fasta "/home/stone/20T/SraDownload/Genome/TBEV/NC_001672.1_sequence.fasta" \
--cores 4 \
--output /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27 \
--workdir /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work \
--databases ~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/DATABASES \
--cachedir ~/20T/DataBase/SoftwaresEnsembel/MAG/emg-viral-pipeline/SINGULARITY \
-profile local,singularity
I got:
Error executing process > 'preprocess:length_filtering (1)'
Caused by:
Process preprocess:length_filtering (1) terminated with an error exit status (127)
Command executed:
filter_contigs_len.py -f NC_001672_renamed.fasta -l 1.5 -o ./
CONTIGS=$(grep ">" NC_001672filt.fasta | wc -l)
Command exit status: 127
Command output: (empty)
Command error: .command.sh: line 2: filter_contigs_len.py: command not found
Work dir: /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/test_2021_12_27/work/f6/9db03ebd837a74d632361cd0f07d79
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh
How can I resolve this?