jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

I can run a single sample, but can't run the merged file #152

Open 778055611 opened 1 year ago

778055611 commented 1 year ago

Dear VirSorter2 team, thank you again for your awesome work. When I test with a single sample using the following command, it works well:

    virsorter run -i ~/testfile/FC03.contigs.fa -w ~/result/Virsorter2/total-result/result --min-length 1500 -j 144 all

But when I merged all of my contigs.fa files and ran:

    virsorter run -i ~/result/megahit/total.contigs.fa -w ~/result/Virsorter2/total-result --min-length 1500 -j 144 all

I got the following error:

    "Error in rule circular_linear_split:
        jobid: 8
        output: iter-0/pp-seqname-length.tsv
        conda-env: /share/lu/VirSorter2-Workpath/db/conda_envs/2270d576
        shell:

    # prep_logdir
    mkdir -p log/iter-0/step1-pp log/iter-0/step2-extract-feature log/iter-0/step3-classify

    Cnt=$(grep -c '^>' /share/luoxiao2/result/megahit/total.contigs.fa)
    if [ ${Cnt} = 0 ]; then
        echo "No sequnences found in contig file; exiting"               | python /share/luoxiao2/.conda/envs/vs2/lib/python3.10/site-packages/virsorter/./scripts/echo.py --level error
        exit 1
    fi

    python /share/luoxiao2/.conda/envs/vs2/lib/python3.10/site-packages/virsorter/./scripts/circular-linear-split.py           /share/luoxiao2/result/megahit/total.contigs.fa           iter-0/pp-circular.fna.preext          iter-0/pp-linear.fna           iter-0/pp-seqname-length.tsv           "||rbs:common"           1500

    if [ ! -s iter-0/pp-circular.fna.preext ]; then
        echo "No circular seqs found in contig file"               | python /share/luoxiao2/.conda/envs/vs2/lib/python3.10/site-packages/virsorter/./scripts/echo.py
        rm iter-0/pp-circular.fna.preext
    else
        python /share/luoxiao2/.conda/envs/vs2/lib/python3.10/site-packages/virsorter/./scripts/circular-extend.py               iter-0/pp-circular.fna.preext iter-0/pp-circular.fna
    fi

    if [ ! -s iter-0/pp-linear.fna ]; then
        echo "No linear seqs found in contig file"               | python /share/luoxiao2/.conda/envs/vs2/lib/python3.10/site-packages/virsorter/./scripts/echo.py
        rm iter-0/pp-linear.fna
    fi

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message

*** An error occurred. Detailed errors may not be printed for certain rules.
Refer to the log file of the failed command for troubleshooting
Issues can be raised at: https://github.com/jiarong/VirSorter2/issues
Or send an email to virsorter2 near gmail.com if you do not use GitHub"

Could you please give me some advice? Thank you again.

778055611 commented 1 year ago

So I want to know whether I need to split my input file. Is it too large for VirSorter2 to run?

jiarong commented 1 year ago

With 144 CPU cores, you do not need to split the input. The error might be because your input file path (~/result/megahit/total.contigs.fa) does not exist or the file is empty.
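One way to check this before rerunning is sketched below. It is only an illustration, not part of VirSorter2: the function name `check_fasta` is made up here. It mirrors the `grep -c '^>'` count used in the failing rule. Note that `grep -c` exits with status 1 when it counts zero matches and with status 2 when the file cannot be read, so under Snakemake's bash strict mode that line can abort the rule before the "No sequnences found" message is ever printed.

```shell
# check_fasta is a hypothetical helper, not part of VirSorter2.
# It reports whether a FASTA file exists, is non-empty, and how many
# sequence headers (lines starting with '>') it contains.
check_fasta() {
    local fasta="$1"
    if [ ! -e "$fasta" ]; then
        echo "missing: $fasta"
        return 1
    fi
    if [ ! -s "$fasta" ]; then
        echo "empty: $fasta"
        return 1
    fi
    # grep -c exits with status 1 when it counts zero matches, so
    # '|| true' keeps this line from aborting a script run with 'set -e'.
    local count
    count=$(grep -c '^>' "$fasta" || true)
    echo "sequences: ${count:-0}"
    # Succeed only if at least one header was found.
    [ "${count:-0}" -gt 0 ]
}

# Usage (path taken from the command in this issue):
#   check_fasta ~/result/megahit/total.contigs.fa
```

If this reports `missing` or `empty`, the problem is in the merge step rather than in VirSorter2 itself.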

778055611 commented 1 year ago

Thanks for your reply. Maybe it is because my -j parameter is too large. I would also like to know how to set the --min-length parameter. Should I just set --min-length=1000? Could you please give me some advice? Thank you again.

jiarong commented 1 year ago

I recommend 5 kb as the minimum length. See the details here: https://www.protocols.io/view/viral-sequence-identification-sop-with-virsorter2-5qpvoyqebg4o/v3
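To see how many contigs would survive a given cutoff before choosing one, a small count like the following may help. This is a rough sketch, not a VirSorter2 feature: `count_contigs_min_len` is a hypothetical helper name, and it assumes a plain, uncompressed FASTA file with Unix line endings.

```shell
# count_contigs_min_len is a hypothetical helper, not part of VirSorter2.
# It prints how many sequences in a FASTA file are at least MIN_LEN bases
# long, summing the lengths of sequences wrapped over several lines.
count_contigs_min_len() {
    local fasta="$1"
    local min_len="$2"
    awk -v min="$min_len" '
        # Header line: close out the previous record, then start a new one.
        /^>/ { if (seen && len >= min) n++; len = 0; seen = 1; next }
        # Sequence line: add its length to the current record.
             { len += length($0) }
        # End of file: close out the last record and print the total.
        END  { if (seen && len >= min) n++; print n + 0 }
    ' "$fasta"
}

# Usage (path taken from the command in this issue):
#   count_contigs_min_len ~/result/megahit/total.contigs.fa 1500
#   count_contigs_min_len ~/result/megahit/total.contigs.fa 5000
```

Comparing the two counts shows how much of the assembly a 5 kb cutoff would drop compared with 1500 bp. The chosen value is then passed to `virsorter run` as `--min-length 5000`.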

778055611 commented 1 year ago

Thanks for your reply, which helps me a lot. Thank you again.