cmmr / EsViritu

Read mapping pipeline for detection and measurement of virus pathogens from metagenomic or clinical data
MIT License

error for virus_pathogen_database.mmi #12

Open tulikasriv26 opened 4 weeks ago

tulikasriv26 commented 4 weeks ago

I am continuously getting this error for paired-end files:

Processing sample: BBW_0513_FS_CKDL240028581-1A_22FWVLLT4_L4
Time Update: Starting main bash script for EsViritu General Mode @ 10-02-24---23:16:26
can't find minimap2 index for pipeline. should be: /home/tulika.bhardwaj/miniconda3/envs/EsViritu/lib/python3.12/site-packages/EsViritu/virus_pathogen_database.mmi
exiting
/home/............/miniconda3/envs/EsViritu/lib/python3.12/site-packages/EsViritu
2 paired read files
DB: /home/............./miniconda3/envs/EsViritu/lib/python3.12/site-packages/EsViritu
filter seqs: /........../lib/python3.12/site-packages/EsViritu
version 0.2.3
0
coverm found
minimap2 found
samtools found
bioawk found
seqkit found
bedtools found
fastp found
seqfu found

And this is the script:

raw="/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu"
out="/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu_result"

#####################################################################################################
conda env config vars set ESVIRITU_DB=/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu/DBs/v2.0.2
mkdir -p $out
while read p
do
echo "Processing sample: $p"
EsViritu -r $raw/${p}_1.fq $raw/${p}_2.fq -s ${p} -t 16 -o $out/$p -p paired
done < list

mtisza1 commented 3 weeks ago

Hi, thanks for opening the issue.

This may or may not have changed in recent conda updates, but any time I use the command: conda env config vars set, it prompts me to deactivate/reactivate the environment for this change to take effect. So, I think that's your problem.
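For readers following along, here is a minimal sketch of that reactivation step plus a quick check that the variable is visible in the current shell. The environment name `EsViritu` is taken from the paths in the log above, and the index file name comes from the error message; `check_esviritu_db` is a hypothetical helper, not part of EsViritu.

```shell
# Run once, interactively, rather than inside the per-sample script:
#   conda env config vars set ESVIRITU_DB=/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu/DBs/v2.0.2
#   conda deactivate
#   conda activate EsViritu

# Hypothetical helper: report whether a directory contains the minimap2
# index that the error message says the pipeline is looking for.
check_esviritu_db() {
  db_dir="$1"
  if [ -z "$db_dir" ]; then
    echo "ESVIRITU_DB is not set in this shell"
    return 1
  fi
  if [ ! -f "$db_dir/virus_pathogen_database.mmi" ]; then
    echo "no virus_pathogen_database.mmi in $db_dir"
    return 1
  fi
  echo "database index found in $db_dir"
}

# Check the value the current shell actually sees.
check_esviritu_db "${ESVIRITU_DB:-}" || true
```

If the check reports that the variable is not set right after `conda env config vars set`, the environment most likely has not been reactivated yet.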

In your case, you can simply use the --db flag in your EsViritu command, e.g.:

--db /work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu/DBs/v2.0.2
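Applied to the script from the original post, this might look roughly like the sketch below. It only prints the commands (a dry run) so they can be inspected first; `esviritu_cmd` is a hypothetical helper, and every flag shown is copied from this thread rather than from the EsViritu documentation.

```shell
# Paths copied from the script earlier in this thread.
raw="/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu"
out="/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu_result"
db="/work/ebg_lab/gm/wastewater_data/usftp21.novogene.com/01.RawData/EsViritu/DBs/v2.0.2"

# Hypothetical helper: print the EsViritu command for one sample name,
# passing the database directory explicitly with --db.
esviritu_cmd() {
  p="$1"
  printf 'EsViritu -r %s %s -s %s -t 16 -o %s -p paired --db %s\n' \
    "$raw/${p}_1.fq" "$raw/${p}_2.fq" "$p" "$out/$p" "$db"
}

# Dry run: print one command per sample listed in ./list (if it exists).
if [ -f list ]; then
  while read -r p; do
    esviritu_cmd "$p"
  done < list
fi
```

Passing `--db` on each call avoids depending on whether the conda environment variable has taken effect in the shell that runs the loop.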

Let me know if this helps.

Best,

Mike

tulikasriv26 commented 3 weeks ago

Thanks for your reply. I will try your suggestion and let you know.

tulikasriv26 commented 3 weeks ago

I am facing a new issue after following your suggestion:

[M::worker_pipeline::65100.971*15.49] mapped 333334 sequences

[2024-10-09T23:15:29Z ERROR coverm::bam_generator] The STDERR for the samtools sort part was: samtools sort: failed writing to "/tmp/coverm_fifo.mMCAlvTztJmh/coverm-make-samtools-sortlmOtNP.0020.bam": No such file or directory

[2024-10-09T23:15:29Z ERROR coverm::bam_generator] The STDERR for the samtools view for cache part was: [main_samview] fail to read the header from "-".

[2024-10-09T23:15:29Z ERROR coverm::bam_generator] The STDERR for the remove_minimap2_duplicated_headers part was: thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', library/std/src/io/stdio.rs:1008:9

note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

[2024-10-09T23:15:29Z ERROR coverm::bam_generator] Cannot continue since mapping failed.

mtisza1 commented 3 weeks ago

Hi,

This looks like a CoverM error. It seems to be due to the /tmp directory not existing. There is a way to work around this by setting an alternative tmp path:

https://wwood.github.io/CoverM/coverm-genome.html#frequently-asked-questions-faq
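A minimal sketch of that workaround, assuming (as I understand the linked FAQ) that CoverM follows the standard `TMPDIR` environment variable. `use_tmpdir` is a hypothetical helper and the directory in the usage comment is only a placeholder.

```shell
# Hypothetical helper: create a temporary directory if needed, confirm it
# is writable, and export it as TMPDIR for programs started afterwards.
use_tmpdir() {
  tmp_path="$1"
  mkdir -p "$tmp_path" || return 1
  if [ ! -w "$tmp_path" ]; then
    echo "cannot write to $tmp_path"
    return 1
  fi
  TMPDIR="$tmp_path"
  export TMPDIR
}

# Usage (placeholder path), before the EsViritu loop:
#   use_tmpdir "$HOME/esviritu_tmp"
```

Pointing `TMPDIR` at a location with plenty of free space may also help if the default temporary directory fills up partway through a run.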

Alternatively, it's possible your computer/node is running out of RAM. You can subsample your files with seqkit head or a similar command to check this.
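As one way to do this, the sketch below uses plain `head` as the "similar command" and assumes standard 4-line FASTQ records; `fastq_head` is a hypothetical helper. The seqkit equivalent is noted in a comment.

```shell
# Hypothetical helper: keep the first N reads of an uncompressed FASTQ file.
# Assumes each record spans exactly 4 lines (no wrapped sequences).
# Roughly equivalent to: seqkit head -n N in.fq -o out.fq
fastq_head() {
  n_reads="$1"
  in_fq="$2"
  out_fq="$3"
  head -n "$((n_reads * 4))" "$in_fq" > "$out_fq"
}

# Usage (sample name is a placeholder); use the same N for both mates so
# the pairs stay in sync:
#   fastq_head 100000 sample_1.fq sub_1.fq
#   fastq_head 100000 sample_2.fq sub_2.fq
```

If the subsampled pair runs to completion while the full files fail, memory or temporary disk space is a more likely culprit than the database path.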

Mike