AnantharamanLab / ViWrap

A wrapper to identify, bin, classify, and predict host-viral relationship for viruses

`FileNotFoundError`: 'ViWrap_out/02_vRhyme_outdir/vRhyme_best_bins_fasta_CheckV_result/CheckV_quality_summary.txt' #15

Open Sidduppal opened 1 year ago

Sidduppal commented 1 year ago

Hey, thanks for building the pipeline. I am getting the following error while running the tool:

[2023-06-12T16:24:39Z INFO  coverm] CoverM version 0.6.1
[2023-06-12T16:24:39Z INFO  coverm] Using min-read-percent-identity 97%
[2023-06-12T16:30:46Z INFO  coverm] CoverM version 0.6.1
[2023-06-12T16:30:46Z INFO  coverm] Setting single read percent identity threshold at 0.97 for MetaBAT adjusted coverage, and not filtering out supplementary, secondary and improper pair alignments
[2023-06-12T16:30:46Z INFO  coverm] Using min-covered-fraction 0%
[2023-06-12T16:32:17Z INFO  coverm::contig] In sample 'T2R1UO_CKDN220061238-1A_HK7HNDSX5_L1.filtered', found 139477497 reads mapped out of 139477497 total (100.00%)

Traceback (most recent call last):
  File "/media/BRIANDATA2/sidd/third_party_tool/ViWrap/ViWrap", line 173, in <module>
    output = cli()
  File "/media/BRIANDATA2/sidd/third_party_tool/ViWrap/ViWrap", line 167, in cli
    args["func"](args)
  File "/media/BRIANDATA2/sidd/third_party_tool/ViWrap/scripts/master_run.py", line 422, in main
    scripts.module.parse_checkv_result(vRhyme_best_bin_CheckV_result, CheckV_quality_summary)
  File "/media/BRIANDATA2/sidd/third_party_tool/ViWrap/scripts/module.py", line 413, in parse_checkv_result
    f = open(outfile, "w")
FileNotFoundError: [Errno 2] No such file or directory: '/media/BRIANDATA2/sidd/soil_wgs/ViWrap/T2R1UO/ViWrap_out/02_vRhyme_outdir/vRhyme_best_bins_fasta_CheckV_result/CheckV_quality_summary.txt'

I have tried running `export CHECKVDB=/path/to/checkv-db` as well as `conda env config vars set CHECKVDB=/path/to/checkv-db`, as suggested here. The CheckV database seems to be correctly configured, since I'm able to run the test dataset past this step. The only difference I see between the test data and my data is that the test data uses a SPAdes assembly while I'm using a MEGAHIT assembly.

Can you please look into it? The same error has been reported in two other issues, #11 and #6. Any help will be appreciated.
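
A note on the last traceback frame: it is a plain `open(outfile, "w")`, and in Python a write-mode `open` raises `FileNotFoundError` when the parent directory does not exist, not when the file itself is missing. The minimal sketch below reproduces that behavior. The temporary paths and the `can_create` helper are hypothetical stand-ins for illustration, not ViWrap code.

```python
import os
import tempfile


def can_create(path):
    """Return True if open(path, 'w') succeeds, False on FileNotFoundError."""
    try:
        with open(path, "w") as f:
            f.write("ok\n")
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for .../vRhyme_best_bins_fasta_CheckV_result/
    result_dir = os.path.join(tmp, "CheckV_result")
    summary = os.path.join(result_dir, "CheckV_quality_summary.txt")

    missing_parent = can_create(summary)    # parent directory not created yet
    os.makedirs(result_dir, exist_ok=True)
    existing_parent = can_create(summary)   # parent directory now exists

print(missing_parent, existing_parent)
```

So the error message suggests that the `vRhyme_best_bins_fasta_CheckV_result` directory was never created, which points to an earlier step rather than to the summary file itself.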

quliping commented 1 year ago

I also encountered this problem when I ran the test data with the parameter '--identify_method vs'. I found that no viral bins were generated by vRhyme, so CheckV was unable to complete. The error was therefore caused by vRhyme rather than CheckV. You can check the vRhyme output directory and read the log file to see whether any viral bins were generated. The problem was solved when I used my own data: viral bins were generated and this step passed. Maybe not every virus identification method is suitable for your data, which would leave too few sequences for binning. If your error has the same cause, you can try another virus identification method; if no method or parameter works, you may have to abandon your data or use another virus identification pipeline that doesn't need the viral binning step.
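
The check described above (did vRhyme produce any bins?) can be scripted. This is only a rough sketch: the directory name `vRhyme_best_bins_fasta`, the FASTA suffixes, and the `count_bin_fastas` helper are assumptions made for illustration, so adjust them to whatever your `02_vRhyme_outdir` actually contains.

```python
import os
import tempfile

# Assumed FASTA suffixes; adjust to match the files vRhyme wrote in your run.
FASTA_SUFFIXES = (".fasta", ".fa", ".fna")


def count_bin_fastas(bin_dir):
    """Count non-empty FASTA files in bin_dir; return 0 if the directory is missing."""
    if not os.path.isdir(bin_dir):
        return 0
    count = 0
    for name in os.listdir(bin_dir):
        path = os.path.join(bin_dir, name)
        if name.endswith(FASTA_SUFFIXES) and os.path.getsize(path) > 0:
            count += 1
    return count


# Self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    bins = os.path.join(tmp, "vRhyme_best_bins_fasta")  # assumed directory name
    no_dir = count_bin_fastas(bins)        # directory does not exist
    os.makedirs(bins)
    empty_dir = count_bin_fastas(bins)     # directory exists but holds no bins
    with open(os.path.join(bins, "bin_1.fasta"), "w") as f:
        f.write(">contig_1\nACGT\n")
    one_bin = count_bin_fastas(bins)       # one non-empty bin present

print(no_dir, empty_dir, one_bin)
```

On a real run you would call `count_bin_fastas` on the bin directory under `ViWrap_out/02_vRhyme_outdir/`; a result of 0 would be consistent with the explanation that binning produced nothing for CheckV to work on.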

tynot commented 1 month ago


Your answer is helpful.