Closed: SaraiFinks closed this issue 1 year ago
When reinstalling this version of MetaPhage and running:
mamba env create -n metaphage2 --file "$METAPHAGE_DIR"/deps/env-v2.yaml
I get:
r-cli ==3.0.1 r41hc72bb7e_0 does not exist (perhaps a typo or a missing channel);
Hi, the r-cli error comes from the version pinned for that package. In the env-v2.yaml file, replace the r-cli line you have with this one:
r-cli=3.3.0=r41h7525677_0
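The replacement above can also be scripted. This is a minimal sketch, assuming the environment file lists r-cli as an ordinary YAML list entry (a line starting with "- r-cli"); the function name pin_r_cli is made up for illustration and is not part of MetaPhage.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetaPhage): swap the r-cli pin in a
# conda environment file, keeping a .bak copy of the original.
pin_r_cli() {
    local env_file="$1"
    # Default to the build suggested in this thread.
    local new_pin="${2:-r-cli=3.3.0=r41h7525677_0}"
    # Match a list entry whose package name is exactly "r-cli"
    # (followed by "=", a space, or end of line), so that packages
    # such as r-clipr are left alone.
    sed -i.bak -E "s|^([[:space:]]*-[[:space:]]*)r-cli([= ].*)?\$|\1${new_pin}|" "$env_file"
}
```

It would be called as pin_r_cli "$METAPHAGE_DIR"/deps/env-v2.yaml before re-running the mamba env create command. If the file does not match the assumed layout, nothing is changed, so it is worth checking the result with grep r-cli afterwards.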
Hi,
Thanks! Revising the env-v2.yaml file worked.
Now getting these errors running the following script:
"$METAPHAGE_DIR"/bin/python/db_manager.py -o "$METAPHAGE_DIR"/db/ -m 6 -r 2022.1
📂 Downloading bundle 2022.1: 7 databases total
phix found: skipping
kraken2 found: skipping
phigaro found: skipping
vibrant found: skipping
virsorter found: skipping
📦 Preparing to download CheckV database
📦 Preparing to download vConTACT2
Error downloading https://s3.climb.ac.uk/ifrqmra-metaphage/v2.0/checkv.tar.gz:
HTTP Error 404: Not Found
✅ CheckV database downloaded
tar (child): checkv.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
rm: cannot remove 'checkv.tar.gz': No such file or directory
✅ vConTACT2 downloaded
tar (child): inphared.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
rm: cannot remove 'inphared.tar.gz': No such file or directory
2 packages downloaded in 12.75 seconds (cumulative 13.09 seconds)
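In the log above, the download fails with a 404 but the tar and rm steps still run on an archive that was never written. A guard like the following would stop at the first problem instead. It is only a sketch: extract_if_present is a hypothetical name, and it is not how db_manager.py itself is written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a downloaded archive only if it really
# exists and is non-empty, then delete it.
extract_if_present() {
    local archive="$1" dest="$2"
    if [ ! -s "$archive" ]; then
        echo "ERROR: $archive is missing or empty; the download probably failed" >&2
        return 1
    fi
    # The archive is removed only after a successful extraction.
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest" && rm -f "$archive"
}
```

Combined with curl's --fail option, which returns a non-zero status on HTTP errors such as 404, a manual download could look like: curl -fL -o checkv.tar.gz "$URL" && extract_if_present checkv.tar.gz "$DEST_DIR", where URL and DEST_DIR are placeholders to fill in.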
These errors occur because the download links for the databases no longer work.
Open metaphage/db and create whichever of the following folders is missing.
The folder names in db have to be exactly these: "checkv" "inphared" "kraken2" "phigaro" "phix" "vibrant" "virsorter"
Then download the missing databases manually (search for them online, e.g. "inphared database"), unpack them, and place the contents in the matching folder under /db.
Hope it helps. Let me know if you can't find the databases.
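To see which database folders still need to be filled by hand, something like this can help. It is a minimal sketch that assumes each database lives in a folder named exactly as listed above; missing_dbs is a hypothetical name, not a MetaPhage command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the expected database folders that are
# absent from the given db directory, one name per line.
missing_dbs() {
    local db_dir="$1" name
    for name in checkv inphared kraken2 phigaro phix vibrant virsorter; do
        [ -d "$db_dir/$name" ] || echo "$name"
    done
}
```

Running missing_dbs "$METAPHAGE_DIR"/db prints nothing when all seven folders exist. It only checks that the folders are present, not that their contents are complete.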
You can download an updated db_manager, with updated links, from here: https://raw.githubusercontent.com/MattiaPandolfoVR/MetaPhage/dev/bin/python/db_manager.py
Thanks JordanVV and telatin, I was able to install everything!
WARN: Killing pending tasks (4)
executor > local (44)
[4e/034777] process > csv_validator (Checking met... [100%] 1 of 1 ✔
[4c/db9b0a] process > db_manager (Downloading mis... [100%] 1 of 1 ✔
[c2/16fdb6] process > fastp (SRR8653245) [100%] 10 of 10 ✔
[0c/361b70] process > remove_phix (SRR8652969) [100%] 10 of 10 ✔
[07/ec88bb] process > kraken2 (SRR8652969) [100%] 8 of 8
[- ] process > krona [ 0%] 0 of 8
[00/177922] process > megahit (SRR8652969) [100%] 9 of 9
[- ] process > metaquast -
[9a/8a07d5] process > deepvirfinder (megahit-SRR8... [ 0%] 0 of 8
[- ] process > phigaro [ 0%] 0 of 9
[- ] process > vibrant [ 0%] 0 of 9
[de/42f554] process > virfinder (megahit-SRR8653090) [ 11%] 1 of 9, failed: 1
[- ] process > virsorter2 [ 0%] 0 of 9
[- ] process > cdhit -
[- ] process > checkV -
[- ] process > prodigal -
[- ] process > bowtie2_derep -
[- ] process > covtocounts2 -
[- ] process > diamond_vcontact2 -
[- ] process > vcontact2 -
[- ] process > graphanalyzer -
[- ] process > kraken_file -
[- ] process > miner_comparison -
[- ] process > checkv_table -
[- ] process > summary -
[- ] process > phylo_obj -
[- ] process > file_chopper -
[- ] process > taxonomy_table -
[- ] process > alpha_diversity -
[- ] process > beta_diversity -
[- ] process > heatmap -
[- ] process > violin_plots -
[- ] process > multiqc -
Error executing process > 'virfinder (megahit-SRR8653090)'
Caused by:
Process virfinder (megahit-SRR8653090) terminated with an error exit status (1)
Command executed:
Rscript /scratch/xxxx/MetaPhage/bin/Rscript/virfinder_execute.R SRR8653090_megahit_contigs.fasta 8 /scratch/xxxx/MetaPhage
mv results.txt SRR8653090_results.txt
mv viral_sequences.fasta SRR8653090_viral_sequences.fasta
since virfinder outputs a list of viral scaffolds headers, we need to collect these and extract the related viral sequence only if they respect the pvalue threshold
python /scratch/xxxxx/MetaPhage/bin/python/pvalue_virfinder.py SRR8653090
join the viral scaffold header with sequence
seqtk subseq ./SRR8653090_megahit_contigs.fasta ./SRR8653090_filtered_headers.txt > SRR8653090_viral_tmp_sequences.fasta
add miner flag at each fasta header
sed 's/^>/>virfinder_/' SRR8653090_viral_tmp_sequences.fasta > SRR8653090_viral_sequences.fasta
rm SRR8653090_viral_tmp_sequences.fasta
Command exit status: 1
Command output: (empty)
Command error:
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:Matrix’:
expand, unname
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:VirFinder’:
reverse
Loading required package: XVector
Loading required package: GenomeInfoDb
Error: package or namespace load failed for ‘GenomeInfoDb’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘GenomeInfoDbData’
Failed with error: ‘package ‘GenomeInfoDb’ could not be loaded’
Loading required package: parallel
Loading required package: Biostrings
Loading required package: GenomeInfoDb
Error: package or namespace load failed for ‘GenomeInfoDb’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘GenomeInfoDbData’
Failed with error: ‘package ‘GenomeInfoDb’ could not be loaded’
Error in readDNAStringSet(inFaFile) : could not find function "readDNAStringSet"
Calls: parVF.pred
Execution halted
Work dir: /scratch/xxxx/MetaPhage/work/de/42f554c8476418693669732ae2c46e
Tip: you can replicate the issue by changing to the process work dir and entering the command
bash .command.run
slurmstepd: error: JOB 4280770 STEPD TERMINATED ON p-sc-2022 AT 2023-06-27T16:06:27 DUE TO JOB NOT ENDING WITH SIGNALS
slurmstepd: error: Container 1369132 in cgroup plugin has 2 processes, giving up after 159 sec
Hi, SaraiFinks.
How did you fix this error? I'm getting the same one and can't find a way to fix it. Do I need a specific version of R? I'd appreciate your help.
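The R output in the virfinder error log states that GenomeInfoDb could not be loaded because there is no package called GenomeInfoDbData. The missing package names can be pulled out of such a log directly. This is only a diagnostic sketch, not a confirmed fix; missing_r_packages is a hypothetical name, and it assumes R's curly-quoted message format as it appears in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list R packages reported as missing in an error
# log, based on R's "there is no package called ‘name’" message.
missing_r_packages() {
    local log_file="$1"
    grep -o "there is no package called ‘[^’]*’" "$log_file" \
        | sed -e "s/.*‘//" -e "s/’\$//" \
        | sort -u
}
```

Nextflow stores a task's stderr in the .command.err file of its work dir, so the helper could be pointed at that file inside the work dir shown in the error. Whether a package is really absent can then be checked from inside the metaphage2 environment with Rscript -e 'library(GenomeInfoDbData)'. If that fails too, installing the package into the environment is the likely next step, although this thread does not confirm it.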