I am having trouble identifying viral contigs in a metagenomic dataset. I am running VirSorter2 on ~130 metagenomic samples, using metagenome-assembled scaffolds as the input. Some samples ran to completion and returned the expected .tsv outputs, but 32 samples failed. I reran each of the failed samples interactively in an HPC environment, and all of them returned the following traceback:
$ singularity exec ${VIRSORTER2_DIRECTORY}/virsorter2.sif virsorter run -w /storage/home/${USER}/scratch/virsorter2/SRR6122134 -i /storage/home/${USER}/scratch/assembly/SRR6122134/spades/scaffolds.fasta -j 15 all --latency-wait 1200
[2024-03-11 15:43 INFO] VirSorter 2.2.3
[2024-03-11 15:43 INFO] /usr/local/bin/virsorter run -w /storage/home/${USER}/scratch/virsorter2/SRR6122134 -i /storage/home/${USER}/scratch/assembly/SRR6122134/spades/scaffolds.fasta -j 15 all --latency-wait 1200
[2024-03-11 15:43 INFO] Using /usr/local/lib/python3.9/site-packages/virsorter/template-config.yaml as config template
[2024-03-11 15:43 INFO] conig file written to /scratch/${USER}/virsorter2/SRR6122134/config.yaml
[2024-03-11 15:43 INFO] Executing: snakemake --snakefile /usr/local/lib/python3.9/site-packages/virsorter/Snakefile --directory /scratch/${USER}/virsorter2/SRR6122134 --jobs 15 --configfile /scratch/${USER}/virsorter2/SRR6122134/config.yaml --latency-wait 600 --rerun-incomplete --nolock --conda-frontend mamba --conda-prefix /db/conda_envs --use-conda --quiet all --latency-wait 1200
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/snakemake/__init__.py", line 662, in snakemake
success = workflow.execute(
File "/usr/local/lib/python3.9/site-packages/snakemake/workflow.py", line 690, in execute
dag.check_incomplete()
File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 301, in check_incomplete
incomplete = self.incomplete_files
File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 421, in incomplete_files
chain(
File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 422, in <genexpr>
*(
File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 209, in incomplete
return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 209, in <lambda>
return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 207, in marked_incomplete
return self._read_record(self._metadata_path, f).get("incomplete", False)
File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 326, in _read_record_cached
return self._read_record_uncached(subject, id)
File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 332, in _read_record_uncached
return json.load(f)
File "/usr/local/lib/python3.9/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.9/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Any help on how to proceed would be greatly appreciated. Note that I had previously gone through the log files for some of these samples and was advised to increase the latency wait, since there may have been issues with writing and reading files on the HPC filesystem. That is why I included the snakemake option --latency-wait.
Also of note: I am rerunning virsorter2 into an existing working directory left over from earlier runs of these samples that ended in errors, so the new run overwrites the old output. I am not sure whether that could cause an issue with snakemake.
Hi, this is a known bug in snakemake when re-running in an existing working directory. You can remove the following directory to solve the problem: /storage/home/${USER}/scratch/virsorter2/SRR6122134/.snakemake/metadata/.
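A minimal sketch of that cleanup, in case it helps when repeating it for each failed sample. The function name clear_snakemake_metadata is hypothetical and not part of VirSorter2 or snakemake. The zero-byte count is only a diagnostic: the traceback fails at char 0 of a metadata record, which is consistent with an empty file, but that cause is an assumption here.

```shell
# Hypothetical helper (not part of VirSorter2 or snakemake): remove the
# .snakemake/metadata directory inside one VirSorter2 working directory.
clear_snakemake_metadata() {
    workdir="$1"
    meta="${workdir}/.snakemake/metadata"

    if [ ! -d "$meta" ]; then
        echo "no metadata directory at ${meta}; nothing to do"
        return 0
    fi

    # Assumption: an empty record is what makes json.load fail at char 0.
    # Count zero-byte files purely as a diagnostic before deleting.
    empty=$(find "$meta" -type f -size 0 | wc -l | tr -d ' ')
    echo "${empty} empty metadata record(s) under ${meta}"

    # Remove only the metadata records; the rest of the working
    # directory (including existing outputs) is left untouched.
    rm -rf "$meta"
}

# Usage, with the working directory from the command in the original post:
# clear_snakemake_metadata "/storage/home/${USER}/scratch/virsorter2/SRR6122134"
```

After the directory is removed, rerun the same virsorter run command for that sample. Wrapping the call in a loop over the failed sample IDs would cover all 32 working directories.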