jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

JSONDecodeError #192

Open shaconn opened 8 months ago

shaconn commented 8 months ago

Hello,

I am having trouble identifying viral contigs in a metagenomic dataset. I am running VirSorter2 on ~130 metagenomic samples, using metagenome-assembled scaffolds as the input. Some of the samples ran to completion and returned the expected .tsv outputs, but 32 samples failed. I reran each of the failed samples interactively in an HPC environment, and all of them returned the following traceback:

$ singularity exec ${VIRSORTER2_DIRECTORY}/virsorter2.sif virsorter run -w /storage/home/${USER}/scratch/virsorter2/SRR6122134 -i /storage/home/${USER}/scratch/assembly/SRR6122134/spades/scaffolds.fasta -j 15 all --latency-wait 1200
[2024-03-11 15:43 INFO] VirSorter 2.2.3
[2024-03-11 15:43 INFO] /usr/local/bin/virsorter run -w /storage/home/${USER}/scratch/virsorter2/SRR6122134 -i /storage/home/${USER}/scratch/assembly/SRR6122134/spades/scaffolds.fasta -j 15 all --latency-wait 1200
[2024-03-11 15:43 INFO] Using /usr/local/lib/python3.9/site-packages/virsorter/template-config.yaml as config template
[2024-03-11 15:43 INFO] conig file written to /scratch/${USER}/virsorter2/SRR6122134/config.yaml

[2024-03-11 15:43 INFO] Executing: snakemake --snakefile /usr/local/lib/python3.9/site-packages/virsorter/Snakefile --directory /scratch/${USER}/virsorter2/SRR6122134 --jobs 15 --configfile /scratch/${USER}/virsorter2/SRR6122134/config.yaml --latency-wait 600 --rerun-incomplete --nolock  --conda-frontend mamba --conda-prefix /db/conda_envs --use-conda    --quiet  all  --latency-wait 1200 
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/snakemake/__init__.py", line 662, in snakemake
    success = workflow.execute(
  File "/usr/local/lib/python3.9/site-packages/snakemake/workflow.py", line 690, in execute
    dag.check_incomplete()
  File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 301, in check_incomplete
    incomplete = self.incomplete_files
  File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 421, in incomplete_files
    chain(
  File "/usr/local/lib/python3.9/site-packages/snakemake/dag.py", line 422, in <genexpr>
    *(
  File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 209, in incomplete
    return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
  File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 209, in <lambda>
    return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
  File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 207, in marked_incomplete
    return self._read_record(self._metadata_path, f).get("incomplete", False)
  File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 326, in _read_record_cached
    return self._read_record_uncached(subject, id)
  File "/usr/local/lib/python3.9/site-packages/snakemake/persistence.py", line 332, in _read_record_uncached
    return json.load(f)
  File "/usr/local/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Any help on how to proceed would be greatly appreciated. Note that I had previously gone through the log files for some of these samples and was directed to increase the latency wait, since there may have been issues with writing and reading files on the HPC. That is why I included the snakemake option --latency-wait.

Also of note: I am rerunning VirSorter2 into an existing working directory where earlier runs of these samples had failed, so the new output overwrites the old. I am not sure whether that could cause an issue with snakemake.

jiarong commented 8 months ago

Hi, this is a known bug in snakemake when re-running things. You can remove the following directory to solve the problem: /storage/home/${USER}/scratch/virsorter2/SRR6122134/.snakemake/metadata/.
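
If you would rather see which records are broken before deleting the whole directory, here is a minimal Python sketch. It assumes the metadata directory holds one JSON record per file, which is what the `json.load(f)` call in the traceback suggests; I have not checked this against every snakemake version. The function names `find_unreadable_records` and `remove_unreadable_records` are made up for illustration and are not part of snakemake or VirSorter2.

```python
import json
from pathlib import Path


def find_unreadable_records(workdir):
    """List files under <workdir>/.snakemake/metadata that do not parse as JSON.

    Assumption: each metadata record is a standalone JSON file. An empty or
    truncated file raises the same JSONDecodeError shown in the traceback.
    """
    metadata_dir = Path(workdir) / ".snakemake" / "metadata"
    unreadable = []
    if not metadata_dir.is_dir():
        return unreadable
    for path in sorted(metadata_dir.rglob("*")):
        if not path.is_file():
            continue
        try:
            with path.open() as handle:
                json.load(handle)
        except (json.JSONDecodeError, UnicodeDecodeError):
            unreadable.append(path)
    return unreadable


def remove_unreadable_records(workdir, dry_run=True):
    """Delete the unreadable records; with dry_run=True only report them."""
    unreadable = find_unreadable_records(workdir)
    if not dry_run:
        for path in unreadable:
            path.unlink()
    return unreadable
```

Calling `remove_unreadable_records(workdir)` with the same path passed to `-w` only lists the affected files; pass `dry_run=False` to delete them. Removing the whole metadata directory as described above is simpler and is the fix I would try first.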