This is with CUDA on the cluster, running the command snakemake -p --profile config/slurm/hemera imagenette2_train. Snakemake itself fails with an error. A bunch of jobs start, then this happens:
Traceback (most recent call last):
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/__init__.py", line 722, in snakemake
    success = workflow.execute(
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/workflow.py", line 1110, in execute
    success = self.scheduler.schedule()
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/scheduler.py", line 421, in schedule
    self._finish_jobs()
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/scheduler.py", line 524, in _finish_jobs
    self.get_executor(job).handle_job_success(job)
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 875, in handle_job_success
    super().handle_job_success(
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 232, in handle_job_success
    job.postprocess(
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/jobs.py", line 1091, in postprocess
    self.dag.workflow.persistence.finished(
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/persistence.py", line 239, in finished
    starttime = self._read_record(self._metadata_path, f).get(
  File "/home/pape58/Code/sota_on_uncertainties/venv/lib/python3.9/site-packages/snakemake/persistence.py", line 423, in _read_record_uncached
    return json.load(f)
  File "/trinity/shared/pkg/devel/python/3.9.6/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/trinity/shared/pkg/devel/python/3.9.6/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/trinity/shared/pkg/devel/python/3.9.6/lib/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 1164 (char 1163)
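A note on the last line of the traceback: "Extra data" is what Python's json module reports when it has parsed one complete JSON value and then finds more text after it. The sketch below reproduces that message and scans a directory for files that fail to parse. The helper name find_corrupt_json is made up for illustration, and the guess that the affected record lives under .snakemake/metadata is an assumption based on the _metadata_path attribute in the traceback, not something confirmed here.

```python
import json
from pathlib import Path


def find_corrupt_json(directory):
    """Return (path, error message) for every file under `directory` that is not valid JSON.

    Hypothetical helper for illustration; it is not part of Snakemake.
    """
    corrupt = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        try:
            json.loads(path.read_text())
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            corrupt.append((path, str(err)))
    return corrupt


# "Extra data" is raised when a complete JSON value is followed by more text,
# which is what two records written into the same file would look like.
try:
    json.loads('{"starttime": 1}{"starttime": 2}')
except json.JSONDecodeError as err:
    extra_data_message = str(err)
```

Assuming the metadata records are stored as plain JSON files in that directory, calling find_corrupt_json(".snakemake/metadata") from the workflow directory should list the record that triggers the error. It does not explain how the file ended up in that state.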
The output right before that (though I'm not sure if it is related or if the results of some other job were collected at that point):