Closed ktpolanski closed 6 months ago
Hi @ktpolanski
Sorry that you're having issues with the workflow. The error OSError: sam_write1 failed with error code -1 is sometimes seen when there is no disk space left. I'm not sure that is the issue here, as -resume'ing worked. Could you please check that you have enough disk space, and that there is space left in $TMPDIR?
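One way to run that check from the shell is sketched below. It assumes a POSIX-style `df`. The helper name `free_kb` and the 10 GiB threshold are made up for illustration, and the fallback to `/tmp` reflects the usual convention when `TMPDIR` is unset rather than anything specific to this workflow.

```shell
#!/usr/bin/env bash
# free_kb DIR: print the available space (in KiB) on the filesystem holding DIR.
# Hypothetical helper, not part of the workflow.
free_kb() {
    # -P forces one line per filesystem; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Many tools fall back to /tmp when TMPDIR is unset (assumption for this sketch).
tmp_dir="${TMPDIR:-/tmp}"
min_kb=$((10 * 1024 * 1024))   # arbitrary 10 GiB threshold, adjust to taste

for dir in "$PWD" "$tmp_dir"; do
    avail=$(free_kb "$dir")
    if [ "$avail" -lt "$min_kb" ]; then
        echo "LOW: $dir has only ${avail} KiB free"
    else
        echo "OK:  $dir has ${avail} KiB free"
    fi
done
```

Checking the temporary directory separately matters because it can live on a different, much smaller filesystem than the working directory.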
I definitely had space on the drive itself. My $TMPDIR is currently unset. I guess I can try the export TMPDIR=/some/path/on/large/drive suggestion that came up in a different issue if I run into this again?
Yes, that's what I would try first. Please let me know if that does not work for you.
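Spelled out, that suggestion might look like the sketch below. The helper name `use_tmpdir`, the `SCRATCH_DIR` variable, and the default path are placeholders for illustration; in practice the directory would be one on the large drive. Note that the shell does not accept spaces around `=` in an assignment.

```shell
#!/usr/bin/env bash
# use_tmpdir DIR: create DIR if needed, confirm it is writable, then export TMPDIR.
# Hypothetical helper; the path passed in below is a placeholder.
use_tmpdir() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    # mktemp fails if the directory cannot actually be written to.
    local probe
    probe=$(mktemp "$dir/probe.XXXXXX") || return 1
    rm -f "$probe"
    export TMPDIR="$dir"    # no spaces around '='
}

# SCRATCH_DIR is an assumed variable; point it at a directory on the large drive.
use_tmpdir "${SCRATCH_DIR:-/tmp/nf_tmp}" && echo "TMPDIR is now $TMPDIR"
```

The export has to happen in the same shell session (or script) that launches nextflow, otherwise the workflow will not inherit it.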
So I tried a combination of things - dialled back the resource use further in the config in case that matters, and added the TMPDIR as suggested, just ahead of the nextflow call:
export TMPDIR=/mnt/tmpdir
~/nextflow-23.12.0-edge-all run epi2me-labs/wf-single-cell \
--fastq fastq/ \
--kit_name multiome \
--kit_version v1 \
--expected_cells 5000 \
--ref_genome_dir /home/ubuntu/cellranger/GRCh38-2020-A/ \
--sample $SAMPLE \
-c openstack.cfg \
-profile standard
Unfortunately I encountered the same error again. I'd like to think it's not space related because the drive has over a terabyte free on it right now:
$ df -h
Filesystem Size Used Avail Use% Mounted on
[...]
/dev/vdb 2.0T 790G 1.1T 42% /mnt
[...]
Of note, I'm rerunning the samples with a slightly modified (#85) v1.1.0 under singularity, and the stringtie jobs all went through without a hiccup.
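For anyone still hitting the intermittent failure on an older version, the manual -resume cycle described in this issue could be automated with a small wrapper. This is only a sketch: `retry_resume` is a hypothetical helper, not part of Nextflow, and it assumes the failure is transient, so blind retries may hide a genuine problem.

```shell
#!/usr/bin/env bash
# retry_resume MAX CMD...: run CMD, and on failure rerun it with -resume appended,
# up to MAX attempts in total. Hypothetical helper, not part of Nextflow.
retry_resume() {
    local max="$1"; shift
    local attempt=1
    # First attempt runs the command exactly as given.
    "$@" && return 0
    while [ "$attempt" -lt "$max" ]; do
        attempt=$((attempt + 1))
        echo "Attempt $attempt of $max: resuming" >&2
        # Later attempts add -resume so completed tasks are reused from the cache.
        "$@" -resume && return 0
    done
    return 1
}

# Example call (commented out; arguments abbreviated from the command in this thread):
# retry_resume 5 nextflow run epi2me-labs/wf-single-cell --fastq fastq/ -c openstack.cfg -profile standard
```

Capping the number of attempts keeps a persistent failure from looping forever.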
Operating System
Other Linux (please specify below)
Other Linux
18.04.4
Workflow Version
v1.0.1-ga6a1b69
Workflow Execution
Command line
EPI2ME Version
No response
CLI command run
Workflow Execution - CLI Execution Profile
standard (default)
What happened?
I'm running the workflow locally on an OpenStack instance with a decent number of cores and RAM. The stringtie process appears to somehow exhaust the instance's resources in a strange way. Once my first sample got tanked by this, I went into the docs and found the recommendation to make a local config. I did so, with fewer cores (23 vs 26) and less RAM (200GB vs ~220GB) than the instance has available. Three of the four samples ran fine, including past the stringtie step, but one kept snagging there repeatedly.
I was able to get everything across the finish line by -resume'ing the pipeline, sometimes repeatedly. Still, it would be nice to not have to babysit this.
Attached below is the .command.err for an example erroring job.
Relevant log output
Application activity log entry
No response