Closed — srynobio closed this issue 2 years ago
hi Shawn,
just to verify, you have --bind /scratch or something similar in your singularity call?
I fixed a problem with TMPDIR in 0544428 but that was in v0.2.2 so you should have it. And you would have seen an error about that, I think.
If you export TMPDIR, then slivar will use it.
Yes I pass runOptions ="-B /scratch/:/scratch -B /uufs/:/uufs"
within nextflow.
Also, just checking: the tmp default here would not be the issue then, correct?
that default is overridden here: https://github.com/brentp/slivar/blob/master/src/slivar.nim#L97
you can make sure to export TMPDIR=/path/to/somewhere/big/
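A minimal sketch of pointing slivar at a large temporary directory. The path here is illustrative; the SINGULARITYENV_ prefix is Singularity's standard mechanism for passing environment variables into a container, which matters when the export happens outside the singularity call:

```shell
# Illustrative scratch path; point this at a filesystem with enough free space.
export TMPDIR=/tmp/slivar-tmp-demo
mkdir -p "$TMPDIR"

# Variables prefixed with SINGULARITYENV_ are exported into the
# container's environment, so TMPDIR survives the container boundary.
export SINGULARITYENV_TMPDIR="$TMPDIR"

echo "$TMPDIR"
```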
Yes, here is the bash script that runs the job, I added the echo to confirm.
#!/bin/bash -ue
export TMPDIR="/scratch/local/$USER/$SLURM_JOB_ID"
echo $TMPDIR
slivar expr --vcf my-start.vcf --pass-only -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip --info "INFO.gnomad311_AF < 0.02" --out-vcf my-end.vcf
And the result:
/scratch/local/ucgd-pepipeline/3888204
> slivar version: 0.2.3 d8fb9a077792f6e76384321b67c88ca47a3c1e4a
[slivar] 3 samples matched in VCF and PED to be evaluated
zipfiles.nim(54) open
Error: unhandled exception: Not a zip archive
error opening /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip [IOError]
I also confirmed the rw access to the zip file while on the node.
I know this is likely an out-there question, but does the nim zip library have any Linux-based zip requirement? I only ask because when I run zip within the container I get the following:
FATAL: "zip": executable file not found in $PATH
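Even without a zip binary in the container, the file's signature can be checked directly: every zip archive starts with the two bytes "PK" (0x50 0x4B). A sketch using only head, with a demo file standing in for the real archive:

```shell
# Demo file standing in for the real gnotate archive (illustrative path).
printf 'PK\003\004' > /tmp/demo.zip

# Read the first two bytes; no zip/unzip binary required.
sig=$(head -c 2 /tmp/demo.zip)
if [ "$sig" = "PK" ]; then
    echo "looks like a zip archive"
else
    echo "not a zip archive: $sig"
fi
```

Running this against the real gnomad311.zip path inside the container would distinguish a corrupt/truncated file from a bind-mount problem (a missing bind would give a "no such file" error instead).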
no. that's included in the executable.
are you sure your full path resolves to something that's --bind-ed? or are there symlinks or something in the directory?
does it work with other slivar zip files?
does whatever user the container or workflow is running as have permission to read that file?
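The permission and symlink questions above can be checked from inside the container with a couple of one-liners; the file path here is a stand-in for the real archive:

```shell
# Illustrative file standing in for the gnotate zip inside the container.
f=/tmp/demo-gnotate.zip
printf 'PK' > "$f"

# Is the file readable by the user the workflow runs as?
[ -r "$f" ] && echo "readable"

# Resolve any symlinks: the *real* path is what must fall under a
# --bind mount for Singularity to expose it.
readlink -f "$f"
```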
Yes, the user who runs it has full permissions to the data. As a test, I both ran stat on the file and used a different zip file, all within the container:
$> bash .command.run
~~> Path to TMPDIR
/scratch/local/ucgd-pepipeline/3891792
~~> stat on the file
File: `/scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip'
Size: 18317498651 Blocks: 35776408 IO Block: 4194304 regular file
Device: fe53f016h/4266913814d Inode: 144116733509189855 Links: 2
Access: (0660/-rw-rw----) Uid: (60332/ UNKNOWN) Gid: (4000386/proj_UCGD)
Access: 2021-06-18 13:07:19.000000000 -0600
Modify: 2021-04-30 00:42:29.000000000 -0600
Change: 2021-06-18 00:06:05.000000000 -0600
~~> Run with different zip file
$> slivar expr --vcf my-start.vcf --pass-only -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad.hg38.v2.zip --info "INFO.gnomad311_AF < 0.02" --out-vcf my-end.vcf
[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02
[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02
[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02
...
ok. so it works on the standard slivar zip files (you just specified an unknown attribute).
There must be something different about how that zip was created. What was the command used to create the gnomad311.zip file?
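For reference, gnotate zip files are normally produced with slivar's make-gnotate subcommand. The prefix, field mapping, and input file names below are illustrative assumptions, not the command actually used here; check `slivar make-gnotate --help` for the exact flags on your version:

```
# Sketch only: build a gnotate zip from gnomAD site VCFs.
slivar make-gnotate \
    --prefix gnomad311 \
    --field AF:gnomad311_AF \
    gnomad.genomes.v3.1.1.sites.chr*.vcf.bgz
```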
My thoughts align with yours that the issue is how the file was created; I'll follow up on that and report the results here.
you didn't show the version of slivar used in the container. make sure it's the latest. (v0.2.3)
I am also able to run this:
slivar expr -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip -v /scratch/ucgd/lustre/common/data/GnomAD/r3.1.1/gnomad.genomes.v3.1.1.sites.chr1.vcf.bgz -o x.vcf
Yes, I'm running 0.2.3
maybe I'm building it differently in the container. I would try running:
wget https://github.com/brentp/slivar/releases/download/v0.2.3/slivar
chmod +x ./slivar
export PATH=.:$PATH
{slivar command}
in the container?
I created gnomad311.zip using slivar v0.2.1 and the command line shown in Issue 90 (which worked once I changed my output directory).
I have been using gnomad311.zip successfully on the interactive nodes. As Brent suggests, my Slivar script includes
export TMPDIR=/scratch/local
which gives it 1 TB of tmp space. Note that gnomad311.zip is much larger (17 GB) than any of the other gnotate zip files we use (all <4 GB).
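Given the 17 GB archive, it is worth confirming up front that the filesystem behind TMPDIR has room, assuming slivar stages temporary data there as described above. A hedged sketch (the 20 GB threshold is illustrative):

```shell
# Check that TMPDIR exists and report free space on its filesystem.
export TMPDIR=${TMPDIR:-/tmp}
mkdir -p "$TMPDIR"

# Free space in kilobytes (POSIX-portable df output, second line, 4th column).
avail_kb=$(df -Pk "$TMPDIR" | awk 'NR==2 {print $4}')
echo "free in $TMPDIR: ${avail_kb} KB"

# Illustrative threshold: warn if under ~20 GB free.
if [ "$avail_kb" -lt $((20 * 1024 * 1024)) ]; then
    echo "WARNING: less than 20 GB free in $TMPDIR"
fi
```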
@brentp I tested your build suggestion, but it had the same result. I've narrowed it down to happening only within the container: I downloaded the newest version of slivar as a module and it ran on the node. So something in the container build is not allowing it to work correctly.
@srynobio as you likely know you can use:
process {
clusterOptions = '--gres=tmpspace:85G '
}
to make sure you have sufficient TMP space. I'm quite sure that slivar respects the given TMPDIR.
@brentp I will run that option in our test environment and update this issue.
closing as resolved. please let me know if that's not the case.
I'm running into an issue where slivar can't read a gnomad zip file while running via a container when we launch to our HPC cluster environment.
However, if I run the same command (outside the container) on our interactive nodes, it seems to work.
I'm not absolutely sure of the reason for this, but I think it could be related to how /tmp is defined here, because (correct me if I'm wrong) you open the zip file in /tmp for processing. If that is true, and given that our HPC cluster environment uses atypical tmp directory formatting (TMPDIR="/scratch/local/$USER/$SLURM_JOB_ID"), would it be possible to add the ability to pass a user-defined tmp directory?