brentp / slivar

genetic variant expressions, annotation, and filtering for great good.

Issue using zip file with slivar docker #94

Closed · srynobio closed this 2 years ago

srynobio commented 3 years ago

I'm running into an issue where slivar can't read a gnomAD zip file when run via a container launched on our HPC cluster environment.

> slivar version: 0.2.3 d8fb9a077792f6e76384321b67c88ca47a3c1e4a
[slivar] 3 samples matched in VCF and PED to be evaluated
zipfiles.nim(54)         open
Error: unhandled exception: Not a zip archive
error opening /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip [IOError]

However, if I run the same command (outside the container) on our interactive nodes, it works:

[slivar] 3 samples matched in VCF and PED to be evaluated
[slivar] message for /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip:
   > created on:2021-04-30
[slivar] 10000 chr1:3879101 evaluated 10000 variants in 31.3 seconds (319.1/second)
[slivar] 20000 chr1:7823399 evaluated 10000 variants in 1.3 seconds (7728.4/second)
[slivar] 100000 chr1:41830473 evaluated 100000 variants in 12.7 seconds (7848.8/second)
[slivar] 200000 chr1:88362688 evaluated 100000 variants in 16.7 seconds (5988.6/second)
[slivar] 300000 chr1:123977991 evaluated 100000 variants in 12.2 seconds (8164.7/second)
[slivar] 400000 chr1:180526491 evaluated 100000 variants in 14.3 seconds (7008.5/second)
[slivar] 500000 chr1:224045487 evaluated 100000 variants in 14.2 seconds (7061.1/second)

I'm not absolutely sure of the reason, but I think it could be related to how /tmp is defined here, because (correct me if I'm wrong) you open the zip file under /tmp for processing. If that's true, and given that our HPC cluster environment uses atypically formatted tmp directories (TMPDIR="/scratch/local/$USER/$SLURM_JOB_ID"), would it be possible to add the ability to pass a user-defined tmp directory?

brentp commented 3 years ago

hi Shawn, just to verify: do you have --bind /scratch or something similar in your singularity call? I fixed a problem with TMPDIR in 0544428, but that was in v0.2.2, so you should have the fix. And you would have seen an error about that, I think. If you export TMPDIR, then slivar will use it.

srynobio commented 3 years ago

Yes, I pass runOptions = "-B /scratch/:/scratch -B /uufs/:/uufs" within Nextflow.
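
(For concreteness, those runOptions amount to Nextflow wrapping each task in a direct call roughly like the sketch below; the image name slivar.sif is illustrative.)

# equivalent direct invocation with the same bind mounts (image name hypothetical)
singularity exec -B /scratch/:/scratch -B /uufs/:/uufs slivar.sif \
    slivar expr --vcf my-start.vcf --pass-only \
    -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip \
    --info "INFO.gnomad311_AF < 0.02" --out-vcf my-end.vcf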

Also, just checking: the tmp default here would not be the issue then, correct?

brentp commented 3 years ago

that default is overridden here: https://github.com/brentp/slivar/blob/master/src/slivar.nim#L97. you can make sure to export TMPDIR=/path/to/somewhere/big/
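
(A quick way to confirm the exported TMPDIR actually reaches the containerized process; the image name is again illustrative, and this assumes the default Singularity behavior of passing the host environment through unless --cleanenv is set.)

export TMPDIR=/path/to/somewhere/big
singularity exec -B /scratch/:/scratch slivar.sif printenv TMPDIR
# should print /path/to/somewhere/big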

srynobio commented 3 years ago

Yes, here is the bash script that runs the job; I added the echo to confirm.

#!/bin/bash -ue
export TMPDIR="/scratch/local/$USER/$SLURM_JOB_ID"
echo $TMPDIR

slivar expr --vcf my-start.vcf --pass-only -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip --info "INFO.gnomad311_AF < 0.02" --out-vcf my-end.vcf

And the result:

/scratch/local/ucgd-pepipeline/3888204
> slivar version: 0.2.3 d8fb9a077792f6e76384321b67c88ca47a3c1e4a
[slivar] 3 samples matched in VCF and PED to be evaluated
zipfiles.nim(54)         open
Error: unhandled exception: Not a zip archive
error opening /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip [IOError]

I also confirmed the rw access to the zip file while on the node.

srynobio commented 3 years ago

I know this is likely an out-there question, but does the Nim zip library have any Linux-based zip requirement? The only reason I ask is that when I run zip within the container I get the following:

FATAL: "zip": executable file not found in $PATH

brentp commented 3 years ago

no. that's included in the executable. are you sure your full path resolves to something that's bind-mounted with --bind? or are there symlinks or something in the directory?

does it work with other slivar zip files?

does whatever user the container or workflow is running as have permission to read that file?
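
(One container-side check that needs no zip tooling at all: read the archive's magic bytes. A well-formed ZIP begins with the local-file-header signature PK\003\004, hex 50 4b 03 04, so anything else here would explain the "Not a zip archive" error.)

# run inside the container; head and od are coreutils, so no zip binary is needed
head -c 4 /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip | od -An -tx1
# expected for a valid zip: 50 4b 03 04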

srynobio commented 3 years ago

Yes, the user who runs it has full permissions on the data. As a test, I both ran stat on the file and used a different zip file, all within the container:

$> bash .command.run
~~> Path to TMPDIR
/scratch/local/ucgd-pepipeline/3891792

~~> stat on the file
  File: `/scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip'
  Size: 18317498651 Blocks: 35776408   IO Block: 4194304 regular file
Device: fe53f016h/4266913814d   Inode: 144116733509189855  Links: 2
Access: (0660/-rw-rw----)  Uid: (60332/ UNKNOWN)   Gid: (4000386/proj_UCGD)
Access: 2021-06-18 13:07:19.000000000 -0600
Modify: 2021-04-30 00:42:29.000000000 -0600
Change: 2021-06-18 00:06:05.000000000 -0600

~~> Run with different zip file
$> slivar expr --vcf my-start.vcf --pass-only -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad.hg38.v2.zip --info "INFO.gnomad311_AF < 0.02" --out-vcf my-end.vcf

[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02

[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02

[slivar] error evaluating info expression (this can happen if a field is missing):
error from duktape: unknown attribute:gnomad311_AF for expression:INFO.gnomad311_AF < 0.02

...

brentp commented 3 years ago

ok. so it works on the standard slivar zip files (you just specified an unknown attribute). There must be something different about how that zip was created. What was the command used to create gnomad311.zip?
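
(For reference, the expression syntax itself was fine; it just named a field the second archive doesn't provide. Something like the following would exercise the working zip, with the field name gnomad_popmax_af assumed from slivar's pre-built hg38 gnomAD archive and worth verifying against your copy.)

slivar expr --vcf my-start.vcf --pass-only \
    -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad.hg38.v2.zip \
    --info "INFO.gnomad_popmax_af < 0.02" \
    --out-vcf my-end.vcf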

srynobio commented 3 years ago

My thoughts align with yours: the issue is likely how the file was created. I'll follow up on that and report the results here.

brentp commented 3 years ago

you didn't show the version of slivar used in the container. make sure it's the latest (v0.2.3).

I am also able to run this:

slivar expr -g /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip -v /scratch/ucgd/lustre/common/data/GnomAD/r3.1.1/gnomad.genomes.v3.1.1.sites.chr1.vcf.bgz -o x.vcf

srynobio commented 3 years ago

Yes, I'm running 0.2.3

brentp commented 3 years ago

maybe I'm building it differently in the container. I would try running:

wget https://github.com/brentp/slivar/releases/download/v0.2.3/slivar
chmod +x ./slivar
export PATH=.:$PATH
{slivar command}

in the container.

seboyden commented 3 years ago

I created gnomad311.zip using slivar v0.2.1 and the command line shown in Issue 90 (which worked once I changed my output directory).
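
(The authoritative command line is the one in Issue 90; purely as a hedged sketch, an archive like this is built with slivar make-gnotate along the lines below, where the --field mapping and input file names are illustrative and may not match what was actually run.)

slivar make-gnotate --prefix gnomad311 \
    --field AF:gnomad311_AF \
    gnomad.genomes.v3.1.1.sites.chr*.vcf.bgz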

I have been using gnomad311.zip successfully on the interactive nodes. As Brent suggests, my Slivar script includes

export TMPDIR=/scratch/local

which gives it 1 TB of tmp space. Note that gnomad311.zip is much larger (17 GB) than any of the other gnotate zip files we use (all <4 GB).
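
(Given an archive that size, a pre-flight space check in the job script can rule out tmp exhaustion before slivar even opens the zip.)

# confirm the job-local tmp dir has headroom for the 17 GB archive
df -h "$TMPDIR"
ls -lh /scratch/ucgd/lustre/common/data/Slivar/db/gnomad311.zip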

srynobio commented 3 years ago

@brentp I tested your build suggestion, but it had the same result. I've narrowed it down to only happening within the container: I downloaded the newest version of slivar as a module and it ran on the node. So something about the container build is not allowing it to work correctly.

brentp commented 3 years ago

@srynobio as you likely know, you can use:

process {
    clusterOptions = '--gres=tmpspace:85G'
}

to make sure you have sufficient TMP space. I'm quite sure that slivar respects the given TMPDIR.

srynobio commented 3 years ago

@brentp I will run that option in our test environment and update this issue.

brentp commented 2 years ago

closing as resolved. please let me know if that's not the case.