dahak-metagenomics / dahak

benchmarking and containerization of tools for analysis of complex non-clinical metagenomes.
https://dahak-metagenomics.github.io/dahak
BSD 3-Clause "New" or "Revised" License

running quickstart tutorial fails: squashfs-tools not available #102

Open stephenturner opened 6 years ago

stephenturner commented 6 years ago

Trying to run the read filtering workflow at https://dahak-metagenomics.github.io/dahak/quickstart/#read-filtering

After setting up the environment as noted in #99 and issuing the following command (with --use-singularity):

snakemake --use-singularity -p \
        --configfile=config/custom_readfilt_workflow.json \
        read_filtering_pretrim_workflow

I get:

Building DAG of jobs...
Pulling singularity image docker://quay.io/biocontainers/fastqc:0.11.7--pl5.22.0_2.
WorkflowError:
Failed to pull singularity image from docker://quay.io/biocontainers/fastqc:0.11.7--pl5.22.0_2:
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
ERROR: You must install squashfs-tools to build images
ABORT: Aborting with RETVAL=255
ERROR: pulling container failed!

$ singularity --version
2.4.2-dist

Please provide further installation/setup instructions.

brooksph commented 6 years ago

Thanks @stephenturner. This seems to be related to https://github.com/singularityware/singularity/issues/1078. Does installing squashfs-tools resolve the issue?

charlesreid1 commented 6 years ago

Just skimmed through that issue, it's not clear how to install squashfs-tools...

@stephenturner is this on Ravenna/HPC or your local machine?

brooksph commented 6 years ago

Gotcha. On Ubuntu: sudo apt-get install squashfs-tools. Not sure if it's possible via conda.
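
For reference, the two install commands that come up in this thread (apt-get on Ubuntu, yum on CentOS) can be selected from the fields in /etc/os-release. This is only a minimal sketch: the function name squashfs_install_cmd is hypothetical, and it assumes the distro sets ID or ID_LIKE the way Debian- and Red Hat-family systems usually do.

```shell
# Print the squashfs-tools install command for the distro described by an
# os-release style file (defaults to /etc/os-release).
# NOTE: squashfs_install_cmd is a hypothetical helper, not part of dahak.
squashfs_install_cmd() {
    local os_release="${1:-/etc/os-release}"
    local ids

    if [ ! -r "$os_release" ]; then
        echo "cannot read $os_release" >&2
        return 1
    fi

    # Source in a subshell so ID / ID_LIKE do not leak into the caller.
    ids="$( . "$os_release"; echo "${ID:-} ${ID_LIKE:-}" )"

    case " $ids " in
        *" debian "*|*" ubuntu "*)
            echo "sudo apt-get install squashfs-tools" ;;
        *" rhel "*|*" centos "*|*" fedora "*)
            echo "sudo yum install squashfs-tools" ;;
        *)
            echo "unknown distro: install squashfs-tools with your package manager" >&2
            return 1 ;;
    esac
}
```

Calling squashfs_install_cmd with no argument prints the command for the current machine; it only prints the command and does not run it.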

stephenturner commented 6 years ago

Local. CentOS 7.

stephenturner commented 6 years ago

Installed with sudo yum install squashfs-tools, but after that:

Job 1: --- Pre-trim quality check of trimmed data with fastqc.

fastqc -t 1 //data/SRR606249_subset10_1_reads.fq.gz /data/SRR606249_subset10_2_reads.fq.gz -o /data
Activating singularity image /nv/vol184/uvabx/projects/dahak/dahak/workflows/.snakemake/singularity/f1d03c0a142609dc68fd5a6943abcaad.simg
ERROR  : Failed invoking the NEWUSER namespace runtime: Invalid argument
ABORT  : Retval = 255
    Error in rule pre_trimming_quality_assessment:
        jobid: 1
        output: data/SRR606249_subset10_1_reads_fastqc.zip, data/SRR606249_subset10_2_reads_fastqc.zip

RuleException:
CalledProcessError in line 162 of /nv/vol184/uvabx/projects/dahak/dahak/workflows/read_filtering/Snakefile:
Command 'singularity exec --home /nv/vol184/uvabx/projects/dahak/dahak/workflows /nv/vol184/uvabx/projects/dahak/dahak/workflows/.snakemake/singularity/f1d03c0a142609dc68fd5a6943abcaad.simg bash -c " set -euo pipefail;  fastqc -t 1 //data/SRR606249_subset10_1_reads.fq.gz /data/SRR606249_subset10_2_reads.fq.gz -o /data "' returned non-zero exit status 255.
  File "/nv/vol184/uvabx/projects/dahak/dahak/workflows/read_filtering/Snakefile", line 162, in __rule_pre_trimming_quality_assessment
  File "/home/sdt5z/miniconda3/envs/dahak/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /nv/vol184/uvabx/projects/dahak/dahak/workflows/.snakemake/log/2018-07-11T161047.251072.snakemake.log

charlesreid1 commented 6 years ago

This issue was also raised in the singularity repo, where the problem was with Red Hat. It sounds like a problem with Linux and Singularity not playing nicely together.

I am currently working on reproducing this error on an AWS node running CentOS....

kternus commented 6 years ago

We've previously had trouble getting that FastQC docker to run on Red Hat, and it may be a Fedora compatibility issue. Is it possible for @stephenturner to skip that step and test the next step in the workflow to narrow down whether it's an issue with the FastQC container or something else?

stephenturner commented 6 years ago

RHEL/Fedora/CentOS should all be the same, essentially. The assembly quickstart (https://dahak-metagenomics.github.io/dahak/quickstart/#assembly) has the same issue as noted above. After creating the JSON file and running:

snakemake --use-singularity -p --configfile=config/custom_assembly_workflow.json assembly_workflow_all

I get the same NEWUSER/namespace problem:

Building DAG of jobs...
Pulling singularity image docker://quay.io/biocontainers/spades:3.11.1--py27_zlib1.2.8_0.
Pulling singularity image docker://quay.io/biocontainers/megahit:1.1.2--py35_0.
Pulling singularity image docker://quay.io/biocontainers/trimmomatic:0.36--5.
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        4       assembly_megahit
        4       assembly_metaspades
        1       assembly_workflow_all
        1       download_read_adapters
        1       download_reads
        4       quality_trimming
        15

Job 15: --- Downloading adapter file.

wget -O data/TruSeq2-PE.fa http://dib-training.ucdavis.edu.s3.amazonaws.com/mRNAseq-semi-2015-03-04/TruSeq2-PE.fa
--2018-07-16 10:59:22--  http://dib-training.ucdavis.edu.s3.amazonaws.com/mRNAseq-semi-2015-03-04/TruSeq2-PE.fa
Resolving dib-training.ucdavis.edu.s3.amazonaws.com (dib-training.ucdavis.edu.s3.amazonaws.com)... 54.231.236.43
Connecting to dib-training.ucdavis.edu.s3.amazonaws.com (dib-training.ucdavis.edu.s3.amazonaws.com)|54.231.236.43|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 539 [binary/octet-stream]
Saving to: ‘data/TruSeq2-PE.fa’

100%[====================================================================>] 539         --.-K/s   in 0s

2018-07-16 10:59:22 (63.5 MB/s) - ‘data/TruSeq2-PE.fa’ saved [539/539]

Finished job 15.
1 of 15 steps (7%) done

Job 10: --- Quality trimming read data.

trimmomatic PE /data/SRR606249_subset10_1_reads.fq.gz /data/SRR606249_subset10_2_reads.fq.gz /data/SRR606249_subset10_1.trim30.fq.gz /data/SRR606249_subset10_1.trim30_se /data/SRR606249_subset10_2.trim30.fq.gz /data/SRR606249_subset10_2.trim30_se ILLUMINACLIP:/data/TruSeq2-PE.fa:2:40:15 LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 MINLEN:25
Activating singularity image /path/dahak/workflows/.snakemake/singularity/7a3439791f532ae9738cd41931466141.simg
ERROR  : Failed invoking the NEWUSER namespace runtime: Invalid argument
ABORT  : Retval = 255
    Error in rule quality_trimming:
        jobid: 10
        output: data/SRR606249_subset10_1.trim30.fq.gz, data/SRR606249_subset10_1.trim30_se, data/SRR606249_subset10_2.trim30.fq.gz, data/SRR606249_subset10_2.trim30_se
        log: data/trimmomatic_pe_SRR606249_subset10_trim30.log

RuleException:
CalledProcessError in line 405 of /path/dahak/workflows/read_filtering/Snakefile:
Command 'singularity exec --home /path/dahak/workflows  /path/dahak/workflows/.snakemake/singularity/7a3439791f532ae9738cd41931466141.simg bash -c " set -euo pipefail;  trimmomatic PE /data/SRR606249_subset10_1_reads.fq.gz /data/SRR606249_subset10_2_reads.fq.gz /data/SRR606249_subset10_1.trim30.fq.gz /data/SRR606249_subset10_1.trim30_se /data/SRR606249_subset10_2.trim30.fq.gz /data/SRR606249_subset10_2.trim30_se ILLUMINACLIP:/data/TruSeq2-PE.fa:2:40:15 LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 MINLEN:25 "' returned non-zero exit status 255.
  File "/path/dahak/workflows/read_filtering/Snakefile", line 405, in __rule_quality_trimming
  File "/path/miniconda3/envs/dahak/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /path/dahak/workflows/.snakemake/log/2018-07-16T105901.667076.snakemake.log

kternus commented 6 years ago

Thanks for sharing that! I was going to ask @nalbright to try different workflow steps on Red Hat, but it looks like that wouldn't be helpful based on your experience here. We can set up an Ubuntu OS for testing, and I'll have her focus on that for now.

ctb commented 6 years ago

Hi @kternus, yes, this is a generic problem with RedHat and Singularity.

cgrahlm commented 6 years ago

@stephenturner I was able to get this to run in a CentOS 7 VM. Are you able to run singularity on a test image like so?

singularity --debug run shub://GodloveD/lolcow

I googled around some more, and there could be a few problems regarding the setuid bit and installing on clusters with RHEL.

This has some resources we may need to reference: https://groups.google.com/a/lbl.gov/forum/#!topic/singularity/W1lFqknaBLg
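
One possible reading of the NEWUSER error is that Singularity is running without its setuid helper and falling back to user namespaces, which the kernel then refuses. The sketch below checks both conditions under stated assumptions: the function names are hypothetical, the helper path is installation-specific and must be supplied by the caller, and /proc/sys/user/max_user_namespaces is assumed to exist on the kernel in question.

```shell
# Return 0 if the given file has the setuid bit set.
# NOTE: has_setuid_bit is a hypothetical helper; pass it the path of the
# Singularity setuid helper from your own installation.
has_setuid_bit() {
    [ -u "$1" ]
}

# Return 0 if the user-namespace limit in the given file is greater than 0.
# Defaults to /proc/sys/user/max_user_namespaces (assumed to be present;
# older kernels may not expose it).
userns_enabled() {
    local limit_file="${1:-/proc/sys/user/max_user_namespaces}"
    [ -r "$limit_file" ] && [ "$(cat "$limit_file")" -gt 0 ]
}
```

If userns_enabled fails and has_setuid_bit also fails on the installed helper, that would be consistent with the error above, though it does not prove it is the cause.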

stephenturner commented 6 years ago

I wasn't able to get singularity --debug run shub://GodloveD/lolcow to run. Same issue. I'm going to compile from source and install as root, see if that helps.

stephenturner commented 6 years ago

So, I did that, and I'm getting past the NEWUSER error but hitting a wall with

DEBUG   [U=543223,P=21023] singularity_priv_escalate()               Temporarily escalating privileges (U=543223)
DEBUG   [U=0,P=21023]      singularity_priv_escalate()               Clearing supplementary GIDs.
DEBUG   [U=0,P=21023]      singularity_image_bind()                  Setting loop device flags
DEBUG   [U=0,P=21023]      singularity_priv_drop()                   Dropping privileges to UID=543223, GID=100 (5 supplementary GIDs)
DEBUG   [U=0,P=21023]      singularity_priv_drop()                   Restoring supplementary groups
DEBUG   [U=543223,P=21023] singularity_priv_drop()                   Confirming we have correct UID/GID
VERBOSE [U=543223,P=21023] singularity_image_bind()                  Using loop device: /dev/loop0
VERBOSE [U=543223,P=21023] singularity_image_squashfs_mount()        Mounting squashfs image: /dev/loop0 -> /usr/local/var/singularity/mnt/container
ERROR   [U=543223,P=21023] singularity_image_squashfs_mount()        Failed to mount squashfs image in (read only): No such device
ABORT   [U=543223,P=21023] singularity_image_squashfs_mount()        Retval = 255

I googled around a bit and landed on a few issues on the singularity repo, but nothing I could grok.

I do have squashfs-tools installed.
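
A note on the "No such device" message: mount reports that error when the kernel has no support for the requested filesystem type, which is separate from having the squashfs-tools userspace package installed. A minimal sketch for checking this follows; the function name kernel_supports_fs is hypothetical, and it only inspects /proc/filesystems, which lists built-in filesystems and currently loaded modules.

```shell
# Return 0 if filesystem type $1 is listed in a /proc/filesystems style file
# (second argument, defaults to /proc/filesystems).
# NOTE: kernel_supports_fs is a hypothetical helper for illustration.
kernel_supports_fs() {
    local fstype="$1"
    local table="${2:-/proc/filesystems}"
    # Each line is "[nodev]<TAB>name"; the filesystem name is the last field.
    awk -v fs="$fstype" '$NF == fs { found = 1 } END { exit !found }' "$table"
}
```

If kernel_supports_fs squashfs fails, running sudo modprobe squashfs and checking again may help, assuming the module is available for the running kernel.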

stephenturner commented 6 years ago

Should also add

$ singularity --version
2.5.2-dist

cgrahlm commented 6 years ago

When I was getting the squashfs errors on CentOS, I had to rerun the configure command after I installed the package. But it sounds like you installed the package and then ran the configure command.

stephenturner commented 6 years ago

Precisely