marbl / verkko


Issue Encountered at sqStoreCreate Step #223

Closed XingzhengLee closed 5 months ago

XingzhengLee commented 6 months ago

Hello,

I'm currently experiencing an issue during the initial phase of the workflow (sqStoreCreate) while running a local test.

Script:

verkko -d asm --hifi ./hifi.fastq.gz --nano ./ont.fastq.gz

snakemake.log

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 112
Rules claiming more threads will be scaled down.
Provided resources: mem_gb=64
Job stats:
job                            count
---------------------------  -------
buildGraph                         1
buildPackages                      1
buildStore                         1
combineConsensus                   1
combineONT                         1
combineOverlaps                    1
combineOverlapsConfigure           1
combineTipsONT                     1
configureFindErrors                1
configureOverlaps                  1
correctHiFi                        1
emptyfile                          1
extractONT                         1
fixErrors                          1
generateLayoutContigsInputs        1
getCoverage                        3
layoutContigs                      1
prepCoverage                       1
processGraph                       1
processONT                         1
splitONT                           1
splitTipsONT                       1
untip                              1
verkko                             1
total                             26

Select jobs to execute...

[Sat Jan  6 21:21:07 2024]
localrule emptyfile:
    output: emptyfile
    jobid: 16
    reason: Missing output files: emptyfile
    resources: tmpdir=/tmp

[Sat Jan  6 21:21:07 2024]
rule buildStore:
    input: /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz
    output: 0-correction/hifi.seqStore
    log: 0-correction/buildStore.err
    jobid: 7
    reason: Missing output files: 0-correction/hifi.seqStore
    resources: tmpdir=/tmp, job_id=1, n_cpus=1, mem_gb=4, time_h=4

[Sat Jan  6 21:21:07 2024]
checkpoint splitONT:
    input: /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./ont.fastq.gz
    output: 3-align/splitONT.finished
    log: 3-align/splitONT.err
    jobid: 13
    reason: Missing output files: 3-align/splitONT.finished
    resources: tmpdir=/tmp, job_id=1, n_cpus=1, mem_gb=8, time_h=96
DAG of jobs will be updated after completion.

[Sat Jan  6 21:21:07 2024]
Finished job 16.
1 of 26 steps (4%) done
[Sat Jan  6 21:21:07 2024]
Error in rule buildStore:
    jobid: 7
    input: /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz
    output: 0-correction/hifi.seqStore
    log: 0-correction/buildStore.err (check log file(s) for error details)
    shell:

cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Sat Jan  6 21:21:14 2024]
Finished job 13.
2 of 26 steps (8%) done
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-01-06T212106.712499.snakemake.log

The problem is that, although buildStore.sh runs without errors when I execute it manually, within the workflow the file buildStore.sh was not generated and the associated buildStore.err file remained empty. I'm relatively new to Snakemake, so any guidance or suggestions on how to resolve this would be greatly appreciated.

PS:

Here is a snippet of the log information I received when executing the snakemake.sh script with Snakemake's debugging flags (--verbose).

Launching bioconda verkko bioconda 1.4.1
Using snakemake 7.32.4.
Building DAG of jobs...
Updating job combineONT.
Replace combineONT with dynamic branch combineONT
updating depending job splitTipsONT
updating depending job processONT
Updating job splitTipsONT.
Replace splitTipsONT with dynamic branch splitTipsONT
updating depending job combineTipsONT
Updating job extractONT.
Replace extractONT with dynamic branch extractONT
updating depending job buildPackages
Using shell: /usr/bin/bash
Provided cores: 112
Rules claiming more threads will be scaled down.
Provided resources: mem_gb=64
Job stats:
job                            count
---------------------------  -------
alignONT                           1
buildGraph                         1
buildPackages                      1
buildStore                         1
combineConsensus                   1
combineONT                         1
combineOverlaps                    1
combineOverlapsConfigure           1
combineTipsONT                     1
configureFindErrors                1
configureOverlaps                  1
correctHiFi                        1
extractONT                         1
fixErrors                          1
generateLayoutContigsInputs        1
getCoverage                        3
indexGraph                         1
layoutContigs                      1
prepCoverage                       1
processGraph                       1
processONT                         1
splitTipsONT                       1
untip                              1
verkko                             1
total                             26

Resources before job selection: {'mem_gb': 64, '_cores': 112, '_nodes': 9223372036854775807}
Ready jobs (1)
Select jobs to execute...
Using greedy selector because only single job has to be scheduled.
Selected jobs (1)
Resources after job selection: {'mem_gb': 60, '_cores': 111, '_nodes': 9223372036854775806}

[Sun Jan  7 09:19:10 2024]
rule buildStore:
    input: /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz
    output: 0-correction/hifi.seqStore
    log: 0-correction/buildStore.err
    jobid: 7
    reason: Missing output files: 0-correction/hifi.seqStore
    resources: tmpdir=/tmp, job_id=1, n_cpus=1, mem_gb=4, time_h=4

Full Traceback (most recent call last):
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2656, in run_wrapper
    run(
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm", line 93, in __rule_buildStore
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/shell.py", line 294, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  
cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1' died with <Signals.SIGSEGV: 11>.

[Sun Jan  7 09:19:10 2024]
Error in rule buildStore:
    jobid: 7
    input: /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz
    output: 0-correction/hifi.seqStore
    log: 0-correction/buildStore.err (check log file(s) for error details)
    shell:

cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Full Traceback (most recent call last):
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2656, in run_wrapper
    run(
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm", line 93, in __rule_buildStore
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/shell.py", line 294, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  
cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1' died with <Signals.SIGSEGV: 11>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 715, in _callback
    raise ex
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 699, in cached_or_run
    run_func(*args)
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2692, in run_wrapper
    raise RuleException(
snakemake.exceptions.RuleException: CalledProcessError in file /home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm, line 37:
Command 'set -euo pipefail;  
cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1' died with <Signals.SIGSEGV: 11>.
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm", line 37, in __rule_buildStore

RuleException:
CalledProcessError in file /home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm, line 37:
Command 'set -euo pipefail;  
cd 0-correction

cat > ./buildStore.sh <<EOF
#!/bin/sh
set -e

#  Construct a Canu seqStore for the HiFi reads.
#
echo "Building seqStore."
/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreCreate \\
  -o ../0-correction/hifi.seqStore \\
  -minlength 4000 \\
  -homopolycompress \\
  -pacbio-hifi hifi /home/lixingzheng/project/trio_assembly/04.primary_assembly/test/./hifi.fastq.gz

/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/bin/sqStoreDumpMetaData -stats -S ../0-correction/hifi.seqStore

# check that the store has no duplicate read names, correction only keeps the read name up to the first space
count=\`wc -l ../0-correction/hifi.seqStore/readNames.txt |awk '{print \$1}'\`
count_unique=\`cat ../0-correction/hifi.seqStore/readNames.txt|awk '{print \$2}' |sort |uniq |wc -l\`

if [ \$count -ne \$count_unique ]; then
   echo "Error, the input has duplicate IDs:"
   cat ../0-correction/hifi.seqStore/readNames.txt| awk '{print \$2}' | sort |uniq -c |awk '{if (\$1 > 1) print "   "\$0}'
   exit -1
fi

EOF

chmod +x ./buildStore.sh

./buildStore.sh > ../0-correction/buildStore.err 2>&1' died with <Signals.SIGSEGV: 11>.
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/verkko/Snakefiles/c1-buildStore.sm", line 37, in __rule_buildStore
  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/concurrent/futures/thread.py", line 58, in run
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-01-07T091910.393877.snakemake.log
unlocking
removing lock
removing lock
removed all locks

Thank you for your assistance.
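The verbose traceback above ends with "died with <Signals.SIGSEGV: 11>", which means the failing process was killed by a signal and did not simply return an ordinary error code. When the same commands are re-run by hand, a shell reports such a death as exit status 128 plus the signal number. The helper below is only a sketch for decoding that status; the function name describe_status is hypothetical and is not part of verkko or snakemake.

```shell
#!/usr/bin/env bash
# Decode a shell exit status: values above 128 conventionally mean
# "terminated by signal (status - 128)".
describe_status() {
    local status="$1"
    if [ "$status" -gt 128 ]; then
        # kill -l N prints the signal name without the SIG prefix.
        echo "killed by SIG$(kill -l "$((status - 128))")"
    else
        echo "exited with code $status"
    fi
}

# Example: 139 = 128 + 11, the status a shell reports after a SIGSEGV.
describe_status 139

# To apply it to the step from the log, one could run (inside 0-correction):
#   ./buildStore.sh > buildStore.err 2>&1; describe_status $?
```

If the manual run exits with code 0 while the run under snakemake is killed by a signal, that points at the environment launching the command and not at the script contents.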

skoren commented 6 months ago

This looks like an error in snakemake itself; the command that's failing isn't actually in verkko. The traceback says:

  File "/home/lixingzheng/miniconda3/envs/verkko/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2692, in run_wrapper
    raise RuleException(
Command 'set -euo pipefail;  

I expect any snakemake shell command would have failed; this just happened to be the first to run. When snakemake itself fails, it doesn't generate the corresponding shell script (or removes it upon failure), which is why you don't have buildStore.sh in your folder. The set -euo pipefail prefix is a bashism, and I suspect your default shell is dash or something else (if you're on Ubuntu). If you make bash your default shell before running verkko, it should work. You might also be able to change the verkko script from

echo  > ${outd}/snakemake.sh "#!/bin/sh"

to

echo  > ${outd}/snakemake.sh "#!/usr/bin/env bash"

which will always force snakemake to run in bash.
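Before editing the verkko script, it may help to confirm which shell /bin/sh really is and whether it accepts the strict-mode prefix seen in the traceback. The following is a minimal sketch, not something verkko ships; the function name check_strict_mode is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Report whether a given shell accepts the strict-mode prefix
# that snakemake prepends to shell commands ('set -euo pipefail').
check_strict_mode() {
    local shell_path="$1"
    if "$shell_path" -c 'set -euo pipefail; true' >/dev/null 2>&1; then
        echo "ok"
    else
        echo "unsupported"
    fi
}

# Show what /bin/sh resolves to (for example bash or dash).
echo "/bin/sh resolves to: $(readlink -f /bin/sh)"

# Compare bash with whatever /bin/sh is on this machine.
echo "bash:    $(check_strict_mode bash)"
echo "/bin/sh: $(check_strict_mode /bin/sh)"
```

If both lines report "ok", the shell is probably not what rejects the option, and the environment that snakemake runs in is the next thing to inspect.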

XingzhengLee commented 5 months ago

Thank you for your detailed guidance! However, it seems that the proposed solution hasn't resolved the issue.

Regarding the system information:

Kernel version:

Linux version 3.10.0-1160.105.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Thu Dec 7 15:39:45 UTC 2023

Default shell:

lrwxrwxrwx. 1 root root 4 Jul 13 18:05 /bin/sh -> bash

If resolving this particular Snakemake issue turns out to be intricate or challenging, I'm inclined to explore an alternative approach. Is it possible to circumvent the workflow and manually execute the verkko steps one by one? If this is feasible, I'd greatly appreciate any guidance on how to proceed with the manual execution of verkko's steps.

While I understand that solving this directly through Snakemake would save considerable time, having a manual step-by-step process as a backup could provide valuable insights and potentially expedite the resolution. Your assistance in this matter is truly appreciated!

skoren commented 5 months ago

It's not feasible to run the pipeline w/o snakemake. Something in your snakemake/conda environment is broken, since pipefail should be a supported option yet it is failing. I would suggest trying a clean conda environment, or installing snakemake/verkko and its dependencies locally w/o conda.

XingzhengLee commented 5 months ago

It appears that the issue might stem from a compatibility mismatch between the server I'm currently using and Snakemake. I attempted to build verkko from the source, yet encountered the same error. Intriguingly, when I ran the pipeline using my personal laptop with the sample data, it progressed without any errors. However, given the limitations of my laptop's resources, it might not be sufficient for handling real data. As a result, I'm actively seeking an alternative server that might better accommodate the computational requirements.

Thank you sincerely for your patience and invaluable assistance throughout this process!