biocorecrg / MOP2

Master of Pores 2
https://biocorecrg.github.io/MOP2/docs/
MIT License

Error with Guppy Basecall: cat: 'fast5_fail---7_out/*.fastq': No such file or directory #5

Closed Whatsacb closed 1 year ago

Whatsacb commented 2 years ago

Hello,

I'm running MOP2 on a Macintosh computer with Nextflow 21.10.6 build 5660 and Guppy 6.0.1. MOP2 will run if I start it with the fastq files, but if I try to run basecalling with fast5 files it fails. I get the following error:

executor >  local (3)
[e4/54a1e0] process > flow1:GUPPY_BASECALL:baseCall (fast5_fail---2)            [  0%] 1 of 329, failed: 1
[-        ] process > flow1:NANOQ_FILTER:filter                                 -
[-        ] process > preprocess_flow:MinIONQC                                  -
[-        ] process > preprocess_flow:GRAPHMAP2:map                             -
[-        ] process > preprocess_flow:SAMTOOLS_CAT:catAln                       -
[-        ] process > preprocess_flow:SAMTOOLS_SORT:sortAln                     -
[-        ] process > preprocess_flow:SAMTOOLS_INDEX:indexBam                   -
[c8/cc31ee] process > preprocess_flow:checkRef (Checking GRCh38.primary_asse... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram                                  -
[-        ] process > preprocess_flow:bam2stats                                 -
[-        ] process > preprocess_flow:joinAlnStats                              -
[-        ] process > preprocess_flow:NANOPLOT_QC:MOP_nanoPlot                  -
[-        ] process > preprocess_flow:concatenateFastQFiles                     -
[-        ] process > preprocess_flow:FASTQC:fastQC                             -
[-        ] process > preprocess_flow:MULTIQC:makeReport                        -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5_fail---7)'

Caused by:
  Process 'flow1:GUPPY_BASECALL:baseCall (fast5_fail---7)' terminated with an error exit status (1)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./fast5_fail---7_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1        --num_callers  8
  cat fast5_fail---7_out/*.fastq >> fast5_fail---7.fastq
  rm fast5_fail---7_out/*.fastq
  gzip fast5_fail---7.fastq

Command exit status:
  1

Command output:
  ONT Guppy basecalling software version 6.0.1+652ffd179
  config file:        /Users/tj/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /Users/tj/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_fail---7_out
  chunk size:         2000
  chunks per runner:  512
  minimum qscore:     7
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 202 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|
  ***************************************************
  Caller time: 231 ms, Samples called: 0, samples/s: 0
  Finishing up any open output files.
  Basecalling completed successfully.

Command error:
  cat: 'fast5_fail---7_out/*.fastq': No such file or directory

Work dir:
  /Users/tj/MOP2/mop_preprocess/work/be/896b2b971a64d7483d9121934582ac

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named '.command.sh'

My params.config file is as follows:

params {
    conffile            = "/Users/tj/MOP2/data/U87_WT_122021/final_summary_FAR91556_a92bf75a.txt"
    fast5               = "/Users/tj/MOP2/data/U87_WT_122021/**/*.fast5"
    fastq               = ""

    reference           = "/Users/tj/MOP2/references/GRCh38.primary_assembly.genome.fa"
    annotation          = "/Users/tj/MOP2/anno/gencode.v35.annotation.gtf"
    ref_type            = "genome"

    pars_tools          = "drna_tool_splice_opt.tsv" 
    output              = "/Users/tj/MOP2/output/preprocess/U87_WT_122021"
    qualityqc           = 1
    granularity         = 1

    basecalling         = "guppy"
    GPU                 = "OFF"
    demultiplexing      = "NO"
    demulti_fast5       = "NO" 

    filtering           = "nanoq"

    mapping             = "minimap2"
    counting            = "NO"
    discovery           = "NO"

    cram_conv           = "YES"
    subsampling_cram    = 50

    saveSpace           = "NO"

    email               = ""
}

I noticed that my error is similar to the one reported here, so I suspect Guppy is looking for files in the wrong location. I tried troubleshooting on my own and searched Google extensively, but I am stuck.

Thank you so much!

lucacozzuto commented 2 years ago

Hi, Guppy 6 by default separates good and bad reads. This has to be disabled when doing dRNAseq analyses. I added a new opt tsv file named drna_tool_unsplice_guppy6_opt.txt. You can pass it via pars_tools = "drna_tool_unsplice_guppy6_opt.txt" in params.config.

Best,
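For context, the step that fails in the log above is the `cat <sample>_out/*.fastq` line. When Guppy sorts reads by quality it is expected to write FASTQ files into `pass/` and `fail/` subfolders rather than directly into the save path, so a top-level glob matches nothing. A minimal sketch of a merge that does not depend on the layout, assuming that folder structure; `collect_fastq` is a hypothetical helper and not part of MOP2:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MOP2): merge every FASTQ found anywhere
# under a Guppy save path, whether the files sit at the top level or in
# pass/ and fail/ subfolders. The merged file must live outside the save path.
collect_fastq() {
    local out_dir="$1" merged="$2"
    : > "$merged"    # start from an empty output file
    # "-exec cat {} +" only runs cat when at least one file matches
    find "$out_dir" -type f -name '*.fastq' -exec cat {} + >> "$merged"
    if [ ! -s "$merged" ]; then
        echo "no reads found under $out_dir" >&2
        return 1
    fi
}
```

It could be called as `collect_fastq ./wt---1_out wt---1.fastq`. It returns non-zero when no reads were written at all, which separates "wrong folder" from "nothing basecalled".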

Whatsacb commented 2 years ago

Excellent, thank you for that, and thanks for such a quick reply!

It seems to be running now with no issues, although you may want to change that text file to a .tsv file if anyone in the future needs it. If I run into any further snags I will let you know. Thank you so much!

lucacozzuto commented 2 years ago

Yes, you are right!

Whatsacb commented 2 years ago

Okay, now I got the following error:

Error executing process > 'flow1:NANOQ_FILTER:filter (fast5_pass---3)'

Caused by:
  Process `flow1:NANOQ_FILTER:filter (fast5_pass---3)` terminated with an error exit status (127)

Command executed:

  nanoq -i fast5_pass---3.fastq.gz  -O g -o fast5_pass---3-filt.fastq.gz

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 2: nanoq: command not found

Work dir:
  /home/tj/MOP2/mop_preprocess/work/80/1094eff2533e6b1db38a3de035e14f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

Is this related or something completely different? Would it be better to try using an older version of Guppy?
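Exit status 127 is how the shell reports that a command was not found on `PATH` in the environment where the task ran. A small preflight sketch, assuming it is run inside that same environment (for example inside the container); `missing_tools` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: print each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For the run above, `missing_tools nanoq guppy_basecaller` would list whichever of the two the task cannot see.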

lucacozzuto commented 2 years ago

Which container are you using?

Whatsacb commented 2 years ago

I am using docker, although now it seems I'm getting the same error as before:

executor >  local (5)
[95/4bf590] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 1 of 327, failed: 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef (C... -
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5_pass---18)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (fast5_pass---18)` terminated with an error exit status (1)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 --disable_qscore_filtering -i ./         --save_path ./fast5_pass---18_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1    --num_callers  8
  cat fast5_pass---18_out/*.fastq >> fast5_pass---18.fastq
  rm fast5_pass---18_out/*.fastq
  gzip fast5_pass---18.fastq

Command exit status:
  1

Command output:
  ONT Guppy basecalling software version 6.0.1+652ffd1
  config file:        /home/tj/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /home/tj/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_pass---18_out
  chunk size:         2000
  chunks per runner:  512
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1
  Found 1 fast5 files to process.
  Init time: 281 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|
  ***************************************************
  Caller time: 396 ms, Samples called: 0, samples/s: 0
  Finishing up any open output files.
  Basecalling completed successfully.

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  cat: 'fast5_pass---18_out/*.fastq': No such file or directory

Work dir:
  /home/tj/MOP2/mop_preprocess/work/f1/93cc270690d5996e1a944719e2dad3

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

The command I tried to run was:

sudo nextflow run mop_preprocess.nf -with-docker > log.txt

lucacozzuto commented 2 years ago

Check what is inside /home/tj/MOP2/mop_preprocess/work/f1/93cc270690d5996e1a944719e2dad3. You should have fastq files somewhere. It is quite strange.
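A rough sketch of that inspection, assuming a standard Nextflow task work directory (Nextflow writes `.exitcode`, `.command.sh` and `.command.err` there); `inspect_workdir` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise a Nextflow task work directory by showing
# the recorded exit code, the end of stderr, and any FASTQ files present.
inspect_workdir() {
    local dir="$1"
    if [ -f "$dir/.exitcode" ]; then
        echo "exit code: $(cat "$dir/.exitcode")"
    fi
    if [ -f "$dir/.command.err" ]; then
        echo "--- last lines of .command.err ---"
        tail -n 5 "$dir/.command.err"
    fi
    echo "--- fastq files found ---"
    find "$dir" -name '*.fastq*' -print
}
```

Calling it with the work dir path printed in the error message shows at a glance whether any FASTQ was written and where.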

Whatsacb commented 2 years ago

This is what I see in that folder:

(folder) fast5_pass--18_out FAR91556_pass_a92bbf75a_269.fast5 fast5_pass--18.fastq

Inside the fast5_pass--18_out folder I see:

(folder) workspace guppy_basecaller_log-2022-02-24_22-16-13.log sequencing_summary.txt sequencing_telemetry.js

and one fast5 file in the workspace folder.

lucacozzuto commented 2 years ago

so the fast5_pass--18.fastq is there... I don't get what is happening... Can you try again? If it does not work I'll try tomorrow to test more on Guppy6, sorry

Whatsacb commented 2 years ago

Sure thing. I think it is looking for the fastq file in the fast5_pass--18_out folder, but it is not in that folder (it is in the parent folder)?

Again, thank you for your help. No sweat if you can't get to it right now, you already looked at this way faster than I expected. Whenever your schedule permits.

lucacozzuto commented 2 years ago

Hi, I just tested the test dataset on my mac with Docker. Everything worked. So, I'm wondering if something else is happening..

Whatsacb commented 2 years ago

What version of guppy are you using? I reset the params.config file to the default settings and tried running it on the test dataset, rather than my dataset from the Nanopore. This is the result:

Params.config:

params {
    conffile            = "/Users/tj/MOP2/mop_preprocess/final_summary_01.txt"
    fast5               = "/Users/tj/MOP2/data/wt/*.fast5"
    fastq               = ""

    reference           = "/Users/tj/MOP2/references/yeast_rRNA_ref.fa.gz"
    annotation          = ""
    ref_type            = "transcriptome"

    pars_tools          = "drna_tool_splice_opt.tsv"
    output              = "/Users/tj/MOP2/output/preprocess/test"
    qualityqc           = 5
    granularity         = 1

    basecalling         = "guppy"
    GPU                 = "OFF"
    demultiplexing      = "NO"
    demulti_fast5       = "NO" 

    filtering           = "nanoq"

    mapping             = "graphmap"
    counting            = "nanocount"
    discovery           = "NO"

    cram_conv           = "YES"
    subsampling_cram    = 50

    saveSpace           = "NO"

    email               = ""
}

Result (from running sudo nextflow run mop_preprocess.nf -with-docker > log.txt):

Pipeline BIOCORE@CRG Master of Pore - preprocess completed!
Started at  2022-02-26T14:15:01.927-08:00
Finished at 2022-02-26T14:15:16.669-08:00
executor >  local (2)
[9c/5ec784] process > flow1:GUPPY_BASECALL:baseCall (wt---1)       [100%] 1 of 1, failed: 1 ✘
[-        ] process > flow1:NANOQ_FILTER:filter                    -
[-        ] process > preprocess_flow:MinIONQC                     -
[-        ] process > preprocess_flow:RNA2DNA                      -
[-        ] process > preprocess_flow:GRAPHMAP:map                 -
[-        ] process > preprocess_flow:SAMTOOLS_CAT:catAln          -
[-        ] process > preprocess_flow:SAMTOOLS_SORT:sortAln        -
[-        ] process > preprocess_flow:SAMTOOLS_INDEX:indexBam      -
[9a/bbaf81] process > preprocess_flow:checkRef (Checking yeast_... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram                     -
[-        ] process > preprocess_flow:bam2stats                    -
[-        ] process > preprocess_flow:joinAlnStats                 -
[-        ] process > preprocess_flow:NANOPLOT_QC:MOP_nanoPlot     -
[-        ] process > preprocess_flow:concatenateFastQFiles        -
[-        ] process > preprocess_flow:FASTQC:fastQC                -
[-        ] process > preprocess_flow:NANOCOUNT:nanoCount          -
[-        ] process > preprocess_flow:AssignReads                  -
[-        ] process > preprocess_flow:countStats                   -
[-        ] process > preprocess_flow:joinCountStats               -
[-        ] process > preprocess_flow:MULTIQC:makeReport           -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (wt---1)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (wt---1)` terminated with an error exit status (125)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./wt---1_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1        --num_callers  8
  cat wt---1_out/*.fastq >> wt---1.fastq
  rm wt---1_out/*.fastq
  gzip wt---1.fastq

Command exit status:
  125

Command output:
  (empty)

Work dir:
  /Users/tj/MOP2/mop_preprocess/work/9c/5ec7843013aa19e7fd91a7b75427be

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Time elapsed: 14.7s
Execution status: failed
WARN: Killing pending tasks (1)

Note that I get this error on a fresh install of Nextflow, Guppy 6.0.1, Docker, and Master of Pores 2 on both my Mac and a Linux workstation we have in the lab. Are you using Guppy 6.0.1 or an earlier version? Perhaps I should try getting an older version of Guppy and running it again with the test dataset?

Edit: I got the same error using a clean download of MOP2 and Guppy 3.4.5, using the default dataset. Running Docker 4.5.0, Guppy 6.0.1 (also tried 3.4.5), Nextflow 21.10.6, latest Master of Pores 2, bash version 3.2.57(1)-release, Java 1.8.0_312 / Java Runtime Environment (Zulu 8.58.0.13-CA-macosx), and on a Mac OS X 12.2.1 machine with M1.

lucacozzuto commented 2 years ago

Mmm, try removing "sudo" and adding -profile local; the default config may be asking for too many resources.
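One way to sanity-check the resource request: the earlier command used `--num_callers 8` with `--cpu_threads_per_caller 1`. The sketch below compares such a request with the cores the current environment reports. It assumes the total thread count is callers times threads per caller, and both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helpers: compare a basecalling thread request with the
# number of online CPU cores reported by the current environment.
cpu_count() {
    getconf _NPROCESSORS_ONLN
}

# fits_cpus CALLERS THREADS_PER_CALLER
# Succeeds when CALLERS * THREADS_PER_CALLER does not exceed the core count.
fits_cpus() {
    local callers="$1" threads="$2" total
    total=$(( callers * threads ))
    [ "$total" -le "$(cpu_count)" ]
}
```

For example, `fits_cpus 8 1 || echo "request exceeds available cores"`. Inside Docker the reported count may be lower than on the host.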

Whatsacb commented 2 years ago

If I remove sudo and run nextflow run mop_preprocess.nf -profile local -with-docker > log.txt I get the following response:

.nextflow/history.lock (Permission denied)

If I run sudo nextflow run mop_preprocess.nf -profile local -with-docker > log.txt I get the following:

executor >  local (3)
[05/277d9a] process > flow1:GUPPY_BASECALL:baseCall (wt---2)       [100%] 2 of 2, failed: 2 ✘
[-        ] process > flow1:NANOQ_FILTER:filter                    -
[-        ] process > preprocess_flow:MinIONQC                     -
[-        ] process > preprocess_flow:RNA2DNA                      -
[-        ] process > preprocess_flow:GRAPHMAP:map                 -
[-        ] process > preprocess_flow:SAMTOOLS_CAT:catAln          -
[-        ] process > preprocess_flow:SAMTOOLS_SORT:sortAln        -
[-        ] process > preprocess_flow:SAMTOOLS_INDEX:indexBam      -
[e7/e3c2ac] process > preprocess_flow:checkRef (Checking yeast_... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram                     -
[-        ] process > preprocess_flow:bam2stats                    -
[-        ] process > preprocess_flow:joinAlnStats                 -
[-        ] process > preprocess_flow:NANOPLOT_QC:MOP_nanoPlot     -
[-        ] process > preprocess_flow:concatenateFastQFiles        -
[-        ] process > preprocess_flow:FASTQC:fastQC                -
[-        ] process > preprocess_flow:NANOCOUNT:nanoCount          -
[-        ] process > preprocess_flow:AssignReads                  -
[-        ] process > preprocess_flow:countStats                   -
[-        ] process > preprocess_flow:joinCountStats               -
[-        ] process > preprocess_flow:MULTIQC:makeReport           -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (mod---1)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (mod---1)` terminated with an error exit status (126)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./mod---1_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1       --num_callers  1
  cat mod---1_out/*.fastq >> mod---1.fastq
  rm mod---1_out/*.fastq
  gzip mod---1.fastq

Command exit status:
  126

Command output:
  (empty)

Command error:
  251f5509d51d: Download complete
  8e829fe70a46: Verifying Checksum
  8e829fe70a46: Download complete
  6001e1789921: Verifying Checksum
  6001e1789921: Download complete
  9f0a21d58e5d: Verifying Checksum
  9f0a21d58e5d: Download complete
  a0529eb74f28: Verifying Checksum
  a0529eb74f28: Download complete
  47b91ac70c27: Verifying Checksum
  47b91ac70c27: Download complete
  575767043cd0: Verifying Checksum
  575767043cd0: Download complete
  a7718005f676: Verifying Checksum
  a7718005f676: Download complete
  7f82c77da127: Verifying Checksum
  7f82c77da127: Download complete
  7875f439735c: Verifying Checksum
  7875f439735c: Download complete
  35c102085707: Verifying Checksum
  35c102085707: Download complete
  35c102085707: Pull complete
  251f5509d51d: Pull complete
  8e829fe70a46: Pull complete
  6001e1789921: Pull complete
  9f0a21d58e5d: Pull complete
  47b91ac70c27: Pull complete
  a0529eb74f28: Pull complete
  6c99e7cdc7cd: Verifying Checksum
  6c99e7cdc7cd: Download complete
  b7e5caf187fb: Verifying Checksum
  b7e5caf187fb: Download complete
  b7e5caf187fb: Pull complete
  575767043cd0: Pull complete
  a7718005f676: Pull complete
  7f82c77da127: Pull complete
  72b066e9aff3: Verifying Checksum
  72b066e9aff3: Download complete
  7875f439735c: Pull complete
  e0c99ace5c30: Verifying Checksum
  e0c99ace5c30: Download complete
  31c15b68a65f: Download complete
  31c15b68a65f: Pull complete
  6c99e7cdc7cd: Pull complete
  e0c99ace5c30: Pull complete
  72b066e9aff3: Pull complete
  Digest: sha256:dcc9a2a786bb52428a5de236cb7f4b910551fc418a9438762decdd83d545cc61
  Status: Downloaded newer image for biocorecrg/mopbasecall:0.2
  WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
  .command.sh: line 2: /Users/tj/MOP2/mop_preprocess/bin/guppy_basecaller: cannot execute binary file: Exec format error

Work dir:
  /Users/tj/MOP2/mop_preprocess/work/97/9150eb6de8ac1fb943dbba90ce2b11

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Failed to invoke `workflow.onComplete` event handler

 -- Check script 'mop_preprocess.nf' at line: 631 or see '.nextflow.log' file for more details
WARN: Killing pending tasks (1)

Perhaps I should try your suggestion again on Monday on the Linux machine and see if it works?

lucacozzuto commented 2 years ago

Well, now it is because without sudo you cannot overwrite that file. You need to remove .nextflow/history.lock and other files like that before running it again.
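A minimal sketch for finding such leftovers, assuming they are files created by earlier `sudo` runs and therefore owned by another user; `list_foreign_owned` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list everything under a directory that is NOT owned
# by the current user (for example files left behind by a sudo run).
list_foreign_owned() {
    local dir="${1:-.}"
    find "$dir" ! -user "$(id -u)" -print
}
```

Running `list_foreign_owned .` from the mop_preprocess folder should print nothing on a clean install. Anything it lists (such as .nextflow/history.lock or entries under work/) is a candidate for removal or a change of ownership.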

Whatsacb commented 2 years ago

Okay so a couple of things that I found. The first is that I was having loads of permissions issues with my installation, and I think it is because when I went to execute sh INSTALL.sh I did the commands manually, since I was on a Macintosh computer and that script is written for Linux. When I did this, I probably used a sudo command somewhere. I reinstalled everything, but instead I made a new INSTALL.sh script for Macintosh users:

#!/bin/bash
#
# This script will download and install guppy
# params:
# 1 guppy version

if [ x"$1" == x ]; then
        GUPPY_VER='3.4.5'
else
    GUPPY_VER=$1
fi

wget https://mirror.oxfordnanoportal.com/software/analysis/ont-guppy-cpu_${GUPPY_VER}_osx64.zip
if [ $? -eq 0 ]; then
    echo "INSTALLING GUPPY VERSION ${GUPPY_VER}"
else
    echo "GUPPY VERSION ${GUPPY_VER} is not found" 
    exit
fi

unzip ont-guppy-cpu_${GUPPY_VER}_osx64.zip
mv ont-guppy-cpu mop_preprocess/bin/
cd mop_preprocess/bin
ln -s ont-guppy-cpu/bin/guppy_* .
ln -s ont-guppy-cpu/lib/* .
cd ../../
if [ ! -e "mop_preprocess/bin/ont-guppy-cpu/lib/libz.so" ] ; then
        unlink mop_preprocess/bin/ont-guppy-cpu/lib/libz.so
        cd mop_preprocess/bin/ont-guppy-cpu/lib/
        ln -s libz.so.1 libz.so
        cd ../../../../
fi
rm ont-guppy-cpu_${GUPPY_VER}_osx64.zip

Note a couple of differences here from the default INSTALL.sh script:

Once I redownloaded the MOP2 repo and ran this script, all my permissions were fixed and I was able to run nextflow without the "sudo" command (i.e. without hitting the ".nextflow/history.lock" error).

Next, per your request, I tried running mop_preprocess.nf in docker on the default dataset via the command:

nextflow run mop_preprocess.nf -profile local -with-docker > log.txt

Which gave the following result in the logs:

executor >  local (3)
[0c/f516a2] process > flow1:GUPPY_BASECALL:baseCall (wt---2)                   [100%] 1 of 1, failed: 1
[-        ] process > flow1:NANOQ_FILTER:filter                                -
[-        ] process > preprocess_flow:MinIONQC                                 -
[-        ] process > preprocess_flow:RNA2DNA                                  -
[-        ] process > preprocess_flow:GRAPHMAP:map                             -
[-        ] process > preprocess_flow:SAMTOOLS_CAT:catAln                      -
[-        ] process > preprocess_flow:SAMTOOLS_SORT:sortAln                    -
[-        ] process > preprocess_flow:SAMTOOLS_INDEX:indexBam                  -
[-        ] process > preprocess_flow:checkRef (Checking yeast_rRNA_ref.fa.gz) -
[-        ] process > preprocess_flow:bam2Cram                                 -
[-        ] process > preprocess_flow:bam2stats                                -
[-        ] process > preprocess_flow:joinAlnStats                             -
[-        ] process > preprocess_flow:NANOPLOT_QC:MOP_nanoPlot                 -
[-        ] process > preprocess_flow:concatenateFastQFiles                    -
[-        ] process > preprocess_flow:FASTQC:fastQC                            -
[-        ] process > preprocess_flow:NANOCOUNT:nanoCount                      -
[-        ] process > preprocess_flow:AssignReads                              -
[-        ] process > preprocess_flow:countStats                               -
[-        ] process > preprocess_flow:joinCountStats                           -
[-        ] process > preprocess_flow:MULTIQC:makeReport                       -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (mod---1)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (mod---1)` terminated with an error exit status (126)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./mod---1_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1       --num_callers  1
  cat mod---1_out/*.fastq >> mod---1.fastq
  rm mod---1_out/*.fastq
  gzip mod---1.fastq

Command exit status:
  126

Command output:
  (empty)

Command error:
  WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
  .command.sh: line 2: /Users/tj/MOP2/mop_preprocess/bin/guppy_basecaller: cannot execute binary file: Exec format error

Work dir:
  /Users/tj/MOP2/mop_preprocess/work/b4/93225634afaefac821cfba507831de

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`



Which, if you look back at my previous post, is the same error. I had overlooked these two lines: "WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested" and ".command.sh: line 2: /Users/tj/MOP2/mop_preprocess/bin/guppy_basecaller: cannot execute binary file: Exec format error". Is the issue that I am trying to process this on a Mac with an ARM processor (M1), and ONT hasn't updated Guppy for it yet? I may try a clean install on a Linux machine and see if it works.

lucacozzuto commented 2 years ago

Hi... you should not use any executable designed for Mac. MoP2 uses containers that are based on Linux, so your Mac will run a number of Linux containers and will need Linux-compatible executables.
I tested MoP on my MacBook too, so I would not expect any problem. Please go back to the original script and use bash INSTALL.sh instead of sh INSTALL.sh, but without using sudo. Actually, you should never use sudo for running MOP2.
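The "Exec format error" with exit status 126 means the file was found but could not be executed in that environment. A quick sketch for checking which kind of binary was installed, assuming only that Linux executables are ELF files (they start with the bytes 7f 45 4c 46) while macOS binaries are not; `is_elf` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file starts with the ELF magic bytes
# (0x7f 'E' 'L' 'F'), the executable format used by Linux binaries.
# Note: this checks the file format only, not the CPU architecture.
is_elf() {
    local magic
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}
```

For example, `is_elf mop_preprocess/bin/guppy_basecaller || echo "not a Linux binary"` would flag a macOS build of Guppy placed in the bin folder.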

Whatsacb commented 2 years ago

Well now, I feel like an idiot! That would make sense if I was running it all through Linux containers wouldn't it? I don't know why I didn't think about that. I was so concerned with running it properly on Mac that I was focused on downloading all the Mac associated software instead of the Linux versions (like docker), and so I did that for Guppy as well... which doesn't make sense in this context.

I reinstalled MOP2 without sudo and using the linux versions of Guppy and it seems to be working fine now (at least on the data included in MOP2 by default). Thank you so much for all your help!!!

Whatsacb commented 1 year ago

Hi Luca,

It seemed like I was able to get by using the .fastq files, but now I need to process the .fast5 files and have run into the same issue. Interestingly, I was able to run MOP2 fine on some old data from a previous run, but not with some new data I obtained using an updated MinKNOW software from ONT. We're using a MinION Mk1B if that helps.

Also, instead of using a Mac I switched to a Linux computer. Here are the specs:

System: Ubuntu 22.04.1 LTS
Nextflow Version: 22.04.5.5708
Guppy Version: 3.4.5 (MOP2 Default)

MOP2 Preprocess config file:

params {
    conffile            = "/media/tj/Silver/Research/Nanopore/data/U87-WT-12-20-21/final_summary_FAR91556_a92bf75a.txt"
    fast5               = "/media/tj/Silver/Research/Nanopore/data/U87-WT-12-20-21/**/*.fast5"
    fastq               = ""

    reference           = "/media/tj/Silver/Research/Nanopore/anno/GRCh38.p13.genome.fa"
    annotation          = "/media/tj/Silver/Research/Nanopore/anno/gencode.v35.annotation.gtf"
    ref_type            = "genome"

    pars_tools          = "drna_tool_unsplice_guppy6_opt.tsv" 
    output              = "/media/tj/Silver/Research/Nanopore/output/mp_output/WT_122021_New"
    qualityqc           = 5
    granularity         = 1

    basecalling         = "guppy"
    GPU                 = "OFF"
    demultiplexing      = "NO"
    demulti_fast5       = "NO" 

    filtering           = "nanoq"

    mapping             = "minimap2"
    counting            = "NO"
    discovery           = "NO"

    cram_conv           = "YES"
    subsampling_cram    = 50

    saveSpace           = "NO"

    email               = ""
}

I get the following output using either the drna_tool_splice_opt.tsv or the drna_tool_unsplice_guppy6_opt.tsv pars_tools file:

Launching `mop_preprocess.nf` [wise_magritte] DSL2 - revision: ec40fe0af4

╔╦╗╔═╗╔═╗  ╔═╗┬─┐┌─┐┌─┐┬─┐┌─┐┌─┐┌─┐┌─┐┌─┐
║║║║ ║╠═╝  ╠═╝├┬┘├┤ ├─┘├┬┘│ ││  ├┤ └─┐└─┐
╩ ╩╚═╝╩    ╩  ┴└─└─┘┴  ┴└─└─┘└─┘└─┘└─┘└─┘

====================================================
BIOCORE@CRG Master of Pores 2. Preprocessing - N F  ~  version 2.0
====================================================

conffile.                 : /media/tj/Silver/Research/Nanopore/data/U87-WT-12-20-21/no_sample/20211220_1710_MN33054_FAR91556_fc4e9c0a/final_summary_FAR91556_a92bf75a.txt

fast5                     : /media/tj/Silver/Research/Nanopore/data/U87-WT-12-20-21/no_sample/20211220_1710_MN33054_FAR91556_fc4e9c0a/**/*.fast5
fastq                     : 

reference                 : /media/tj/Silver/Research/Nanopore/anno/GRCh38.p13.genome.fa
annotation                : /media/tj/Silver/Research/Nanopore/anno/gencode.v35.annotation.gtf

granularity.              : 1

ref_type                  : genome
pars_tools                : drna_tool_unsplice_guppy6_opt.tsv

output                    : /media/tj/Silver/Research/Nanopore/output/mp_output/WT_122021_New

GPU                       : OFF

basecalling               : guppy 
demultiplexing            : NO 
demulti_fast5             : NO

filtering                 : nanoq
mapping                   : minimap2

counting                  : NO
discovery                 : NO

cram_conv                 : YES
subsampling_cram          : 50

saveSpace                 : NO
email                     : 

Skipping the email

----------------------CHECK TOOLS -----------------------------
basecalling : guppy
> demultiplexing will be skipped
mapping : minimap2
filtering : nanoq
> counting will be skipped
> discovery will be skipped
--------------------------------------------------------------
[-        ] process > flow1:GUPPY_BASECALL:baseCall  -
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef       -
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[94/978a28] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 43
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[94/978a28] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 182
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[94/978a28] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 330
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[94/978a28] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 330
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (3)
[05/6a6a00] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 330
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5_fail---11)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (fast5_fail---11)` terminated with an error exit status (139)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 --disable_qscore_filtering -i ./         --save_path ./fast5_fail---11_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1         --num_callers  8
  cat fast5_fail---11_out/*.fastq >> fast5_fail---11.fastq
  rm fast5_fail---11_out/*.fastq
  gzip fast5_fail---11.fastq

Command exit status:
  139

Command output:
  Unexpected option '--disable_qscore_filtering' found on command line.

  ONT Guppy basecalling software version 3.4.5+fb1fbfb
  config file:        /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_fail---11_out
  chunk size:         1000
  chunks per runner:  512
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 2017 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|

Command error:

  .command.sh: line 2:     8 Segmentation fault      (core dumped) guppy_basecaller --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 --disable_qscore_filtering -i ./ --save_path ./fast5_fail---11_out --gpu_runners_per_device 1 --cpu_threads_per_caller 1 --num_callers 8

Work dir:
  /media/tj/Blackrock/MOP2/mop_preprocess/work/94/978a28df60d8d5a7d737f8983bf45f

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Pipeline BIOCORE@CRG Master of Pore - preprocess completed!
Started at  2022-10-15T16:33:03.559-07:00
Finished at 2022-10-15T16:33:16.581-07:00
Time elapsed: 13s
Execution status: failed
WARN: Killing running tasks (1)

executor >  local (3)
[05/6a6a00] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 1 of 329, failed: 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[33/8b1b63] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

If I navigate to that working directory, this is what I see:

image

This is what the input data directory looks like as well:

image

My fast5 files are in the fast5_pass and fast5_fail folders, respectively. If needed, this is what my conffile looks like:

instrument=MN33054
position=
flow_cell_id=FAR91556
sample_id=no_sample
protocol_group_id=U87-WT-12-20-21
protocol=sequencing/sequencing_MIN106_RNA:FLO-MIN106:SQK-RNA002
protocol_run_id=fc4e9c0a-9cbd-40b7-a1c3-f74800bfc258
acquisition_run_id=a92bf75a860b6357f6a397232239aa7533b416d6
started=2021-12-20T17:12:17.680273-08:00
acquisition_stopped=2021-12-21T17:13:10.132254-08:00
processing_stopped=2021-12-21T17:13:26.237590-08:00
basecalling_enabled=1
sequencing_summary_file=sequencing_summary_FAR91556_a92bf75a.txt
fast5_files_in_final_dest=330
fast5_files_in_fallback=0
fastq_files_in_final_dest=330
fastq_files_in_fallback=0

I'm not sure how to proceed. I am running MOP2 with Docker. Again, I am able to run this on older data that stores all of the fast5 files in one location. I know that since the update, fast5_pass and fast5_fail files are separated, so I'm not sure how to analyze them now.

Thanks for all your help!

Best, TJ

lucacozzuto commented 1 year ago

Hi, the older Guppy does not have the option for separating the reads, so you get this error: Unexpected option '--disable_qscore_filtering' found on command line. With older versions you cannot use the same tool_opts file; you have to use one without that option.
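
For anyone hitting the same message, here is a minimal sketch of producing an options file without the unsupported flag. It assumes the file is tab-separated with three columns (step, tool, extrapars); `strip_basecall_opt` is a hypothetical helper name and is not part of MOP2.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of MOP2): print a pars_tools-style TSV with one
# option removed from the "basecalling guppy" row. Assumes tab-separated columns.
strip_basecall_opt() {
    # $1 = input TSV, $2 = option text to remove
    awk -F'\t' -v OFS='\t' -v opt="$2" '
        $1 == "basecalling" && $2 == "guppy" {
            i = index($3, opt)            # literal match, no regex
            if (i > 0) {
                $3 = substr($3, 1, i - 1) substr($3, i + length(opt))
            }
        }
        { print }
    ' "$1"
}
```

It could be used as `strip_basecall_opt old_opt.tsv --disable_qscore_filtering > new_opt.tsv` (both file names are illustrative), with `pars_tools` in params.config then pointed at the new file.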

Whatsacb commented 1 year ago

Thanks Dr. Cozzuto, I appreciate all the help. I made a new pars_tools file with the --disable_qscore_filtering option, and it appears to be basecalling now. I am following an old protocol from a former graduate student who used the first Master of Pores repo to detect m6A modifications in RNA reads, so I have to update it for MOP2.

I do, however, seem to have a new issue. While it basecalls, minimap2 appears to get stuck at 0% and then the whole thing freezes. Initially I thought it was just running slow, but I let it run overnight and this was the result (I omitted the middle portion of the log to keep it succinct):

N E X T F L O W  ~  version 21.10.6
Launching `mop_preprocess.nf` [cheeky_church] - revision: ecf5447b25

╔╦╗╔═╗╔═╗  ╔═╗┬─┐┌─┐┌─┐┬─┐┌─┐┌─┐┌─┐┌─┐┌─┐
║║║║ ║╠═╝  ╠═╝├┬┘├┤ ├─┘├┬┘│ ││  ├┤ └─┐└─┐
╩ ╩╚═╝╩    ╩  ┴└─└─┘┴  ┴└─└─┘└─┘└─┘└─┘└─┘

====================================================
BIOCORE@CRG Master of Pores 2. Preprocessing - N F  ~  version 2.0
====================================================

conffile                  : /home/nanopore/MOP2/data/U87_WT_122021/final_summary_FAR91556_a92bf75a.txt

fast5                     : /home/nanopore/MOP2/data/U87_WT_122021/**/*.fast5
fastq                     : 

reference                 : /home/nanopore/MOP2/anno/GRCh38.p13.genome.fa
annotation                : /home/nanopore/MOP2/anno/gencode.v35.annotation.gtf

granularity               : 1

ref_type                  : genome
pars_tools                : drna_m6a_opt.tsv

output                    : /home/nanopore/MOP2/output/mp_output/U87_WT_122021_fast5processed

GPU                       : OFF

basecalling               : guppy 
demultiplexing            : NO 
demulti_fast5             : NO

filtering                 : nanoq
mapping                   : minimap2

counting                  : NO
discovery                 : NO

cram_conv                 : YES
subsampling_cram          : 50

saveSpace                 : NO

email                     : 

Skipping the email

----------------------CHECK TOOLS -----------------------------
basecalling : guppy
> demultiplexing will be skipped
mapping : minimap2
filtering : nanoq
> counting will be skipped
> discovery will be skipped
--------------------------------------------------------------
...
executor >  local (41)
[5f/1aaea9] process > flow1:GUPPY_BASECALL:baseCa... [  5%] 18 of 330
[dd/05aa34] process > flow1:NANOQ_FILTER:filter (... [100%] 18 of 18
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   [  0%] 0 of 18
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[2b/7f3161] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5_pass---18)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (fast5_pass---18)` terminated with an error exit status (125)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 --disable_qscore_filtering -i ./         --save_path ./fast5_pass---18_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1         --num_callers  8
  cat fast5_pass---18_out/*.fastq >> fast5_pass---18.fastq
  rm fast5_pass---18_out/*.fastq
  gzip fast5_pass---18.fastq

Command exit status:
  125

Command output:
  Unexpected option '--disable_qscore_filtering' found on command line.

  ONT Guppy basecalling software version 3.4.5+fb1fbfb
  config file:        /home/nanopore/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /home/nanopore/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_pass---18_out
  chunk size:         1000
  chunks per runner:  512
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 1626 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|

Command error:

  /bin/bash: line 3:     7 Terminated              /bin/bash -ue .command.sh
  time="2022-10-17T14:43:51-07:00" level=error msg="error waiting for container: unexpected EOF"

Work dir:
  /home/nanopore/MOP2/mop_preprocess/work/6e/80dd3432828f51525e38e552ce440c

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Pipeline BIOCORE@CRG Master of Pore - preprocess completed!
Started at  2022-10-17T11:27:02.433-07:00
Finished at 2022-10-17T14:43:51.329-07:00
Time elapsed: 3h 16m 49s
Execution status: failed
WARN: Killing pending tasks (3)

executor >  local (41)
[5f/1aaea9] process > flow1:GUPPY_BASECALL:baseCa... [  5%] 19 of 327, failed: 1
[dd/05aa34] process > flow1:NANOQ_FILTER:filter (... [100%] 18 of 18
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   [  0%] 0 of 18
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[2b/7f3161] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

For reference, my pars_tools file drna_m6a_opt.tsv looks like this:

#step tool extrapars
basecalling guppy "--disable_qscore_filtering"
demultiplexing deeplexicon "-f multi -m resnet20-final.h5"
demultiplexing guppy "-m pAmps-final-actrun_newdata_nanopore_UResNet20v2_model.030.h5"
filtering nanofilt ""
filtering nanoq "" 
mapping graphmap ""
mapping graphmap2 "-x rnaseq"
mapping minimap2 "-ax splice -uf -k14"
mapping bwa "" 
counting htseq "-a 0"
counting nanocount ""
discovery bambu ""

Any help or input you have is deeply appreciated (I can also post this as a separate issue). Thanks!

Update: I also tried resetting some of the parameters back to defaults, removing the -m pAmps-final-actrun_newdata_nanopore_UResNet20v2_model.030.h5 in demultiplexing and changing mapping minimap2 back to the default mapping minimap2 "-uf -ax splice -k14", which thus far seems to yield the same result.

lucacozzuto commented 1 year ago

Hi,

The version of Guppy you are using (ONT Guppy basecalling software version 3.4.5+fb1fbfb) does not support the option --disable_qscore_filtering.

Luca
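
One way to check this before launching the pipeline is sketched below. It assumes `guppy_basecaller` is on the PATH and lists its options when run with `--help`; the two helper names are hypothetical, and the banner format is taken from the log output in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the help text on stdin mentions the option.
has_option() {
    grep -q -F -e "$1"
}

# Hypothetical helper: extract the numeric version from a Guppy banner line on stdin,
# e.g. "ONT Guppy basecalling software version 3.4.5+fb1fbfb".
guppy_version_from_banner() {
    sed -n 's/^.*Guppy basecalling software version \([0-9.]*\).*$/\1/p'
}

# Example use (assumption: --help prints the supported options):
#   if guppy_basecaller --help 2>&1 | has_option --disable_qscore_filtering; then
#       echo "option supported"
#   else
#       echo "option not supported, remove it from the pars_tools file"
#   fi
```

The check is a plain substring match on the help text, so it only says whether the option name appears there, not how the option behaves.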

Whatsacb commented 1 year ago

I see, Thanks Luca.

I have a question then going forward: I am running the most recent MinKNOW core (5.3.1) which comes with Guppy 6.3.8 (the most recent version). I intend to basecall later and look for m6A modifications using EpiNano, which I believe has models trained on Guppy 3.1.5. When running the MinKNOW core to sequence, should I save the output only as fast5 files and disable filtering (see image below)? Then, run the fast5 files through MOP2 mop_preprocess and guppy 3.1.5 (using bash INSTALL.sh 3.1.5 when installing MOP2)?

Output screen MinKNOW

Also, with regard to the data I have currently (which is separated into fast5_fail and fast5_pass), am I simply unable to process these data since they've already been sorted?

Thanks!

lucacozzuto commented 1 year ago

Hi, yes, it is recommended to re-basecall if you want to use EpiNano. Files that are already separated do not work with MoP, since it expects everything in one file; you should concatenate them.

Best,

Luca

Whatsacb commented 1 year ago

Hi Luca,

I concatenated all of the fast5 pass and fast5 fail files into one as you suggested. I basically placed all of them into one folder and then executed cat $(ls -t) > FAR91556_master.fast5 while in the subdirectory. Rerunning mop_preprocess.nf with this new file results in the following:

N E X T F L O W  ~  version 22.10.1
Launching `mop_preprocess.nf` [romantic_avogadro] DSL2 - revision: ec40fe0af4

╔╦╗╔═╗╔═╗  ╔═╗┬─┐┌─┐┌─┐┬─┐┌─┐┌─┐┌─┐┌─┐┌─┐
║║║║ ║╠═╝  ╠═╝├┬┘├┤ ├─┘├┬┘│ ││  ├┤ └─┐└─┐
╩ ╩╚═╝╩    ╩  ┴└─└─┘┴  ┴└─└─┘└─┘└─┘└─┘└─┘

====================================================
BIOCORE@CRG Master of Pores 2. Preprocessing - N F  ~  version 2.0
====================================================

conffile.                 : /media/tj/Blackrock/MOP2/data/FAR91556/final_summary_FAR91556_a92bf75a.txt

fast5                     : /media/tj/Blackrock/MOP2/data/FAR91556/fast5_cat/FAR91556_master.fast5
fastq                     : 

reference                 : /media/tj/Blackrock/MOP2/anno/GRCh38.p13.GCLUC.genome.fa
annotation                : /media/tj/Blackrock/MOP2/anno/GRCh38.p13.GCLUC.genome.gtf

granularity.              : 1

ref_type                  : genome
pars_tools                : drna_tool_splice_opt.tsv

output                    : /media/tj/Blackrock/MOP2/output/mp_output/FAR91556_fast5proc

GPU                       : OFF

basecalling               : guppy 
demultiplexing            : NO 
demulti_fast5             : NO

filtering                 : nanoq
mapping                   : minimap2

counting                  : NO
discovery                 : NO

cram_conv                 : YES
subsampling_cram          : 50

saveSpace                 : NO
email                     : terryprins@mednet.ucla.edu

Sending the email to terryprins@mednet.ucla.edu

----------------------CHECK TOOLS -----------------------------
basecalling : guppy
> demultiplexing will be skipped
mapping : minimap2
filtering : nanoq
> counting will be skipped
> discovery will be skipped
--------------------------------------------------------------
executor >  local (2)
[25/d2f9d1] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[f7/62cfbf] process > preprocess_flow:checkRef (C... [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[25/d2f9d1] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[f7/62cfbf] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (2)
[25/d2f9d1] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[f7/62cfbf] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5_cat---1)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (fast5_cat---1)` terminated with an error exit status (139)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./fast5_cat---1_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1         --num_callers  8
  cat fast5_cat---1_out/*.fastq >> fast5_cat---1.fastq
  rm fast5_cat---1_out/*.fastq
  gzip fast5_cat---1.fastq

Command exit status:
  139

Command output:
  ONT Guppy basecalling software version 3.1.5+781ed57
  config file:        /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_cat---1_out
  chunk size:         1000
  chunks per runner:  1000
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 1195 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|

Command error:
  ONT Guppy basecalling software version 3.1.5+781ed57
  config file:        /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5_cat---1_out
  chunk size:         1000
  chunks per runner:  1000
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 1195 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|
  .command.sh: line 2:     8 Segmentation fault      (core dumped) guppy_basecaller --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 -i ./ --save_path ./fast5_cat---1_out --gpu_runners_per_device 1 --cpu_threads_per_caller 1 --num_callers 8

Work dir:
  /media/tj/Blackrock/MOP2/mop_preprocess/work/25/d2f9d1f42381f1d44cc1e5641ff121

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

Pipeline BIOCORE@CRG Master of Pore - preprocess completed!
Started at  2022-12-05T12:02:33.962793297-08:00
Finished at 2022-12-05T12:02:41.772994878-08:00
Time elapsed: 7.8s
Execution status: failed
Failed to invoke `workflow.onComplete` event handler

 -- Check script 'mop_preprocess.nf' at line: 632 or see '.nextflow.log' file for more details

executor >  local (2)
[25/d2f9d1] process > flow1:GUPPY_BASECALL:baseCa... [100%] 1 of 1, failed: 1 ✘
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[f7/62cfbf] process > preprocess_flow:checkRef (C... [100%] 1 of 1 ✔
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... [  0%] 0 of 1

If this is a Guppy issue rather than a mop_preprocess issue, I can also reach out to ONT on their support forum. Any input or suggestions you have are most welcome.
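
As a rough aid for telling the two cases apart, the exit status itself carries a hint: by shell convention a status above 128 means the command was killed by a signal, and 139 is 128 + 11, which matches the "Segmentation fault" line above. The sketch below only decodes the number; `describe_exit_status` is a hypothetical helper, not something Nextflow or MOP2 provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn a process exit status into a short description.
describe_exit_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # Shell convention: 128 + N means the command was killed by signal N.
        echo "terminated by signal $((code - 128))"
    elif [ "$code" -eq 0 ]; then
        echo "success"
    else
        echo "exited with error code $code"
    fi
}

describe_exit_status 139
```

A signal-based status points at the basecaller binary crashing on its input rather than at the pipeline logic around it.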

Thanks again for all your help.

lucacozzuto commented 1 year ago

Hi, sorry, I was referring to concatenating the fastq files, not the fast5 files. For fast5, I think you can just put all of them in a folder whose name will be your sample name and feed it to Master of Pores.
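
A minimal sketch of that folder layout, assuming the run directory contains fast5_pass and fast5_fail subfolders and that the destination folder (named after the sample) lives outside the run directory. `collect_fast5` is a hypothetical helper; it copies files one by one and skips any file whose name already exists in the destination instead of overwriting it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy every .fast5 found under SRC into one flat folder DEST.
collect_fast5() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    find "$src" -type f -name '*.fast5' | while IFS= read -r f; do
        target="$dest/$(basename "$f")"
        if [ -e "$target" ]; then
            # Never overwrite: report the clash and keep the first copy.
            echo "name clash, skipping: $f" >&2
            continue
        fi
        cp "$f" "$target"
    done
}
```

For example, `collect_fast5 /path/to/run /path/to/sampleA` (paths are illustrative), after which the `fast5` parameter would point at `/path/to/sampleA/*.fast5`.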

Whatsacb commented 1 year ago

Hi Luca,

Thanks for all your help. If I put all of the fast5 files into one folder and run mop_preprocess on that, I get the same result:

N E X T F L O W  ~  version 22.10.1
Launching `mop_preprocess.nf` [romantic_wright] DSL2 - revision: ec40fe0af4

╔╦╗╔═╗╔═╗  ╔═╗┬─┐┌─┐┌─┐┬─┐┌─┐┌─┐┌─┐┌─┐┌─┐
║║║║ ║╠═╝  ╠═╝├┬┘├┤ ├─┘├┬┘│ ││  ├┤ └─┐└─┐
╩ ╩╚═╝╩    ╩  ┴└─└─┘┴  ┴└─└─┘└─┘└─┘└─┘└─┘

====================================================
BIOCORE@CRG Master of Pores 2. Preprocessing - N F  ~  version 2.0
====================================================

conffile.                 : /media/tj/Blackrock/MOP2/data/FAR91556/final_summary_FAR91556_a92bf75a.txt

fast5                     : /media/tj/Blackrock/MOP2/data/FAR91556/fast5/*.fast5
fastq                     : 

reference                 : /media/tj/Blackrock/MOP2/anno/GRCh38.p13.GCLUC.genome.fa
annotation                : /media/tj/Blackrock/MOP2/anno/GRCh38.p13.GCLUC.genome.gtf

granularity.              : 1

ref_type                  : genome
pars_tools                : drna_tool_splice_opt.tsv

output                    : /media/tj/Blackrock/MOP2/output/mp_output/FAR91556_fast5proc

GPU                       : OFF

basecalling               : guppy 
demultiplexing            : NO 
demulti_fast5             : NO

filtering                 : nanoq
mapping                   : minimap2

counting                  : NO
discovery                 : NO

cram_conv                 : YES
subsampling_cram          : 50

saveSpace                 : NO
email                     : terryprins@mednet.ucla.edu

Sending the email to terryprins@mednet.ucla.edu

----------------------CHECK TOOLS -----------------------------
basecalling : guppy
> demultiplexing will be skipped
mapping : minimap2
filtering : nanoq
> counting will be skipped
> discovery will be skipped
--------------------------------------------------------------
executor >  local (3)
[09/f46eb6] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 117
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef       [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (4)
[89/969df5] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 314
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef       [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

executor >  local (5)
[b2/79b72c] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 0 of 314
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef       [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -
Error executing process > 'flow1:GUPPY_BASECALL:baseCall (fast5---7)'

Caused by:
  Process `flow1:GUPPY_BASECALL:baseCall (fast5---7)` terminated with an error exit status (139)

Command executed:

  guppy_basecaller          --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002  -i ./         --save_path ./fast5---7_out         --gpu_runners_per_device 1         --cpu_threads_per_caller 1         --num_callers  8
  cat fast5---7_out/*.fastq >> fast5---7.fastq
  rm fast5---7_out/*.fastq
  gzip fast5---7.fastq

Command exit status:
  139

Command output:
  ONT Guppy basecalling software version 3.1.5+781ed57
  config file:        /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5---7_out
  chunk size:         1000
  chunks per runner:  1000
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 1169 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|

Command error:
  ONT Guppy basecalling software version 3.1.5+781ed57
  config file:        /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg
  model file:         /media/tj/Blackrock/MOP2/mop_preprocess/bin/ont-guppy/data/template_rna_r9.4.1_70bps_hac.jsn
  input path:         ./
  save path:          ./fast5---7_out
  chunk size:         1000
  chunks per runner:  1000
  records per file:   4000
  num basecallers:    8
  cpu mode:           ON
  threads per caller: 1

  Found 1 fast5 files to process.
  Init time: 1169 ms

  0%   10   20   30   40   50   60   70   80   90   100%
  |----|----|----|----|----|----|----|----|----|----|
  .command.sh: line 2:     8 Segmentation fault      (core dumped) guppy_basecaller --fast5_out --flowcell FLO-MIN106 --kit SQK-RNA002 -i ./ --save_path ./fast5---7_out --gpu_runners_per_device 1 --cpu_threads_per_caller 1 --num_callers 8

Work dir:
  /media/tj/Blackrock/MOP2/mop_preprocess/work/09/f46eb63261460299561c22e1ab2b61

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Pipeline BIOCORE@CRG Master of Pore - preprocess completed!
Started at  2022-12-05T20:57:32.871228407-08:00
Finished at 2022-12-05T20:57:43.065421503-08:00
Time elapsed: 10.2s
Execution status: failed
Failed to invoke `workflow.onComplete` event handler

 -- Check script 'mop_preprocess.nf' at line: 632 or see '.nextflow.log' file for more details
WARN: Killing running tasks (4)

executor >  local (5)
[b2/79b72c] process > flow1:GUPPY_BASECALL:baseCa... [  0%] 1 of 310, failed: 1
[-        ] process > flow1:NANOQ_FILTER:filter      -
[-        ] process > preprocess_flow:MinIONQC       -
[-        ] process > preprocess_flow:MINIMAP2:map   -
[-        ] process > preprocess_flow:SAMTOOLS_CA... -
[-        ] process > preprocess_flow:SAMTOOLS_SO... -
[-        ] process > preprocess_flow:SAMTOOLS_IN... -
[-        ] process > preprocess_flow:checkRef       [  0%] 0 of 1
[-        ] process > preprocess_flow:bam2Cram       -
[-        ] process > preprocess_flow:bam2stats      -
[-        ] process > preprocess_flow:joinAlnStats   -
[-        ] process > preprocess_flow:NANOPLOT_QC... -
[-        ] process > preprocess_flow:concatenate... -
[-        ] process > preprocess_flow:FASTQC:fastQC  -
[-        ] process > preprocess_flow:MULTIQC:mak... -

A few bits of information that might also be useful:

  1. I have another raw dataset of fast5 files that was run using this exact same pipeline/setup just fine, but those files were obtained using an older version of ONT's MinKNOW software. Between that run and this one, the MinKNOW software was updated and now contains a native Guppy 6.3.8. I presume that this isn't an issue for basecalling through MOP2 at a later time with an older Guppy version, so long as the fast5 files are not separated into fail / pass reads. In the future, I will disable qscore filtering when I run the physical samples on the nanopore, but it would be useful if I could sort out the issue and get the data from this run too.
  2. I installed MOP2 with Guppy 3.1.5 (via "bash INSTALL.sh 3.1.5") instead of the default 3.4.5. The reason is that I plan to process my samples individually and compare them to the EpiNano model provided in the EpiNano repo, which was basecalled with Guppy 3.1.5. They state that I should use the same Guppy version if I am going to do this.
  3. My system specs, just in case you need them:
OS: Ubuntu 22.04.1 LTS
CPU: Intel i9-13900K 24-Core
RAM: 128 GB DDR4 3600 MHz
GPU: Nvidia 3090 Ti
SSD: 4 TB

I am also using docker.

Best, TJ

lucacozzuto commented 1 year ago

Hi, I just see a Segmentation fault (core dumped) error, so I'm not sure what is going wrong with Guppy. This seems related to that tool rather than to the pipeline, so I don't think I can help more on this. Maybe @ADelgadoT has some idea...
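
For readers matching the exit status in the log to the segmentation fault: shells conventionally report 128 plus the signal number when a command is killed by a signal, and 139 = 128 + 11, where signal 11 is SIGSEGV on Linux. The helper below is a minimal sketch of that decoding; the function name `describe_exit_status` is made up for illustration, and a program can in principle exit with a value above 128 on its own.

```shell
# Describe a shell exit status. By convention, values above 128 mean
# "killed by signal (status - 128)". Hypothetical helper for illustration.
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signal=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix
        name=$(kill -l "$signal" 2>/dev/null) || name=unknown
        echo "killed by signal $signal (SIG$name)"
    elif [ "$status" -eq 0 ]; then
        echo "success"
    else
        echo "exited with error code $status"
    fi
}

describe_exit_status 139
```

This is consistent with the problem being inside `guppy_basecaller` itself rather than in the later `cat`, `rm`, or `gzip` steps of the task script.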

Whatsacb commented 1 year ago

Alright. I can also post on Oxford Nanopore's forums with regard to Guppy. Thanks for letting me know and helping me troubleshoot.

Just to confirm though, going forward, I shouldn't have an issue using the pipeline if the raw data was acquired on the latest MinKNOW software? So long as the raw data is stored as .fast5 files with qscore filtering disabled? Thanks!

lucacozzuto commented 1 year ago

Yes, I'm not aware of any problems with this. Let me know what answer you get from the forum!

Whatsacb commented 1 year ago

Hi Luca,

So I tried running the standalone version of Guppy 3.1.5 outside of Master of Pores (just to see if I get the same error) and I also got a Segmentation Fault. Using the latest version of Guppy (6.4.2) seemed to work fine. I posted on the ONT forum and didn't get a response, but a discussion with ONT's support team yielded the following:

Late last year, we made some changes to the fast5 files (see link below), likely impacting backward compatibility with older Guppy but not the forward compatibility, as you observed with your data.

https://community.nanoporetech.com/posts/proposed-changes-to-the-fa

I did find an article that lists a number of available tools designed to detect RNA modifications. While some of the tools listed there might still be based on older versions of Guppy, it might be worth taking a look to see if there is something that can work for your project.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9650216/#s5title

Hope this information is helpful. Please let me know if you have any questions.

The link to the first article is the following post:

Dear Nanopore Community

In an upcoming release of MinKNOW (currently projected to be 21.10 – we'll announce the precise version as soon as it's been finalised) we are looking to make some changes to the fast5 and fastq files. Prior to implementing this we want to ensure you are aware and can make any necessary changes to your analysis pipelines.

Fast5 Update

Based on our understanding of how fast5 files are used, the primary use cases are archival storage and re-basecalling with updated/alternative models. Additions such as custom signal-based analyses, and modified base analysis are secondary use cases. For this fast5 update we are planning to retain support for the primary use case but remove support for the secondary use cases as this analysis is available elsewhere.

For signal-based analyses which often require the "move table" to map signal to sequence, but since this mapping is approximate there is often a subsequent refinement step in those analyses. Approximate mapping can be obtained by linear interpolation between signal and sequence, or alternatively the move table can be regenerated on the fly using the pyguppy client. Further instructions will be provided.

Modified base outputs are now available in the BAM file output (aligned or unaligned) and we recommend all users migrate to that format for future work.

We are making this change to help duplication of data and is useful for an upcoming improvement for detection and splitting of chimeric reads.

So it looks like .fast5 files generated by the latest MinKNOW software (packaged with Guppy 6.3.8 or later) are not backwards compatible with older versions of Guppy. Therefore, if you generate your .fast5 files using a newer MinKNOW/Guppy version, you may not be able to process them with Guppy 3. You will need to either install a more recent version of Guppy with MOP2 or downgrade MinKNOW to a pre-21.10 version.
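
A quick pre-flight check along these lines might look like the sketch below. It only compares a MinKNOW version string against a cutoff. The 21.10 default comes from the announcement quoted above, which called it a projection, so treat it as an assumption. The function names are hypothetical, and the sketch expects versions with at least a major.minor part.

```shell
# Cutoff at which the fast5 format change was projected to land (assumption).
FAST5_CHANGE_CUTOFF=${FAST5_CHANGE_CUTOFF:-21.10}

# Return 0 if dotted version $1 is >= $2, comparing major then minor numerically.
version_ge() {
    a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
    b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -gt "$b_major" ]
    else
        [ "$a_minor" -ge "$b_minor" ]
    fi
}

# Return 0 if fast5 files from this MinKNOW version may be unreadable by old Guppy.
minknow_may_break_old_guppy() {
    version_ge "$1" "$FAST5_CHANGE_CUTOFF"
}

# Example with a made-up version string:
if minknow_may_break_old_guppy "22.10.1"; then
    echo "fast5 files may need a newer Guppy than 3.x"
fi
```

In this thread only Guppy 3.1.5 (crashes) and 6.4.2 (works) were tried, so the check deliberately says nothing about the Guppy versions in between.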

Thanks again for all your help Luca. Hopefully this provides some insight in case anyone else runs into the same issue.

P.S. Standalone Guppy 3.1.5 doesn't work well on modern Linux computers because 32-bit libraries (such as libidn.so.11) are not included with 64-bit Linux. I had to manually download them and place them in the correct folders to get it to work without using a container.
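
If anyone needs to find which shared libraries are missing before hunting them down, `ldd` marks unresolved ones with "not found". The small filter below is a sketch that prints just those names. The function name `list_missing_libs` is made up, and the binary path in the usage comment is an assumption about where Guppy is installed.

```shell
# Read `ldd` output on stdin and print the names of unresolved libraries.
# Hypothetical helper for illustration.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (adjust the path to your own Guppy install):
# ldd ./ont-guppy/bin/guppy_basecaller | list_missing_libs
```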