nanoporetech / DTR-phage-pipeline


scipy RecursionError #9

Closed GeoMicroSoares closed 2 years ago

GeoMicroSoares commented 2 years ago

Hi there,

While running the pipeline I hit a recursion error (part of the traceback below) that originates in scipy. I've tried raising the recursion limit to 100000 (sys.setrecursionlimit(100000)) in the conda environment created during the snakemake run, but with no success. Can you suggest a fix, @jmeppley? Let me know what other information I can provide. Thanks!

  File "/mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 3614, in _dendrogram_calculate_info
    _dendrogram_calculate_info(
  File "/mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 3581, in _dendrogram_calculate_info
    _dendrogram_calculate_info(
  File "/mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 3581, in _dendrogram_calculate_info
    _dendrogram_calculate_info(
  File "/mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 3511, in _dendrogram_calculate_info
    _append_singleton_leaf_node(Z, p, n, level, lvs, ivl,
  File "/mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 3385, in _append_singleton_leaf_node
    ivl.append(str(int(i)))
RecursionError: maximum recursion depth exceeded while getting the str of an object
[Tue Jul  6 12:49:37 2021]
Error in rule bin_ava_clustering:
    jobid: 1414
    output: /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/146/146.clust.heatmap.png, /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/146/146.clust.info.csv
    conda-env: /mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054
    shell:
        python /mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/scripts/cluster_ava_alignments.py -p /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/146/146.clust  -s 0.8 -n 5 /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/146/146.ava.paf
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

In case it's of any use: some of the read IDs in the .paf file start with a number, e.g. 3a93a918-197a-42e1-bb8f-c58fdbdb2b16, and I've seen reports that recursion errors can happen with scipy because of this?
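
For what it's worth, this class of error is easy to reproduce outside the pipeline, since scipy draws dendrograms recursively and the recursion depth grows with the depth of the cluster tree. A minimal sketch (illustrative only, not pipeline code; the sizes are made up):

    # Chain-like data: single linkage on 1-D points with strictly increasing gaps
    # yields a maximally unbalanced tree, so traversal depth grows linearly with n.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, dendrogram

    n = 3000  # more leaves than Python's default recursion limit of 1000
    X = (np.arange(n, dtype=float) ** 2).reshape(-1, 1)
    Z = linkage(X, method="single")

    dendrogram(Z, no_plot=True)  # RecursionError: maximum recursion depth exceeded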

GeoMicroSoares commented 2 years ago

Just an update: I had to add a line to cluster_ava_alignments.py that raises Python's recursion limit before the rest of the script runs, and that helped (currently running): sys.setrecursionlimit(10000). This may overload other machines, but it's working so far on the server I'm using.
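
Concretely, the whole change is roughly two lines near the top of the script (a sketch of what I did, placed before any of the scipy clustering runs):

    import sys
    sys.setrecursionlimit(10000)  # Python's default is 1000; pick a value your stack can handle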

jmeppley commented 2 years ago

That's a good solution. We should add a configuration option to let you do that without modifying the code.

In the meantime, for anyone who doesn't want to muck about in the code, or who doesn't have the computing resources to handle really complicated clusters, the "solution" we came up with was to skip the clusters that throw these errors. See the README for the SKIP_BINS configuration option.

GeoMicroSoares commented 2 years ago

Hi again @jmeppley, I just ran into a segmentation fault while the pipeline tried to run cluster_ava_alignments.py on bin -1 (the unbinned reads), even though my config.yml specifies that this bin should be skipped:

SKIP_BINS:
    sample:
        1D:
            v1: [-1]
            v2: [-1]

Is there a way to force this somehow? Error message below. Thanks!

python /mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/scripts/cluster_ava_alignments.py -p /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/-1/-1.clust  -s 0.8 -n 5 /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/-1/-1.ava.paf
Activating conda environment: /mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/.snakemake/conda/60ec611e186987fffbc0b287faa5a054
/bin/bash: line 1: 16405 Segmentation fault      (core dumped) python /mnt/XIO_4_via_NFS/test_dtr/DTR-phage-pipeline/scripts/cluster_ava_alignments.py -p /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/-1/-1.clust -s 0.8 -n 5 /data6/test_dtr/sample/1D/v1/kmer_binning/refine_bins/alignments/-1/-1.ava.paf

Quick edit: only v1 is running now (I commented v2 out of the config.yml), but nothing changed; bin -1 is still being processed. And --keep-going doesn't help after the segfault..

jmeppley commented 2 years ago

Yeah, that seems odd to me. Can you send your entire config file?

GeoMicroSoares commented 2 years ago

Sure thing, follows below:

$ cat ../test_dtr/DTR-phage-pipeline/config.yml
#####################
# Editing mandatory #
#####################

# Run details. These primarily determine where the output ends up (see output_root below)
sample: "sample"
stype: "1D"
version: "v1"

# Path of the read (*.fasta) & sequencing summary files you want to analyze
input_fasta: /data6/test_dtr/merged_SM_L_sup.fasta
input_summary: /data6/test_dtr/sequencing_summary.txt

# Path where you would like pipeline results to be written.
# Run outputs will end up in the directory: <output_root>/<sample>/<stype>/<version>/
output_root: /data6/test_dtr/

max_threads: 32 # Maximum number of threads the pipeline will attempt to use
pre_filter: DTR # Only process reads with DTR (None: all reads, Virsorter: virsorter)
                #  virsorter is purely experimental at this point

# Medaka model depends on whether MinION or PromethION used for sequencing and which Guppy version
# was used for basecalling. Please see Medaka documentation for list of available models.
MEDAKA:
    model: 'r941_min_high_g303'
    threads: 4 # Should be << max_threads so that multiple genomes can be polished in parallel

KAIJU:
    db: /data6/databases/kaijudb
    switch: '-a greedy -e 5' # greedy mode allowing up to 5 substitutions; slower but more sensitive than default params
    run: True                # Set to False to skip kaiju
                             # will also be skipped if DB is missing

# Specify which k-mer bin to skip for each sample/run/version. HDBSCAN assigns all unbinned reads to bin_id = -1,
# so this bin should always be skipped. But occasionally other 'bad' bins require skipping if they generate downstream
# errors in the pipeline.
SKIP_BINS:
    sample:
        1D:
            v1: [-1]
#            v2: [-1]

####################
# Editing optional #
####################

DTR_FIND:
    ovlp: 20 # Percent of read start/end to check for alignment
    chunksize: 5000

VIRSORTER:
    db: 1             # Either "1" (DEFAULT Refseqdb) or "2" (Viromedb)
    diamond: True     # False: blastp, True: diamond
    virome: False     # True: input is a virome, don't train on data
    prophage: False   # False: only use phage, True: use prophage, too
    data_dir: '/mnt/arginine/galaxydata/seqdbs/virsorter-data'
                      # location of virsorter-data

KMER_CALC:
    k: 5      # k-mer size to use for k-mer binning
    cli: '-c' # Use raw k-mer counts instead of normalized k-mer frequencies (best for genome-spanning reads)

UMAP:
    min_rl: 5000       # Minimum read length to consider in analysis
    max_cb_rl: 60000   # Maximum read length to include in colorbar for UMAP scatterplot annotated by read length
    min_d: 0.1         # min_dist parameter for UMAP
    neighbors: 15      # n_neighbors parameter for UMAP
    min_q: 9           # Filtering out low qscore reads prior to running UMAP improves resolution of bins
    bin_min_reads: 10  # Don't create a bin if the bin contains < bin_min_reads
    scatter_size: 7    # Marker size for UMAP scatter plots
    scatter_alpha: 0.2 # Marker alpha for UMAP scatter plots

BIN_AVA_ALIGN:
    MINIMAP2:
        cli: '-x ava-ont --no-long-join -r100'
        threads: 1
    CLUST:
        min_score_frac: 0.8
        min_reads: 5

DTR_ALIGN:
    ovlp: 20   # Percent of read start/end to check for alignment
    threads: 1 # Single thread is optimum for aligning single read start/end

NUCMER:
    mincov: 98.0 # Genome-genome percent coverage threshold for deduplicating polished genomes
    minpct: 98.0 # Genome-genome percent identity threshold for deduplicating polished genomes

MINIMAP2:
    STANDARD:
        switches: '-x map-ont'
        threads: 4

RACON:
    repeat: 3 # Number of iterations to run
    include-unpolished: True
    window-length: 500
    quality-threshold: 9
    error-threshold: 0.3
    match: 5
    mismatch: -4
    gap: -8
    threads: 4
    fragment-correction: True ## should not contain dual/self overlaps

PORECHOP:
    switches: '--no_split'

PRODIGAL:
    threads: 4

LIN_CONCAT:
    ovlp_len: 3000
    minlen: 15000

jmeppley commented 2 years ago

It looks like a past update broke the SKIP_BINS functionality by getting strings and ints mixed up. I'm working on a fix, and also on adding an option to override Python's max recursion depth if your system has the resources.
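
For the curious, the failure mode looks roughly like this (an illustrative sketch, not the actual pipeline code):

    # SKIP_BINS from config.yml parses as a list of ints, but a bin ID taken
    # from a directory name arrives as a string, so the membership test fails.
    skip_bins = [-1]
    bin_id = "-1"

    print(bin_id in skip_bins)        # False -> bin -1 is NOT skipped
    print(int(bin_id) in skip_bins)   # True  -> normalizing the types fixes it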

jmeppley commented 2 years ago

@GeoMicroSoares, let me know if the latest code works for you

GeoMicroSoares commented 2 years ago

Hey @jmeppley, everything works perfectly, and the run just finished! I only had to make a couple of quick edits to .yml files:

envs/medaka-0.11.0.yml:

envs/racon.yml:

Hope this helps other people and thanks so much for your help!