ksahlin / NGSpeciesID

Reference-free clustering and consensus forming of long-read amplicon sequencing
GNU General Public License v3.0

subprocess.CalledProcessError: Command 'medaka_consensus' #1

Closed ksahlin closed 3 years ago

ksahlin commented 4 years ago

The command calling medaka throws an error:

Traceback (most recent call last):
  File "/usr/local/bin/NGSpeciesID", line 283, in <module>
    main(args)
  File "/usr/local/bin/NGSpeciesID", line 155, in main
    consensus.run_medaka(all_reads_file, center_file, medaka_outfolder, "1", args.medaka_model)
  File "/usr/local/lib/python3.7/site-packages/modules/consensus.py", line 103, in run_medaka
    subprocess.check_call(['medaka_consensus', '-i', reads_to_center, "-d", center, "-o", outfolder, "-t", cores, "-m", medaka_model], stdout=output_file, stderr=medaka_stderr)
  File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['medaka_consensus', '-i', './ngspouts_liverbiduck/reads_to_consensus_1.fasta', '-d', './ngspouts_liverbiduck/consensus_reference_1.fasta', '-o', './ngspouts_liverbiduck/medaka_cl_id_1', '-t', '1', '-m', 'r941_min_high_g330']' returned non-zero exit status 1.

ksahlin commented 4 years ago

This error is thrown by the third-party tool medaka and can have several causes, such as conflicting versions of medaka's dependencies.

To find the specific cause, run the failing command directly in the terminal. For the example above, that is:

medaka_consensus -i ./ngspouts_liverbiduck/reads_to_consensus_1.fasta \
                 -d ./ngspouts_liverbiduck/consensus_reference_1.fasta \
                 -o ./ngspouts_liverbiduck/medaka_cl_id_1 -t 1 -m r941_min_high_g330

In this case it returned:

(NGSpeciesID) WCSs-MacBook-Pro:ngspouts_liverbiduck$ medaka_consensus -i reads_to_consensus_1.fasta -d consensus_reference_1.fasta -o medaka_cl_id_1 -t 1 -m r941_min_high_g330
Checking program versions
This is medaka 0.11.5
Program    Version    Required   Pass     
bgzip      1.9        1.9        True     
minimap2   2.17       2.11       True     
samtools   1.9        1.9        True     
tabix      1.9        1.9        True     
Aligning basecalls to draft
Removing previous index file /Users/-----/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/consensus_reference_1.fasta.mmi
Removing previous index file /Users/-----/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/consensus_reference_1.fasta.fai
Constructing minimap index.
[M::mm_idx_gen::0.001*4.31] collected minimizers
[M::mm_idx_gen::0.001*3.45] sorted minimizers
[M::main::0.004*1.90] loaded/built the index for 1 target sequence(s)
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.004*1.89] distinct minimizers: 74 (100.00% are singletons); average occurrences: 1.000; average spacing: 6.027
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -I 16G -x map-ont --MD -d /Users/------/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/consensus_reference_1.fasta.mmi /Users/------/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/consensus_reference_1.fasta
[M::main] Real time: 0.004 sec; CPU: 0.008 sec; Peak RSS: 0.002 GB
[M::main::0.004*1.47] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.004*1.46] mid_occ = 2
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.004*1.45] distinct minimizers: 74 (100.00% are singletons); average occurrences: 1.000; average spacing: 6.027
[M::worker_pipeline::0.030*0.92] mapped 208 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -x map-ont --MD -t 1 -a /Users/------/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/consensus_reference_1.fasta.mmi /Users/-----/Desktop/NGSpeciesID_test/ngspouts_liverbiduck/reads_to_consensus_1.fasta
[M::main] Real time: 0.031 sec; CPU: 0.029 sec; Peak RSS: 0.003 GB
Running medaka consensus
[11:53:23 - Predict] Processing region(s): consensus_cl_id_1_total_supporting_reads_208:0-446
[11:53:23 - Predict] Setting tensorflow threads to 1.
[11:53:23 - Predict] Processing 1 long region(s) with batching.
[11:53:23 - Predict] Using model: /anaconda3/envs/NGSpeciesID/lib/python3.6/site-packages/medaka/data/r941_min_high_g330_model.hdf5.
[11:53:23 - ModelLoad] Building model with cudnn optimization: False
[11:53:24 - DLoader] Initializing data loader
[11:53:24 - PWorker] Running inference for 0.0M draft bases.
[11:53:24 - Sampler] Initializing sampler for consensus of region consensus_cl_id_1_total_supporting_reads_208:0-446.
[11:53:24 - Feature] Pileup counts do not span requested region, requested consensus_cl_id_1_total_supporting_reads_208:0-446, received 3-443.
OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
/anaconda3/envs/NGSpeciesID/bin/medaka_consensus: line 127:  1093 Abort trap: 6           medaka consensus ${CALLS2DRAFT}.bam ${CONSENSUSPROBS} --model ${MODEL} --batch_size ${BATCH_SIZE} --threads ${THREADS}
Failed to run medaka consensus.

One fix that solved this was to install nomkl:

conda install nomkl

This fix is based on replies in this issue: https://github.com/openai/spinningup/issues/16
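
The manual re-run above can also be approximated from Python. The traceback shows that NGSpeciesID redirects medaka's stderr to a file, so the real reason for the failure is not printed with the `CalledProcessError`. The sketch below is not NGSpeciesID code: `run_and_report` is a hypothetical helper, the demo command is a stand-in for the real `medaka_consensus` call, and `capture_output` assumes Python 3.7 or newer.

```python
import subprocess
import sys


def run_and_report(cmd, tail=20):
    """Run cmd and return (exit status, last `tail` lines of its stderr).

    On a non-zero exit status the captured stderr tail is also echoed,
    so the underlying error is visible without re-running by hand.
    """
    # capture_output and text require Python 3.7+
    result = subprocess.run(cmd, capture_output=True, text=True)
    stderr_tail = result.stderr.splitlines()[-tail:]
    if result.returncode != 0:
        print("command failed with exit status", result.returncode, file=sys.stderr)
        for line in stderr_tail:
            print(line, file=sys.stderr)
    return result.returncode, stderr_tail


if __name__ == "__main__":
    # Stand-in command (an assumption for illustration): writes one line to
    # stderr and exits with status 1, like a failing external tool would.
    demo = [sys.executable, "-c",
            "import sys; print('simulated failure', file=sys.stderr); sys.exit(1)"]
    run_and_report(demo)
```

To try it on a real run, the `demo` list would be replaced by the argument list shown in the `CalledProcessError` message.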

ksahlin commented 4 years ago

Another error is this one:

(NGSpeciesID) dhcp-wlan:testNGSpeciesID Ninja$ medaka_consensus -i roedeer/reads_to_consensus_1.fasta -d roedeer/consensus_reference_1.fasta -o roedeer/medaka_cl_id_1 -t 1 -m r941_min_high_g330
Traceback (most recent call last):
  File "/Users/Ninja/anaconda3/envs/NGSpeciesID/bin/medaka", line 7, in <module>
    from medaka.medaka import main
  File "/Users/Ninja/anaconda3/envs/NGSpeciesID/lib/python3.6/site-packages/medaka/medaka.py", line 12, in <module>
    import medaka.features
  File "/Users/Ninja/anaconda3/envs/NGSpeciesID/lib/python3.6/site-packages/medaka/features.py", line 17, in <module>
    import medaka.labels
  File "/Users/Ninja/anaconda3/envs/NGSpeciesID/lib/python3.6/site-packages/medaka/labels.py", line 14, in <module>
    import medaka.rle
  File "/Users/Ninja/anaconda3/envs/NGSpeciesID/lib/python3.6/site-packages/medaka/rle.py", line 12, in <module>
    from ont_fast5_api.fast5_interface import get_fast5_file
ModuleNotFoundError: No module named 'ont_fast5_api.fast5_interface'

This means that medaka was not installed properly on Mac (i.e., without openblas). Reinstall medaka following the instructions in the README:

conda install --yes -c conda-forge -c bioconda medaka openblas==0.3.3 spoa
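
A quick way to check whether the missing module is visible to the active environment, before and after reinstalling, is sketched below. `module_available` is a hypothetical helper written for this illustration, not part of medaka or NGSpeciesID; the only module name taken from this thread is the one in the traceback above.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be located by the interpreter."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # which raises if the parent is missing or broken.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError.
        return False


if __name__ == "__main__":
    # Module named in the ModuleNotFoundError above.
    name = "ont_fast5_api.fast5_interface"
    print(name, "available:", module_available(name))
```

Run it with the Python interpreter of the conda environment in which medaka is installed, since a different interpreter may see a different set of packages.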

ksahlin commented 4 years ago

A third error is this one:

 medaka_consensus -i /Users/kxs624/tmp/sample_h1/reads_to_consensus_17.fasta -d /Users/kxs624/tmp/sample_h1/consensus_reference_17.fasta -o /Users/kxs624/tmp/sample_h1/medaka_cl_id_17 -t 1
readlink: illegal option -- f
usage: readlink [-n] [file ...]

The solution is to install a newer version of medaka when installing NGSpeciesID:

conda install --yes -c conda-forge -c bioconda medaka==0.11.5 openblas==0.3.3 spoa

This has now been included in the NGSpeciesID installation instructions.

edgardomortiz commented 4 years ago

I had used a more recent basecalling model than the default in medaka v0.11.5, and this installation formula worked perfectly too:

conda create -n ngspeciesid -c conda-forge -c bioconda python=3.6 pip "parasail-python>=1.1.10" "edlib>=1.1.2" python-edlib "medaka>=1.0.2" spoa racon minimap2 mmseqs2
conda activate ngspeciesid
pip install NGSpeciesID

ksahlin commented 4 years ago

Hi Edgardo,

Please:

  1. Identify the row in the output that reads subprocess.CalledProcessError: Command '['medaka_consensus', '-i', 'X', '-d', 'Y', '-o', 'Z', '-t', '1', '-m', 'r941_min_high_g330']' returned non-zero exit status 1.
  2. Rerun it directly from the terminal as medaka_consensus -i X -d Y -o Z -t 1 -m r941_min_high_g330
  3. Post the error message of this run here.

X,Y,Z are your paths.
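
Steps 1 and 2 can be sketched in Python as follows. This is only an illustrative helper, not part of NGSpeciesID: the name `command_from_error` is made up here, and it assumes the message has the `Command '[...]' returned non-zero exit status` shape shown in the traceback at the top of this issue.

```python
import ast
import re
import shlex


def command_from_error(message):
    """Turn a CalledProcessError message into a shell-quoted command line."""
    match = re.search(r"Command '(\[.*\])' returned non-zero exit status", message)
    if match is None:
        raise ValueError("no CalledProcessError command found in message")
    # The bracketed part is the repr of a Python list of strings.
    args = ast.literal_eval(match.group(1))
    # Quote each argument so paths with spaces survive copy-pasting.
    return " ".join(shlex.quote(str(arg)) for arg in args)


if __name__ == "__main__":
    # Error line copied from the first traceback in this thread.
    message = (
        "subprocess.CalledProcessError: Command '['medaka_consensus', '-i', "
        "'./ngspouts_liverbiduck/reads_to_consensus_1.fasta', '-d', "
        "'./ngspouts_liverbiduck/consensus_reference_1.fasta', '-o', "
        "'./ngspouts_liverbiduck/medaka_cl_id_1', '-t', '1', '-m', "
        "'r941_min_high_g330']' returned non-zero exit status 1."
    )
    print(command_from_error(message))
```

The printed line can then be pasted into the terminal for step 2, and its output posted for step 3.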

edgardomortiz commented 4 years ago

I didn't get any error; I was just sharing the installation formula that allowed me to use the most recent medaka, v1.0.3, together with NGSpeciesID. Sorry for the confusion caused by hijacking the thread.