ModelStoreTF exception <class 'tensorflow.python.framework.errors_impl.InternalError'>

Hi, I ran into the problem below while using medaka. Do you have any idea how to solve it? I would be very grateful for any help. Thank you in advance.

Describe the bug

I am using medaka to generate a consensus from an initial flye assembly and chopper-filtered reads, and I get the following error:

[18:14:04 - MdlStrTF] ModelStoreTF exception <class 'tensorflow.python.framework.errors_impl.InternalError'>
Traceback (most recent call last):
  File "/home/zfp_da03/anaconda3/envs/medaka/bin/medaka", line 11, in <module>
    sys.exit(main())
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/medaka/medaka.py", line 814, in main
    args.func(args)
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/medaka/prediction.py", line 167, in predict
    remainder_regions_depth = run_prediction(
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/medaka/prediction.py", line 47, in run_prediction
    class_probs = model.predict_on_batch(x_data)
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/keras/engine/training.py", line 2474, in predict_on_batch
    outputs = self.predict_function(iterator)
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/zfp_da03/anaconda3/envs/medaka/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: Graph execution error:

Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 3, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 256, 128, 1, 10000, 100, 0] 
     [[{{node CudnnRNN}}]]
     [[sequential/bidirectional_1/backward_gru2/PartitionedCall]] [Op:__inference_predict_function_3295]
Failed to run medaka consensus.

The script I used:

# Define the number of cores to use
current_hour=$(date +"%H")
if [ $current_hour -ge 9 ] && [ $current_hour -lt 17 ]; then
    processcores=16
else
    processcores=48
fi
echo "Number of cores to use: $processcores"

# Define the model to use for medaka polishing
model="r1041_e82_400bps_sup_v4.3.0"

# Set the environment variable to allow GPU growth
export TF_FORCE_GPU_ALLOW_GROWTH=true

echo "Perform polishing with initial assembly from raw reads for sample: $sample"

# Define the path to the initial assembly from raw reads
raw_assembly_path="/ten_TB/Hua/Nanopore_hybrid/fastq_pass/${sample}/${sample}_flye_raw/assembly.fasta"
# Define the path to the raw reads
raw_reads_path="/ten_TB/Hua/Nanopore_hybrid/fastq_pass/${sample}/${sample}_merged.fastq.gz"
# Define the path to the output directory
output_dir="/ten_TB/Hua/Nanopore_hybrid/fastq_pass/${sample}/${sample}_medaka_raw"
# Perform the polishing with initial assembly from raw reads
medaka_consensus \
-i "$raw_reads_path" \
-d "$raw_assembly_path" \
-o "$output_dir" \
-t "$processcores" \
-m "$model"
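
The `batch_size` of 100 in the failing cuDNN call matches the default of the `-b` option of `medaka_consensus`, and the card has 8 GiB of memory, so a smaller batch may be worth trying. The sketch below is an untested assumption rather than a confirmed fix: `first_working_batch` and `polish_with_batch` are hypothetical helpers, not part of medaka, and the candidate sizes are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of medaka): run a command once per candidate
# batch size, passing the size as its only argument, and print the first
# size for which the command exits successfully.
first_working_batch() {
    local runner="$1"; shift
    local b
    for b in "$@"; do
        if "$runner" "$b"; then
            echo "$b"
            return 0
        fi
    done
    return 1
}

# Wrapper around the call from the script above, with the batch size added
# through -b. The variables are the ones defined in that script.
polish_with_batch() {
    medaka_consensus \
        -i "$raw_reads_path" \
        -d "$raw_assembly_path" \
        -o "$output_dir" \
        -t "$processcores" \
        -m "$model" \
        -b "$1"
}

# Usage (assumed candidate sizes, largest first):
# first_working_batch polish_with_batch 100 50 25
```

If a smaller value gets past the `CudnnRNN` step, that would point to GPU memory as the likely cause; if every size fails in the same way, the problem is probably elsewhere.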


Environment:

Installation method: conda
OS: Ubuntu 20.04
medaka version: 1.11.3

GPU model:


Tue Mar 26 19:53:03 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070        On  | 00000000:AF:00.0 Off |                  N/A |
|  0%   40C    P8              25W / 220W |      2MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

CUDA version:

cuda-version    11.8         h70ddcb2_2     conda-forge
cudatoolkit     11.8.0       h4ba93d1_13    conda-forge
cudnn           8.8.0.121    hcdd5f01_4     conda-forge
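
The driver reports CUDA 12.2 while the conda environment ships cudatoolkit 11.8. NVIDIA drivers generally run applications built against older toolkit versions, so this pairing is not an obvious cause of the error. The check can be written down as the small sketch below; `toolkit_supported` is a hypothetical helper, and comparing only the major and minor numbers of versions written as `major.minor[.patch]` is a simplifying assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the toolkit version (e.g. 11.8.0) is not
# newer than the highest CUDA version the driver reports (e.g. 12.2).
# Assumes both versions contain at least "major.minor".
toolkit_supported() {
    local toolkit="$1" driver="$2"
    local t_major="${toolkit%%.*}" d_major="${driver%%.*}"
    local t_rest="${toolkit#*.}" d_rest="${driver#*.}"
    local t_minor="${t_rest%%.*}" d_minor="${d_rest%%.*}"
    if [ "$t_major" -ne "$d_major" ]; then
        [ "$t_major" -lt "$d_major" ]
    else
        [ "$t_minor" -le "$d_minor" ]
    fi
}

# Values taken from the nvidia-smi and conda output above.
if toolkit_supported 11.8.0 12.2; then
    echo "toolkit within driver range"
else
    echo "toolkit newer than driver supports"
fi
```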

nanoporetech/medaka, issue #501