Problems to test-model with cli

asr-lord commented 3 years ago

I get the following error when I try to test my trained model:

user@user:~/deep-speaker$ export CUDA_VISIBLE_DEVICES=0; python cli.py test-model --working_dir ~/.deep-speaker-wd/triplet-training/ --checkpoint_file checkpoints-triplets/ResCNN_checkpoint_37.h5

2021-08-05 08:36:39,966 - INFO - Picking audio from /root/.deep-speaker-wd/triplet-training.
Initializing the batcher:   0%|                                                                                                                                    | 0/4 [00:00<?, ?it/s]
2021-08-05 08:36:40.033047: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-08-05 08:36:40.051675: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3600000000 Hz
2021-08-05 08:36:40.376533: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-08-05 08:36:40.649327: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-08-05 08:36:40.650659: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
Initializing the batcher: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.59it/s]
test:   0%|                                                                                                                                                       | 0/80 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "cli.py", line 99, in <module>
    cli()
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "cli.py", line 70, in test_model
    test(working_dir, checkpoint_file)
  File "/home/diarization/deep-speaker/test.py", line 67, in test
    fm, tpr, acc, eer = eval_model(working_dir, model=dsm)
  File "/home/diarization/deep-speaker/test.py", line 40, in eval_model
    predictions = model.m.predict(input_data, batch_size=BATCH_SIZE)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1629, in predict
    tmp_batch_outputs = self.predict_function(iterator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 862, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 2941, in __call__
    filtered_flat_args) = self._maybe_define_function(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 3358, in _maybe_define_function
    args, kwargs, flat_args, filtered_flat_args, cache_key_context)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 3280, in _define_function_with_shape_relaxation
    args, kwargs, override_flat_arg_shapes=relaxed_arg_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1478 predict_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1468 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:1259 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2730 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:3417 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1461 run_step  **
        outputs = model.predict_step(data)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1434 predict_step
        return self(x, training=False)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:998 __call__
        input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py:207 assert_input_compatibility
        ' input tensors. Inputs received: ' + str(inputs))

    ValueError: Layer ResCNN expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 160, 64, 1) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 1) dtype=float32>]

My library versions:

Keras==2.4.3
tensorflow-gpu==2.4.1
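
The ValueError at the end of the traceback reports that the ResCNN layer received two tensors, shaped (None, 160, 64, 1) and (None, 1), where it expects one. As a minimal debugging sketch, assuming the object handed to `predict` is either a single array or a list/tuple of arrays, the hypothetical helper `describe_inputs` below (not part of deep-speaker) lists the shapes it finds, which shows whether one array or a pair is being passed:

```python
def describe_inputs(data):
    """Return the shapes of the array-like objects in `data`.

    Hypothetical debugging helper, not part of deep-speaker. It only reads
    the `.shape` attribute, so it works with NumPy arrays and tensors alike.
    """
    # A list or tuple is unpacked into its elements; anything else is
    # treated as one single input.
    items = list(data) if isinstance(data, (list, tuple)) else [data]
    # Objects without a `.shape` attribute are reported as an empty tuple.
    return [tuple(getattr(item, "shape", ())) for item in items]
```

Calling `describe_inputs(input_data)` just before the `model.m.predict(...)` line shown in the traceback would return one shape per object passed; more than one entry would match the "received 2 input tensors" message.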

philipperemy commented 3 years ago

@asr-lord Hmm. First, did the examples and the CLI work with the checkpoint I provided in the README (with your Python configuration)?

asr-lord commented 3 years ago

> @asr-lord hummm. First did the examples and the CLI work with the checkpoint I provided in the README (with your python configuration)?

It didn't work in either case: neither with the provided checkpoint nor with my trained model.

philipperemy commented 3 years ago

@asr-lord can you paste all the commands to reproduce the issue?

asr-lord commented 3 years ago

> @asr-lord can you paste all the commands to reproduce the issue?

./deep-speaker download_librispeech 
./deep-speaker build_mfcc
export CUDA_VISIBLE_DEVICES=0; python cli.py test-model --working_dir ~/.deep-speaker-wd/triplet-training/ --checkpoint_file checkpoints-triplets/ResCNN_checkpoint_37.h5

philipperemy commented 3 years ago

Thank you

philipperemy commented 2 years ago

@asr-lord Sorry for the late reply. You are right, there's definitely something wrong. I'm trying to find a way to fix it.

philipperemy commented 2 years ago

Okay, that's a TensorFlow issue. With tensorflow==2.3 it works; it breaks from TensorFlow 2.4 onward. It's a bit odd, because the example in the README works well with TensorFlow 2.7.
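
For anyone who wants to check their environment before running `test-model`, here is a minimal sketch of a version guard. It assumes, based only on the reports in this thread, that 2.4 and every later release is affected; the function names are hypothetical and not part of deep-speaker, and pre-release strings such as "2.4rc1" are not handled.

```python
def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2.4.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def reported_broken_for_test_model(version):
    """True if `version` is 2.4 or later.

    Assumption taken from this thread only: tensorflow==2.3 works for
    `test-model`, while 2.4 and later fail.
    """
    return parse_major_minor(version) >= (2, 4)


if __name__ == "__main__":
    # Example with the two versions mentioned in the thread.
    for candidate in ("2.3.0", "2.4.1"):
        print(candidate, reported_broken_for_test_model(candidate))
```

In a real environment the string would come from `tensorflow.__version__`.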

philipperemy commented 2 years ago

I will close this issue and mention it in the README.

philipperemy / deep-speaker

Problems to test-model with cli #89