acids-ircam / ddsp_pytorch

Implementation of Differentiable Digital Signal Processing (DDSP) in Pytorch
Apache License 2.0
451 stars 56 forks

Work-around for preprocessing failing with: `INTERNAL: Failed initializing math mode` #35

Open bluenote10 opened 2 years ago

bluenote10 commented 2 years ago

I was trying to run the preprocessing, but it failed with a TensorFlow error at the step where it tries to apply crepe. The error is somewhat cryptic (`INTERNAL: Failed initializing math mode`), with a similarly cryptic traceback:

Full traceback

```
2022-09-04 14:34:30.720585: E tensorflow/stream_executor/cuda/cuda_blas.cc:197] failed to set new cublas math mode: CUBLAS_STATUS_INVALID_VALUE
2022-09-04 14:34:30.720614: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:438 : INTERNAL: Failed initializing math mode
/home/fabian/Dropbox/Temp/ReferenceAudio/Landola/Snapshot1/recording_normalized.mp3: 0%| | 0/1 [00:34 main()
  File "preprocess.py", line 73, in main
    x, p, l = preprocess(f, **config["preprocess"])
  File "preprocess.py", line 27, in preprocess
    pitch = extract_pitch(x, sampling_rate, block_size)
  File "/home/fabian/git/_ext/ddsp_pytorch/ddsp/core.py", line 101, in extract_pitch
    f0 = crepe.predict(
  File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/crepe/core.py", line 255, in predict
    activation = get_activation(audio, sr, model_capacity=model_capacity,
  File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/crepe/core.py", line 212, in get_activation
    return model.predict(frames, verbose=verbose)
  File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: Graph execution error:

Detected at node 'model/classifier/MatMul' defined at (most recent call last):
    File "preprocess.py", line 91, in
      main()
    File "preprocess.py", line 73, in main
      x, p, l = preprocess(f, **config["preprocess"])
    File "preprocess.py", line 27, in preprocess
      pitch = extract_pitch(x, sampling_rate, block_size)
    File "/home/fabian/git/_ext/ddsp_pytorch/ddsp/core.py", line 101, in extract_pitch
      f0 = crepe.predict(
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/crepe/core.py", line 255, in predict
      activation = get_activation(audio, sr, model_capacity=model_capacity,
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/crepe/core.py", line 212, in get_activation
      return model.predict(frames, verbose=verbose)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 2033, in predict
      tmp_batch_outputs = self.predict_function(iterator)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 1845, in predict_function
      return step_function(self, iterator)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 1834, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 1823, in run_step
      outputs = model.predict_step(data)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 1791, in predict_step
      return self(x, training=False)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 490, in __call__
      return super().__call__(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/base_layer.py", line 1014, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 92, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 458, in call
      return self._run_internal_graph(
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 596, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/engine/base_layer.py", line 1014, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 92, in error_handler
      return fn(*args, **kwargs)
    File "/home/fabian/.virtualenvs/ddsp_pytorch/lib/python3.8/site-packages/keras/layers/core/dense.py", line 221, in call
      outputs = tf.matmul(a=inputs, b=self.kernel)
Node: 'model/classifier/MatMul'
Failed initializing math mode
	 [[{{node model/classifier/MatMul}}]] [Op:__inference_predict_function_723]
```

I looked into this a bit, and it appears to be related to the following upstream TensorFlow issue: https://github.com/tensorflow/tensorflow/issues/57359

I'm mainly opening this issue to inform others of a simple work-around in case they run into this as well: place a dummy `import tensorflow` at the beginning of `preprocess.py`. The problem apparently has to do with importing PyTorch first and TensorFlow afterwards, which messes up TensorFlow's GPU initialization. Running the TensorFlow import first seems to avoid it.
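
For anyone who wants to copy it, the import order can be sketched roughly as below. This is only a minimal sketch, assuming the only thing that matters is that TensorFlow gets loaded before PyTorch. The helper `torch_imported_first` is a hypothetical name I made up for illustration; it is not part of ddsp_pytorch, crepe, or TensorFlow. The `find_spec` guard is there only so the snippet also runs on machines without TensorFlow installed.

```python
import importlib.util
import sys
import warnings


def torch_imported_first(loaded=None):
    """Return True if torch is already loaded while tensorflow is not.

    Hypothetical helper for illustration only. `loaded` defaults to
    sys.modules, the table of modules imported so far in this process.
    """
    loaded = sys.modules if loaded is None else loaded
    return "torch" in loaded and "tensorflow" not in loaded


# Put this at the very top of the script, before anything that imports torch.
if torch_imported_first():
    # Too late for the work-around: torch is already in this process.
    warnings.warn("torch was imported before tensorflow")
elif importlib.util.find_spec("tensorflow") is not None:
    # Dummy import: the module is not used here, it is imported only so that
    # TensorFlow is loaded before PyTorch.
    import tensorflow  # noqa: F401

# ... the script's usual imports (the ones that pull in torch) follow here.
```

In the simplest case the whole thing reduces to a single `import tensorflow` line above every other import. The check is just a way to notice when some other module has already pulled in torch.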