f90 / Wave-U-Net

Implementation of the Wave-U-Net for audio source separation
MIT License
824 stars 177 forks source link

libcublas.so.9.0: cannot open shared object file: No such file or directory #6

Closed ghost closed 5 years ago

ghost commented 5 years ago

I'm not sure what's happening. here's the log:

Traceback (most recent call last): File "Predict.py", line 3, in import Evaluate File "/home/user/Wave-U-Net-master/Evaluate.py", line 2, in import tensorflow as tf File "/usr/local/lib/python2.7/dist-packages/tensorflow/init.py", line 24, in from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/init.py", line 49, in from tensorflow.python import pywrap_tensorflow File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in raise ImportError(msg) ImportError: Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in from tensorflow.python.pywrap_tensorflow_internal import * File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in _pywrap_tensorflow_internal = swig_import_helper() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description) ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace above this error message when asking for help.

f90 commented 5 years ago

It looks like TF can't find the CUDA libraries. Are you planning to use a GPU?

If not, install tensorflow python package, not tensorflow-gpu. Then it should work.

If you want to use a GPU, you need to install the matching CUDA and CuDNN libraries. TF 1.8 needs CUDA 9. For CuDNN, I am using 7.3.1 at the moment and it's working. Follow the instructions on the TF website to install everything properly. Make sure that at the end, the LD_LIBRARY_PATH environment variable points to both the CUDA and the cuDNN libary paths.

ghost commented 5 years ago

I'll just use the CPU. I'll be patient.

I attempted to make it use the multi_instrument set by doing this:

python Predict.py with cfg.full_multi_instrument input_path="path here"

and it gave an error.

ghost commented 5 years ago

I edited the Predict.py file and changed it so it points at the multi_instrument folder and such, and it still used the vocal separation dataset... this is confusing. I just closed the issue by accident... oops

f90 commented 5 years ago

What error does it give you exactly if you execute

python Predict.py with cfg.full_multi_instrument input_path="path here"

? I can't help you if you are not more specific... Normally you don't need to modify the Predict.py file at all, by default it will use the pre-trained model, IF you have installed it correctly (so there should be a checkpoints folder in the repository folder, in it a baseline_stereo folder, and in there the checkpoint files), and save the output right to where the input audio is.

f90 commented 5 years ago

Is this issue resolved/obsolete by now? Please give me a short notice what the status is on this. If I don't hear back from you I will close the issue.

ghost commented 5 years ago

Ah, very sorry. I'll get back to you within an hour with the exact error (I have Peppermint OS - Ubuntu-based distro on a different HDD) and I haven't used that hard drive in a few days. I'll get back to you in any event!

ghost commented 5 years ago

python Predict.py with cfg.full_multi_instrument input_path="/home/user/Push.mp3" /usr/local/lib/python2.7/dist-packages/scikits/audiolab/soundio/play.py:48: UserWarning: Could not import alsa backend; most probably, you did not have alsa headers when building audiolab warnings.warn("Could not import alsa backend; most probably, " Training multi-instrument separation with best model WARNING - Waveunet Prediction - No observers have been added to this run INFO - Waveunet Prediction - Running command 'main' INFO - Waveunet Prediction - Started Producing source estimates for input mixture file /home/user/Push.mp3 Testing... Num of variables56 INFO:tensorflow:Restoring parameters from checkpoints/baseline_stereo/baseline_stereo-186093 INFO - tensorflow - Restoring parameters from checkpoints/baseline_stereo/baseline_stereo-186093 2018-11-14 07:58:28.578623: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key separator/conv1d_26/bias not found in checkpoint ERROR - Waveunet Prediction - Failed after 0:00:03! Traceback (most recent calls WITHOUT Sacred internals): File "Predict.py", line 17, in main Evaluate.produce_source_estimates(model_config, model_path, input_path, output_path) File "/home/user/Wave-U-Net-master/Evaluate.py", line 196, in produce_source_estimates sources_pred = predict(track, model_config, load_model) # Input track to prediction function, get source estimates File "/home/user/Wave-U-Net-master/Evaluate.py", line 70, in predict restorer.restore(sess, load_model) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1802, in restore {self.saver_def.filename_tensor_name: save_path}) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 900, in run run_metadata_ptr) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1135, in _run feed_dict_tensor, options, run_metadata) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1316, in _do_run run_metadata) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call raise type(e)(node_def, op, message) NotFoundError: Key separator/conv1d_26/bias not found in checkpoint [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op u'save/RestoreV2', defined at: File "Predict.py", line 14, in @ex.automain File "/usr/local/lib/python2.7/dist-packages/sacred/experiment.py", line 137, in automain self.run_commandline() File "/usr/local/lib/python2.7/dist-packages/sacred/experiment.py", line 260, in run_commandline return self.run(cmd_name, config_updates, named_configs, {}, args) File "/usr/local/lib/python2.7/dist-packages/sacred/experiment.py", line 209, in run run() File "/usr/local/lib/python2.7/dist-packages/sacred/run.py", line 221, in call self.result = self.main_function(args) File "/usr/local/lib/python2.7/dist-packages/sacred/config/captured_function.py", line 46, in captured_function result = wrapped(args, **kwargs) File "Predict.py", line 17, in main Evaluate.produce_source_estimates(model_config, model_path, input_path, output_path) File "/home/user/Wave-U-Net-master/Evaluate.py", line 196, in produce_source_estimates sources_pred = predict(track, model_config, load_model) # Input track to prediction function, get source estimates File "/home/user/Wave-U-Net-master/Evaluate.py", line 68, in predict restorer = tf.train.Saver(None, write_version=tf.train.SaverDef.V2) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1338, in init self.build() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1347, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1384, in _build build_save=build_save, build_restore=build_restore) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 835, in _build_internal restore_sequentially, reshape) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 472, in _AddRestoreOps restore_sequentially) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 886, in bulk_restore return io_ops.restore_v2(filename_tensor, names, slices, dtypes) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2 shape_and_slices=shape_and_slices, dtypes=dtypes, name=name) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3392, in create_op op_def=op_def) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1718, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Key separator/conv1d_26/bias not found in checkpoint [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

f90 commented 5 years ago
INFO:tensorflow:Restoring parameters from checkpoints/baseline_stereo/baseline_stereo-186093
INFO - tensorflow - Restoring parameters from checkpoints/baseline_stereo/baseline_stereo-186093

My mistake - I forgot that by default, the baseline_stereo model checkpoint file is loaded by the Predict.py function, so if you change the model configuration to multi_instrument, the checkpoint and the network configuration do not match. It tries to load the vocal separation model, but expects multi-instrument model weights. That's why you later get the error.

So in your case you should use

python Predict.py with cfg.full_multi_instrument input_path="/home/user/Push.mp3" model_path="checkpoints/full_multi_instrument/full_multi_instrument-134067"

assuming your model checkpoint file resides where it should be (within the checkpoints/full_multi_instrument subfolder), otherwise adapt accordingly.

I will update the README to make this clearer when using other pretrained models except the standard one.

ghost commented 5 years ago

This fixed it! Thanks a ton.