Rudrabha / Lip2Wav

This is the repository containing code for our CVPR 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"
MIT License

Error while training the model on any data #28

Closed noamkatz closed 3 years ago

noamkatz commented 3 years ago

Hello all, I'm trying to get the model to train on the same data provided by the authors. I was able to reproduce the results using the pre-trained weights, and everything worked fine. I am using Windows and running TensorFlow on GPU. The output I am getting is as follows:

Exception in thread background:
Traceback (most recent call last):
  File "C:\Users\admin\Documents\temp\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\Users\admin\Documents\temp\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\admin\Desktop\Lip2Wav-master\Lip2Wav-master\synthesizer\feeder.py", line 139, in _enqueue_next_train_group
    examples = [self._get_next_example() for i in range(n * _batches_per_group)]
  File "C:\Users\admin\Desktop\Lip2Wav-master\Lip2Wav-master\synthesizer\feeder.py", line 139, in <listcomp>
    examples = [self._get_next_example() for i in range(n * _batches_per_group)]
  File "C:\Users\admin\Desktop\Lip2Wav-master\Lip2Wav-master\synthesizer\feeder.py", line 194, in _get_next_example
    input_data, mel_target = self.getitem()
  File "C:\Users\admin\Desktop\Lip2Wav-master\Lip2Wav-master\synthesizer\feeder.py", line 172, in getitem
    mel = np.load(os.path.join(os.path.dirname(img_name), 'mels.npz'))['spec'].T
  File "C:\Users\admin\Documents\temp\lib\site-packages\numpy\lib\npyio.py", line 416, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'Dataset/chem//preprocessed/46\mels.npz'

The file names are numbers, so the "46" points to the directory that holds all the images along with the original .wav file. Please help! I have been stuck on this issue for a week now :(

prajwalkr commented 3 years ago

Dataset/chem//preprocessed/46\mels.npz

The slashes are in the wrong direction for Windows.
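To rule the separators in or out, the path from the error message can be normalized before it is opened. The snippet below is a minimal sketch, not code from the repository: it uses the standard-library `ntpath` module (the Windows flavour of `os.path`) so that it behaves the same on any OS, and the variable names are only illustrative. Windows usually accepts mixed separators, so if the normalized path still cannot be opened, the file itself is most likely missing.

```python
import ntpath

# Path copied from the FileNotFoundError message above.
raw_path = "Dataset/chem//preprocessed/46\\mels.npz"

# ntpath.normpath converts forward slashes to backslashes and
# collapses the duplicated separator after "chem".
normalized = ntpath.normpath(raw_path)

print(normalized)
```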

noamkatz commented 3 years ago

Hi! Thank you so much for your response! Unfortunately, that was not the problem. I changed the slashes and still could not make it work. Furthermore, up until this point I did not run into any issue with the slashes while running inference with the network. What I seem to be missing is the point in the code that CREATES the .npz files. They are not created at all by merely running the train.py command, so the issue should be somewhere in creating the .npz and, later on, the .npy files, right? Thanks for helping me out!

prajwalkr commented 3 years ago

Have you preprocessed your data? https://github.com/Rudrabha/Lip2Wav#preprocessing-the-dataset

noamkatz commented 3 years ago

Sure, I get one directory named "preprocessed" with all the images in JPG format and one .wav file named "audio", all organized into directories for 30-second segments of the original video. If I run inference on that, there is no problem, so I believed this meant that I had done the preprocessing stage correctly. Am I supposed to get an .npz or .npy from preprocess.py?

prajwalkr commented 3 years ago

Am I supposed to get an .npz or .npy from preprocess.py?

Yes.
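A quick way to confirm whether preprocessing produced the spectrogram files is to list the segment directories that lack `mels.npz`. This is only a sketch under assumptions: the layout `<preprocessed>/<segment>/mels.npz` is inferred from the path in the traceback, and the helper name `find_dirs_missing_mels` is hypothetical, not part of Lip2Wav.

```python
from pathlib import Path


def find_dirs_missing_mels(preprocessed_root, mel_name="mels.npz"):
    """Return the segment directories under preprocessed_root that lack mel_name.

    Assumes one sub-directory per segment (e.g. ".../preprocessed/46"),
    as suggested by the path in the error message.
    """
    root = Path(preprocessed_root)
    missing = []
    for segment_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # feeder.py loads this file with np.load, so it must exist before training.
        if not (segment_dir / mel_name).is_file():
            missing.append(segment_dir)
    return missing
```

Calling `find_dirs_missing_mels("Dataset/chem/preprocessed")` and getting back a non-empty list would indicate that the preprocessing step did not finish for those segments and needs to be rerun before train.py.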

noamkatz commented 3 years ago

Got it! Works perfectly, thanks!