karamarieliu / gst_tacotron2_wavenet


local variable 'checkpoint' referenced before assignment #2

Open camjac251 opened 4 years ago

camjac251 commented 4 years ago

I was training with the LJSpeech dataset and the default training parameters. When it reached 100k steps, it printed:

Loading checkpoint: logs-Tacotron-2\taco_pretrained/tacotron_model.ckpt-100000
Loaded metadata for 13100 examples (24.06 hours)
starting synthesis
  0%|▏                                                                            | 29/13100 [00:38<4:45:34,  1.31s/it]
Generated 32 train batches of size 48 in 62.968 sec
  0%|▏                                                                            | 30/13100 [00:43<9:06:18,  2.51s/it]Exception in thread background:
Traceback (most recent call last):
  File "C:\Users\camja\Anaconda3\envs\taco\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\camja\Anaconda3\envs\taco\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\camja\Desktop\gst_tacotron2_wavenet\tacotron\feeder.py", line 166, in _enqueue_next_train_group
    self._session.run(self._enqueue_op, feed_dict=feed_dict)
  File "C:\Users\camja\Anaconda3\envs\taco\lib\site-packages\tensorflow\python\client\session.py", line 895, in run
    run_metadata_ptr)
  File "C:\Users\camja\Anaconda3\envs\taco\lib\site-packages\tensorflow\python\client\session.py", line 1053, in _run
    raise RuntimeError('Attempted to use a closed Session.')
RuntimeError: Attempted to use a closed Session.

However, it had already progressed this far:

 67%|████████████████████████████████████████████████▉                        | 8787/13100 [2:31:59<1:06:18,  1.08it/s]T

I had assumed it was just loading the files again, but I was wrong: it was actually generating the GTA files in tacotron_output, and I foolishly Ctrl+C'd out, thinking it was something else.

Is it possible to recover my 100k-step model, or must I start over?

If I try to resume, I get this:

python train.py --model='Tacotron-2' --restore=True --tacotron_train_steps=250000 --wavenet_train_steps=250000
Using TensorFlow backend.

#############################################################

Tacotron GTA Synthesis

###########################################################

Traceback (most recent call last):
  File "train.py", line 127, in <module>
    main()
  File "train.py", line 121, in main
    train(args, log_dir, hparams)
  File "train.py", line 64, in train
    input_path = tacotron_synthesize(args, hparams, checkpoint)
UnboundLocalError: local variable 'checkpoint' referenced before assignment
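For reference, an `UnboundLocalError` like this typically means `checkpoint` is only assigned on one branch of `train()`, e.g. inside a block that is skipped or fails when `--restore` is used but the checkpoint state is missing or unexpected. The sketch below is a hypothetical minimal reproduction and the usual fix (binding the name up front and failing with a clear error), not the actual `train.py` source:

```python
# Hypothetical sketch of the failure mode (not the actual train.py code).
# If the assignment only happens on the success path, a later use of
# `checkpoint` raises UnboundLocalError when that path is skipped.

def broken_lookup(ckpt_state):
    if ckpt_state is not None:   # assignment happens only on this branch
        checkpoint = ckpt_state
    return checkpoint            # UnboundLocalError when ckpt_state is None


def fixed_lookup(ckpt_state):
    checkpoint = None            # bind the name unconditionally
    if ckpt_state is not None:
        checkpoint = ckpt_state
    if checkpoint is None:
        # Fail loudly instead of crashing with UnboundLocalError later.
        raise RuntimeError('No checkpoint found; cannot run GTA synthesis')
    return checkpoint
```

With a guard like this, a broken or missing checkpoint produces an actionable error message instead of the confusing `UnboundLocalError`.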
camjac251 commented 4 years ago

I had a previous backup of the state at 79,000 iterations and was able to load it successfully. I'll be sure not to cancel it this time around.

Is it possible to gracefully stop training?