Closed dribnet closed 7 years ago
Ok, I just put several trained models in my public directory. You can wget them with
wget http://caligari.dartmouth.edu/~sgreydan/scribe/models/$FILENAME
where $FILENAME can be any of: [checkpoint model.ckpt-129000 model.ckpt-129000.meta model.ckpt-40000 model.ckpt-40000.meta model.ckpt-66500 model.ckpt-66500.meta]
Make sure you set model_checkpoint_path: "model.ckpt-XXXX" in the checkpoint file, where XXXX is the step number of the checkpoint you want to load.
I can’t remember which of these trained models is the best. Also, the "You know nothing Jon Snow" example was generated before I made some significant edits to the code (renaming scopes, etc.) so that trained model won't work. However, you should get fairly legible results with some of these.
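For anyone who would rather not edit the checkpoint file by hand, here is a minimal sketch of writing it from Python. It assumes the file only needs the two fields that appear in the checkpoint listing further down this thread (model_checkpoint_path and all_model_checkpoint_paths). The helper name write_checkpoint_index is made up for illustration and is not part of the scribe code.

```python
import os
import tempfile


def write_checkpoint_index(model_dir, checkpoint_name):
    """Write a `checkpoint` index file in model_dir that points at one checkpoint.

    Hypothetical helper: it only emits the two text fields quoted in this
    thread and does not check that the checkpoint data files exist.
    """
    index_path = os.path.join(model_dir, "checkpoint")
    lines = [
        'model_checkpoint_path: "{}"'.format(checkpoint_name),
        'all_model_checkpoint_paths: "{}"'.format(checkpoint_name),
    ]
    with open(index_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return index_path


if __name__ == "__main__":
    # Demo in a throwaway directory; point model_dir at your models folder instead.
    demo_dir = tempfile.mkdtemp()
    path = write_checkpoint_index(demo_dir, "model.ckpt-129000")
    with open(path) as f:
        print(f.read())
```

Swap in whichever of the step numbers listed above you downloaded.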
Thanks for uploading this. I was able to download all the files successfully and updated model_checkpoint_path in the checkpoint file as suggested. However, I get errors when loading the checkpoints:
attempt to load saved model...
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "cell1/LSTMCell/W_0" not found in checkpoint files fetched/model.ckpt-66500
[[Node: save/RestoreV2_12 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_12/tensor_names, save/RestoreV2_12/shape_and_slices)]]
no saved model to load. starting new session
My guess is that there is some subtle difference between my code or TensorFlow version and yours. But this is no longer an issue for me, as I'm now able to train my own model and sample it with reasonable results as shown. So feel free to close for now, or I'm happy to try other checkpoints if you'd like to debug further.
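One way to narrow down a "Tensor name ... not found in checkpoint" error is to compare the variable names the graph asks for with the names actually stored in the checkpoint. The sketch below only does the comparison on plain lists of names. The function name diff_variable_names and the checkpoint-side names in the demo are hypothetical. In TensorFlow 1.x the real names could probably be collected with tf.train.NewCheckpointReader(path).get_variable_to_shape_map() for the checkpoint and [v.op.name for v in tf.global_variables()] for the graph, but treat that as an assumption to check against your installed version.

```python
def diff_variable_names(graph_names, checkpoint_names):
    """Return (missing, unused) variable names as two sorted lists.

    missing: names the graph wants to restore but the checkpoint lacks.
    unused:  names stored in the checkpoint that the graph never asks for.
    """
    graph_set = set(graph_names)
    checkpoint_set = set(checkpoint_names)
    missing = sorted(graph_set - checkpoint_set)
    unused = sorted(checkpoint_set - graph_set)
    return missing, unused


if __name__ == "__main__":
    # The graph-side name comes from the error above; the checkpoint-side
    # names are invented placeholders, not the real checkpoint contents.
    graph_names = ["cell1/LSTMCell/W_0", "shared/example_var"]
    checkpoint_names = ["cell1/some_other_scope/W_0", "shared/example_var"]

    missing, unused = diff_variable_names(graph_names, checkpoint_names)
    print("missing from checkpoint:", missing)
    print("unused in checkpoint:", unused)
```

If the two lists differ only in a scope or cell name, that would point to a naming change between code or TensorFlow versions rather than a corrupt download.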
@dribnet I'm glad it's working. Again, I'm interested to see what you come up with. I'm currently experimenting with my dnc implementation (https://github.com/greydanus/dnc) to see if I can use it to generate handwriting.
It seems that it doesn't work on TensorFlow 1.0.0. @greydanus
It does now @mxzhao ; I just merged
@dribnet Can I have the model that you most recently trained? We know for sure that it works with the merged version. Just put it in your public folder and I can wget it. Thanks!
Sure - I now have both a TF 1.0 and 0.2 model trained locally. Let me dig up the 1.0 one and verify it is working, and I can try to make it available.
I found a model that works, but in the process discovered that I had pushed a handful of breaking changes to the branch that got merged in. Pull request #4 gets master back to a stable state and also has details on how to download and use the pre-trained model.
Sam, I've been trying to replicate your results after reading through your tutorial.
I can't seem to load the checkpoint file. I'm using python3 and tf 1.0 btw, and converted your files using 2to3 which does not seem to be the source of any error.
I downloaded the checkpoint files you provided:
Scotts-MBP:scribe slcott$
Scotts-MBP:scribe slcott$
Scotts-MBP:scribe slcott$ ls models
checkpoint model.ckpt-129000 train_scribe.txt
checkpoint.BACKUP model.ckpt-129000.meta
Scotts-MBP:scribe slcott$
Scotts-MBP:scribe slcott$ more models/checkpoint
model_checkpoint_path: "model.ckpt-129000"
all_model_checkpoint_paths: "model.ckpt-129000"
Scotts-MBP:scribe slcott$
Scotts-MBP:scribe slcott$
The .restore() call is the initial source of the error.
try:
    self.saver.restore(self.sess, './models/model.ckpt-129000')
except Exception as e:
    print(e)
    self.logger.write("no saved model to load. starting new session")
    load_was_success = False
Do you see what might be the root cause of my error?
Tensor name "cell0/lstm_cell/biases" not found in checkpoint files ./models/model.ckpt-129000
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]
Caused by op 'save/RestoreV2_3', defined at:
File "run.py", line 164, in <module>
main()
File "run.py", line 65, in main
train_model(args) if args.train else sample_model(args)
File "run.py", line 136, in sample_model
model = Model(args, logger)
File "/Users/slcott/Desktop/scribe/model.py", line 213, in __init__
self.saver = tf.train.Saver(tf.global_variables())
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1051, in __init__
self.build()
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1081, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 675, in build
restore_sequentially, reshape)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 402, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 242, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 668, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
self._traceback = _extract_stack()
NotFoundError (see above for traceback): Tensor name "cell0/lstm_cell/biases" not found in checkpoint files ./models/model.ckpt-129000
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]
no saved model to load. starting new session
load failed, sampling canceled
^CTraceback (most recent call last):
File "run.py", line 164, in <module>
main()
File "run.py", line 65, in main
train_model(args) if args.train else sample_model(args)
File "run.py", line 160, in sample_model
time.sleep(args.sleep_time)
KeyboardInterrupt
Scotts-MBP:scribe slcott$
Thanks for the trained TF 1.0 model @dribnet. @slcott I just added a "Getting started" section to the README which includes instructions for downloading a pretrained model.
@mxzhao latest commit is now fully compatible with tf 1.0 and includes a pretrained model
Looking for the pretrained model, as the sampling notebook says to "Look on this project's Github page for instructions on how to download a pretrained model." Is this available somewhere? I'd like to try to replicate your results on sampling. I've trained my own model, but my results thus far are lackluster:
I'm guessing that I might need to train with different hyperparameters (I used the run.py defaults), and having a model that worked would help me figure out what's going on.