Saver not working - Githubissues

jiegzhan / multi-class-text-classification-cnn-rnn

Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on Tensorflow.

https://www.kaggle.com/c/sf-crime/data

Apache License 2.0

599 stars, 262 forks

Saver not working #20

Closed. hkhatod closed this issue 7 years ago.

hkhatod commented 7 years ago

I believe line 151 of train.py, os.rename(path, trained_dir + 'best_model.ckpt'), needs to be updated for TensorFlow 1.2. The path variable seems to be missing the file extension, but I'm not sure. Is there any other way to fix it?

Also, lines 109, 110, and 111 of predict.py need to be updated as well:

        checkpoint_file = trained_dir + 'best_model.ckpt'
        saver = tf.train.Saver(tf.all_variables())
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file[:-5]))

hkhatod commented 7 years ago

Here is the error message I get from train.py:

Traceback (most recent call last):
  File "train.py", line 161, in <module>
    train_cnn_rnn()
  File "train.py", line 151, in train_cnn_rnn
    os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints_1499389616/model-2600' -> './trained_results_1499389616/best_model.ckpt'
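
The traceback is consistent with how the newer (V2) checkpoint format behaves: the saver is given a prefix such as model-2600, and the data typically lands in several files sharing that prefix (for example .index, .meta, and .data-*), so no file named exactly model-2600 exists for os.rename to move. A minimal sketch of moving every file that shares the prefix, assuming that layout; move_checkpoint is a hypothetical helper, not code from the repository:

```python
import glob
import os
import shutil


def move_checkpoint(prefix, dest_prefix):
    """Move every file named '<prefix>.<suffix>' to '<dest_prefix>.<suffix>'.

    Hypothetical helper: assumes the saver wrote files such as
    model-2600.index, model-2600.meta and model-2600.data-00000-of-00001.
    """
    base = os.path.basename(prefix)
    moved = []
    # glob.escape keeps characters like '[' in directory names literal.
    for src in sorted(glob.glob(glob.escape(prefix) + ".*")):
        suffix = os.path.basename(src)[len(base):]  # e.g. ".index"
        dest = dest_prefix + suffix
        os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
        shutil.move(src, dest)
        moved.append(dest)
    if not moved:
        raise FileNotFoundError("no checkpoint files found for prefix " + prefix)
    return moved
```

Under these assumptions, the failing call would become something like move_checkpoint(path, trained_dir + 'best_model'), and the model would later be restored from the prefix trained_dir + 'best_model' rather than from a single .ckpt file.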

dpinthinker commented 7 years ago

I got the same problem, too.

abbail commented 7 years ago

Same issue here. Does anyone have any fresh ideas? Another issue (mentioned below) suggested changing the saver to use the V1 format, but that didn't seem to help me:

saver = tf.train.Saver(tf.all_variables(), write_version=tf.train.SaverDef.V1)

hkhatod commented 7 years ago

I got it to work. You need to do two things:

Step 1: Update the files to the newer TensorFlow API using these instructions: https://www.tensorflow.org/install/migration

Step 2: I used the "last_checkpoints" restore method as described here: https://www.tensorflow.org/api_docs/python/tf/train/Saver

That seems to have done the trick for me.
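
For readers following Step 2, here is a rough sketch of restoring from the most recent checkpoint instead of from a renamed file. It assumes TensorFlow 1.x, where tf.train.latest_checkpoint, tf.train.import_meta_graph, and Saver.restore are available. The names restore_latest and meta_path_for are hypothetical, and this is not necessarily the exact change made to the repository.

```python
def meta_path_for(checkpoint_prefix):
    """Return the graph-definition file that sits next to a checkpoint prefix."""
    return checkpoint_prefix + ".meta"


def restore_latest(checkpoint_dir):
    """Rebuild the saved graph and load the newest weights from checkpoint_dir.

    Assumes TensorFlow 1.x; imported lazily so meta_path_for works without it.
    """
    import tensorflow as tf

    # Reads the 'checkpoint' state file and returns a prefix such as
    # './trained_results/model-2200', or None if nothing was saved.
    prefix = tf.train.latest_checkpoint(checkpoint_dir)
    if prefix is None:
        raise FileNotFoundError("no checkpoint found in " + checkpoint_dir)

    graph = tf.Graph()
    with graph.as_default():
        sess = tf.Session()
        # Recreate the graph from the .meta file, then load the variables.
        saver = tf.train.import_meta_graph(meta_path_for(prefix))
        saver.restore(sess, prefix)
    return sess, graph
```

The point of the design is that the restore path is the checkpoint prefix, not a single file on disk, which sidesteps the missing-extension question raised at the top of the thread.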

abbail commented 7 years ago

Mind posting the source somewhere now that you've made these changes? tf_upgrade.py isn't being kind to me, so I can't run the upgrade myself: ImportError: No module named 'tensorflow.tools'

hkhatod commented 7 years ago

Try downloading the files from this repo and placing them in the same folder as your files.

That should hopefully work:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/compatibility

hkhatod commented 7 years ago

Forked.

Here is the updated repo that works with TensorFlow 1.2: https://github.com/hkhatod/multi-class-text-classification-cnn-rnn

One caveat. Right now I have hardcoded line 111 in predict.py:

        checkpoint_file = trained_dir + 'model-2200.meta'

Replace it with the last model name listed in the checkpoint file that is generated once training is done.
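
One way to avoid the hardcoded name is to read it from that checkpoint file, which is a small text file. A minimal sketch, assuming the file is named checkpoint, lives in trained_dir, and contains a line of the form model_checkpoint_path: "model-2200"; latest_model_name is a hypothetical helper, not part of the fork:

```python
import os
import re


def latest_model_name(trained_dir):
    """Return the newest checkpoint name recorded in <trained_dir>/checkpoint.

    Assumes the state file contains a line such as
    model_checkpoint_path: "model-2200"
    """
    state_file = os.path.join(trained_dir, "checkpoint")
    with open(state_file) as handle:
        for line in handle:
            # Lines starting with all_model_checkpoint_paths do not match here.
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                return match.group(1)
    raise ValueError("no model_checkpoint_path entry in " + state_file)
```

With this helper, line 111 could be written as checkpoint_file = os.path.join(trained_dir, latest_model_name(trained_dir) + '.meta'), so the name follows whatever the last training run produced.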

abbail commented 7 years ago

Thanks, it was still very picky about the training sample size I used, but I got it working with your code. Helped a ton!

dpinthinker commented 7 years ago

@hkhatod Thanks, it helps a lot. But it still has some problems.

dpinthinker commented 7 years ago

@hkhatod I have just fixed two problems: saving the training parameters in train.py and loading the model in predict.py. Here is the pull request: https://github.com/hkhatod/multi-class-text-classification-cnn-rnn/pull/1

My repository: https://github.com/HarryHa/multi-class-text-classification-cnn-rnn

saja1994 commented 6 years ago

I have the same problem when I use my own dataset! Please, has anyone solved this issue?