HsuWanTing / unified-summarization

Official code for the paper: A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss.
https://hsuwanting.github.io/unified_summ/
MIT License

Cannot restore pretrained model #3

Open sanjass opened 6 years ago

sanjass commented 6 years ago

Thank you for the great work!

I'm trying to use the pretrained model to decode my own preprocessed data (I do not need to evaluate it). However, I'm getting this exception when running main.py:

INFO:tensorflow:Loading checkpoint ./end2end/train/bestmodel-51000
INFO:tensorflow:Restoring parameters from ./end2end/train/bestmodel-51000
INFO:tensorflow:Failed to load checkpoint from ./end2end/train. Sleeping for 10 secs...

I figured this happens in util.py in load_ckpt(), when it gets to saver.restore(sess, "./end2end/train/bestmodel-51000")

I'm running main.py in the following way: python2 main.py --mode eval --model end2end --vocab_path ./data/finished_files/vocab --data_path ./data/pgn/output/finished_files/chunked/test_* --decode_method greedy --eval_method loss --log_root . --pretrained_selector_path ./end2end/train/extractor_model --pretrained_rewriter_path ./end2end/train/abstracter_model --single_pass 1

So, my parameters are: {'adagrad_init_acc': 0.1, 'batch_size': 16, 'beam_size': 4, 'convert_to_coverage_model': False, 'cov_loss_wt': 1.0, 'coverage': False, 'data_path': './data/pgn/output/finished_files/chunked/test_000.bin', 'decode_method': 'greedy', 'emb_dim': 128, 'eval_ckpt_path': '', 'eval_gt_rouge': False, 'eval_method': 'loss', 'exp_name': '', 'hidden_dim_rewriter': 256, 'hidden_dim_selector': 200, 'inconsistent_loss': True, 'inconsistent_topk': 3, 'load_best_eval_model': False, 'log_root': '.', 'lr': 0.15, 'max_art_len': 50, 'max_dec_steps': 100, 'max_enc_steps': 600, 'max_grad_norm': 2.0, 'max_select_sent': 20, 'max_sent_len': 50, 'max_train_iter': 10000, 'min_dec_steps': 35, 'min_select_sent': 5, 'mode': 'eval', 'model': 'end2end', 'model_max_to_keep': 5, 'pretrained_rewriter_path': './end2end/train/abstracter_model', 'pretrained_selector_path': './end2end/train/extractor_model', 'rand_unif_init_mag': 0.02, 'save_model_every': 1000, 'save_pkl': False, 'save_vis': False, 'select_method': 'prob', 'selector_loss_wt': 5.0, 'single_pass': True, 'start_eval_rouge': 30000, 'thres': 0.4, 'trunc_norm_init_std': 0.0001, 'vocab_path': './data/finished_files/vocab', 'vocab_size': 50000}

I'm using TensorFlow 1.1.0 and Python 2.7.

HsuWanTing commented 6 years ago

Hi,

You should use the "evalall" mode for decoding. "eval" mode runs evaluation during training, so it always looks for the latest trained model. "evalall" mode loads the pretrained model if you set the correct path. If you want decoded results from the pretrained unified model, use "evalall" mode, download our pretrained unified model, and set "eval_ckpt_path" to it.

sanjass commented 6 years ago

Thank you for the prompt response! I followed your suggestion but unfortunately I'm still getting the same exception. I'm running main.py as follows:

python2 main.py --mode evalall --model end2end --vocab_path ./data/finished_files/vocab --data_path ./data/pgn/output/finished_files/chunked/test_* --decode_method greedy --eval_method loss --single_pass 1 --eval_ckpt_path ./end2end/train/unified_model/bestmodel-51000

and again I'm getting the following:

INFO:tensorflow:Loading checkpoint ./end2end/train/unified_model/bestmodel-51000
INFO:tensorflow:Restoring parameters from ./end2end/train/unified_model/bestmodel-51000
INFO:tensorflow:Failed to load checkpoint from end2end/train. Sleeping for 10 secs...

Any idea why this might be happening? Thanks!

ZhikunWei commented 5 years ago

Did HsuWanTing forget to include the checkpoint file? I encounter this problem too. I believe there should be a checkpoint state file alongside the model files, which the author didn't upload for us.
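As a quick sanity check (a minimal sketch, not part of this repo), you can list what actually exists next to the checkpoint prefix: a TensorFlow V2 checkpoint such as `bestmodel-51000` is not a single file — `saver.restore` needs at least the `.index` file and one `.data-*` shard, and helpers like `tf.train.latest_checkpoint` additionally read a plain-text `checkpoint` state file in the same directory. The function name here is hypothetical:

```python
import os

def checkpoint_files_present(ckpt_prefix):
    """Return (companion files found for a checkpoint prefix, whether the
    directory also contains the plain-text 'checkpoint' state file).

    If the returned list is missing '<prefix>.index' or a
    '<prefix>.data-*' shard, saver.restore(sess, ckpt_prefix) will fail.
    """
    directory = os.path.dirname(ckpt_prefix) or "."
    prefix = os.path.basename(ckpt_prefix)
    found = sorted(f for f in os.listdir(directory)
                   if f == prefix or f.startswith(prefix + "."))
    has_state_file = os.path.exists(os.path.join(directory, "checkpoint"))
    return found, has_state_file
```

Running this on `./end2end/train/bestmodel-51000` shows at a glance whether the download is complete or whether the `checkpoint` state file really is missing.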

cuthbertjohnkarawa commented 4 years ago

> Did HsuWanTing forget to give the checkpoint file? I counter this problem too. I believe there should be a checkpoint file along with the model files, which the author didn't upload for us

True.