rpryzant / delete_retrieve_generate

PyTorch implementation of the Delete, Retrieve Generate style transfer algorithm
MIT License

I replaced "delete" with "seq2seq" in the config yelp_config.json, and now I get RuntimeError: Error(s) in loading state_dict for SeqModel. Can you help me? #32

Closed wasedaward closed 3 years ago

wasedaward commented 3 years ago
@exp16:~/delete_retrieve_generate$ python3 train.py --config yelp_config.json --bleu
2021-06-10 09:58:27,484 - INFO - Reading data ...
2021-06-10 09:58:45,074 - INFO - ...done!
/home/Ren/anaconda3/envs/drg/lib/python3.6/site-packages/torch/nn/modules/rnn.py:54: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
2021-06-10 09:58:46,409 - INFO - MODEL HAS 9050117 params
Traceback (most recent call last):
  File "train.py", line 114, in <module>
    checkpoint_dir=working_dir)
  File "/home/Ren/delete_retrieve_generate/src/models.py", line 38, in attempt_load_model
    model.load_state_dict(torch.load(checkpoint_path))
  File "/home/Ren/anaconda3/envs/drg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SeqModel:
        Unexpected key(s) in state_dict: "attribute_embedding.weight". 
        size mismatch for c_bridge.weight: copying a param with shape torch.Size([512, 640]) from checkpoint, the shape in current model is torch.Size([512, 512]).
        size mismatch for h_bridge.weight: copying a param with shape torch.Size([512, 640]) from checkpoint, the shape in current model is torch.Size([512, 512]).
wasedaward commented 3 years ago

Here is my yelp_config.json.

{
  "training": {
    "optimizer": "adam",
    "learning_rate": 0.0003,
    "max_norm": 3.0,
    "epochs": 70,
    "batches_per_report": 200,
    "batches_per_sampling": 500,
    "random_seed": 1
  },
  "data": {
    "src": "data/yelp/sentiment.train.0",
    "tgt": "data/yelp/sentiment.train.1",
    "src_test": "data/yelp/reference.test.0",
    "tgt_test": "data/yelp/reference.test.1",
    "src_vocab": "data/yelp/vocab",
    "tgt_vocab": "data/yelp/vocab",
    "share_vocab": true,
    "attribute_vocab": "data/yelp/ngram.15.attribute",
    "ngram_attributes": true,
    "batch_size": 256,
    "max_len": 50,
    "working_dir": "working_dir"
  },
    "model": {
        "model_type": "seq2seq",
        "emb_dim": 128,
        "attention": false,
        "encoder": "lstm",
        "src_hidden_dim": 512,
        "src_layers": 1,
        "bidirectional": true,
        "tgt_hidden_dim": 512,
        "tgt_layers": 1,
        "decode": "greedy",
        "dropout": 0.2
    }
}
wasedaward commented 3 years ago

This is somewhat similar to issue #29, so maybe I didn't fully solve that problem.

rpryzant commented 3 years ago

Hmm I was unable to reproduce this error using your config. How is #27 related? Perhaps your working directory (working_dir) already has saved models with a different configuration?
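
(For anyone landing here with the same error: the shapes in the traceback are consistent with exactly that. A checkpoint saved by a "delete" run carries an attribute_embedding.weight and 640-wide bridge weights, 512 hidden + 128 attribute embedding, while a plain "seq2seq" model builds 512-wide bridges, so load_state_dict fails. A minimal diagnostic sketch, not part of the repo, where the checkpoint path is a placeholder for whatever file train.py left in working_dir:

import torch

# Placeholder path -- point this at the checkpoint train.py saved in working_dir.
ckpt = torch.load("working_dir/model.ckpt", map_location="cpu")

# List every parameter and its shape to see which model_type wrote the file.
for name, tensor in ckpt.items():
    print(name, tuple(tensor.shape))

# A "delete"-style checkpoint will show attribute_embedding.weight and
# c_bridge.weight / h_bridge.weight of shape (512, 640); a plain seq2seq
# checkpoint has no attribute embedding and 512-wide bridges.

If the printed keys don't match the current config's model_type, the checkpoint came from an earlier run and needs to be removed or moved before retraining.)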

wasedaward commented 3 years ago

> Hmm I was unable to reproduce this error using your config. How is #27 related? Perhaps your working directory (working_dir) already has saved models with a different configuration?

You are totally right! After I deleted the previous configuration's models, it works now. Thank you very much!
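
(For completeness, the fix is just making sure train.py cannot pick up the old checkpoints: either delete them from working_dir or point "working_dir" in the config at a fresh directory. A throwaway sketch of the reset, equivalent to what was done manually here; the backup directory name is arbitrary:

import os
import shutil

# Move the old "delete"-run outputs aside and recreate an empty working_dir
# so the seq2seq run starts fresh instead of trying to resume from them.
if os.path.isdir("working_dir"):
    shutil.move("working_dir", "working_dir.delete_run_backup")
os.makedirs("working_dir", exist_ok=True)

Alternatively, set "working_dir" in yelp_config.json to a different path per configuration so runs with different model_types never share checkpoints.)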