clab / dynet

DyNet: The Dynamic Neural Network Toolkit

Errors in saving trained models #1368

Open talbaumel opened 6 years ago

talbaumel commented 6 years ago

Hi, when running

cbow = CBOWClassifier()
trainer = dy.AdagradTrainer(cbow.model)  # where model is a parameter collection
loss = cbow.get_loss(train_set[0], True)
loss_value = loss.value()
loss.backward()
trainer.update()
cbow.model.save("best_tmp.model")

cbow = CBOWClassifier()
cbow.model.populate("best_tmp.model")

The output file is unreadable and the load fails, while commenting out the trainer.update() line makes the save/load work:

cbow = CBOWClassifier()
trainer = dy.AdagradTrainer(cbow.model)
loss = cbow.get_loss(train_set[0], True)
loss_value = loss.value()
loss.backward()
#trainer.update()
cbow.model.save("best_tmp.model")

cbow = CBOWClassifier()
cbow.model.populate("best_tmp.model")

I'm using DyNet 2.0.3 through a Jupyter notebook. There are no error messages; the kernel just disconnects and reconnects after a few minutes(!?)
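
(For reference, CBOWClassifier itself isn't shown here; a rough stand-in that would make the snippet above runnable might look like the sketch below. The class body, dimensions, and get_loss signature are placeholders, not the actual class from my code.)

import dynet as dy

class CBOWClassifier:
    # placeholder: a continuous-bag-of-words classifier over word embeddings
    def __init__(self, vocab_size=1000, emb_dim=50, n_classes=2):
        self.model = dy.ParameterCollection()   # holds all trainable parameters
        self.emb = self.model.add_lookup_parameters((vocab_size, emb_dim))
        self.W = self.model.add_parameters((n_classes, emb_dim))
        self.b = self.model.add_parameters((n_classes,))

    def get_loss(self, example, train=True):
        words, label = example                     # e.g. ([3, 17, 42], 1)
        dy.renew_cg()
        h = dy.esum([self.emb[w] for w in words])  # CBOW: sum the word embeddings
        scores = dy.parameter(self.W) * h + dy.parameter(self.b)
        return dy.pickneglogsoftmax(scores, label)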

neubig commented 6 years ago

Thanks for the report! Could you create a self-contained example? I wasn't able to reproduce this immediately.

talbaumel commented 6 years ago

Here is a link to my code and data that can't load the model it saves: https://www.dropbox.com/s/46btzulw32nl1ev/save_error.zip?dl=0
Before running, you should change the PATH variable in save_error.py/ipynb to the path where the uncompressed files are.

Sorry for sending so much code!!

Thank you!

talbaumel commented 6 years ago

I managed to single out the problem. When the model uses the following class:

class PreTrainedEmbeddings:
    def __init__(self, model, w2varray):
        emb_size = len(w2varray[0])
        vocab_size = len(w2varray)
        self.embeddings = model.add_lookup_parameters((vocab_size, emb_size))
        self.embeddings.init_from_array(w2varray)
        self.embeddings.set_updated(False)  # keep the pre-trained vectors fixed

    def __call__(self, sent):
        return [self.embeddings[word] for word in sent]

it can't be loaded after being trained and saved.
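
Roughly, the failing pattern with this class boils down to the sketch below (toy dimensions, random vectors, and the file name are placeholders for what the linked code actually does):

import numpy as np
import dynet as dy

w2v = np.random.rand(10, 5)              # 10 "words", 5-dim pre-trained vectors

model = dy.ParameterCollection()
emb = PreTrainedEmbeddings(model, w2v)
W = model.add_parameters((1, 5))
trainer = dy.AdagradTrainer(model)

dy.renew_cg()
vecs = emb([0, 1, 2])                    # look up a few word vectors
loss = dy.squared_norm(dy.parameter(W) * dy.esum(vecs))
loss.value()
loss.backward()
trainer.update()                         # after this update the saved file becomes unreadable

model.save("best_tmp.model")

model2 = dy.ParameterCollection()
emb2 = PreTrainedEmbeddings(model2, w2v)
W2 = model2.add_parameters((1, 5))
model2.populate("best_tmp.model")        # this load fails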

pmichel31415 commented 6 years ago

Hmm, that's weird. Can you give us the output of

grep "Parameter" [model_file]

on the model file?

talbaumel commented 6 years ago

Sure, here it is:

#Parameter# /_0 {2,50} 1601 ZERO_GRAD
#Parameter# /_1 {2} 33 ZERO_GRAD
#Parameter# /_5 {50,300} 240001 ZERO_GRAD
#Parameter# /_6 {50} 801 ZERO_GRAD
#Parameter# /_7 {50,100} 80001 ZERO_GRAD
#Parameter# /_8 {50} 801 ZERO_GRAD
#LookupParameter# /_2 {50,18440} 14752001 ZERO_GRAD
#LookupParameter# /_3 {300,18440} 177024002 FULL_GRAD
#LookupParameter# /_4 {50,18440} 14752001 ZERO_GRAD

pmichel31415 commented 6 years ago

OK, I think I might know what the problem is. Can you try to change your code to:

class PreTrainedEmbeddings:
    def __init__(self, model, w2varray):
        emb_size = len(w2varray[0])
        vocab_size = len(w2varray)
        self.embeddings = model.add_lookup_parameters((vocab_size, emb_size))
        self.embeddings.init_from_array(w2varray)

    def __call__(self, sent):
        # block the gradient here instead of marking the parameter as not updated
        return [dy.nobackprop(self.embeddings[word]) for word in sent]

i.e. remove the set_updated line and add dy.nobackprop in __call__ instead. I think the problem is with how the gradients are stored (in which case it's a bug).
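
As a sanity check, the same round trip as in your original snippets (same placeholder names) should then go through even after an update:

cbow = CBOWClassifier()
trainer = dy.AdagradTrainer(cbow.model)
loss = cbow.get_loss(train_set[0], True)
loss.value()
loss.backward()
trainer.update()
cbow.model.save("best_tmp.model")        # save after an update

cbow2 = CBOWClassifier()
cbow2.model.populate("best_tmp.model")   # should now load cleanly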

talbaumel commented 6 years ago

🎉 working! I guess it's a bug :/