some questions for training model

udibr / headlines

Automatically generate headlines to short articles

MIT License


some questions for training model #5

Open UB1010 opened 8 years ago

UB1010 commented 8 years ago

Hi udibr, I ran your code on Reuters news data. vocabulary-embedding.ipynb runs without problems. train.ipynb also runs, but with a warning, and training is very slow. I have some questions:

1. How do I read the file train.history.pkl? I want to check the training history.
2. I ran predict.ipynb with a model trained for only 5 iterations and got an error message when loading the model: "failed to find layer timedistributed_1 in model". Why does this message appear?
3. The predictions from the model trained for 5 iterations are not good. How many iterations are needed for acceptable results? I have a little over 3K news articles and am training on CPU, so 500 iterations is too slow.

Thank you.

udibr commented 8 years ago

1. Just check the code that wrote the history (the last cell of the train notebook).
2. This message is OK. I decided not to load the entire model, only the part below the top dense layer, which is why you see this message. However, the weights are returned from the load function and I later use them directly.
3. I think 3K examples is not enough. Try to get at least two orders of magnitude more data. I guess it's time to move to using a GPU...
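
On the first question, here is a minimal sketch of reading a history file back. It assumes the file was written with `pickle.dump` and holds a dict mapping metric names to per-epoch lists; both are assumptions, so check the last cell of the train notebook for the actual structure. The helper names `load_history` and `summarize` and the demo values are made up for illustration.

```python
import os
import pickle
import tempfile


def load_history(path):
    """Load a pickled training history from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(history):
    """Return (metric name, number of epochs, last value) per metric.

    Assumes `history` is a dict of metric name -> list of per-epoch values.
    """
    return [(name, len(values), values[-1])
            for name, values in sorted(history.items())]


# Demo with a made-up history written to a temporary file, so the
# sketch runs without the real train.history.pkl.
demo_history = {"loss": [7.1, 6.4, 5.9], "val_loss": [7.3, 6.8, 6.5]}
demo_path = os.path.join(tempfile.mkdtemp(), "train.history.pkl")
with open(demo_path, "wb") as f:
    pickle.dump(demo_history, f)

loaded = load_history(demo_path)
for name, n_epochs, last in summarize(loaded):
    print(f"{name}: {n_epochs} epochs, last value {last}")
```

If the file was written under Python 2 and is read under Python 3, `pickle.load` may need `encoding="latin1"`.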

hipoglucido commented 8 years ago

Hi @udibr , I am also running the train notebook and I have one question regarding the line:

if FN1:
    model.load_weights('data/train.hdf5')

If it's the first time that I am about to train the model, should model.load_weights('data/train.hdf5') be executed? If so, when is that weights file generated? The file does not exist at that path on my machine.

Thanks a lot

udibr commented 8 years ago

There was a small bug in the code, which I have now fixed. In any case, on the first run FN1 should be None and no weights should be loaded.
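
A rough sketch of that guard, for illustration only. The helper `maybe_load_weights` and the stand-in `DummyModel` are hypothetical and not part of the notebook; the extra file-existence check is also an assumption added here, not something the notebook is known to do.

```python
import os


def maybe_load_weights(model, fn1, weights_path="data/train.hdf5"):
    """Load weights only when FN1 is set and the file exists.

    Returns True if weights were loaded, False otherwise.
    """
    if not fn1:
        # First run: FN1 is None, so there is nothing to load.
        return False
    if not os.path.exists(weights_path):
        # Weights file has not been written yet.
        return False
    model.load_weights(weights_path)
    return True


class DummyModel:
    """Hypothetical stand-in that records which file was loaded."""

    def __init__(self):
        self.loaded_from = None

    def load_weights(self, path):
        self.loaded_from = path


model = DummyModel()
first_run = maybe_load_weights(model, fn1=None)
print(first_run, model.loaded_from)
```

With a real model, the same function would be called with the notebook's `FN1` value before training starts.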

UB1010 commented 8 years ago

Thank you for your response. I want to run the code on Chinese text. Can you give me some guidance? To get a new word embedding for Chinese, must I use GloVe, or can I use another model? Is there anything else I should watch out for?