MarekKowalski / DeepAlignmentNetwork

A deep neural network for face alignment
MIT License

how to train the next stages? #23

Open jiangtaoo2333 opened 6 years ago

jiangtaoo2333 commented 6 years ago

I applied this method to my own dataset and it works well. Thanks for your work and code. However, I have only trained one stage and I would like to train more stages. Could you give me some help? My approach is: first, training = FaceAlignmentTraining(1, [0]); then training = FaceAlignmentTraining(2, [1]) and training.loadNetwork('../my.npz').

It shows an error that the parameters don't match.

MarekKowalski commented 6 years ago

If all it says is: "Loading warning: different network shape, trying to do something useful" then everything is fine. If that's not the message you are getting then please paste the error you are getting.

Marek
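To illustrate what a shape-tolerant load might do when the message "different network shape, trying to do something useful" appears, here is a hedged sketch (not the repo's actual `loadNetwork` implementation; the parameter names `s1_W`/`s2_W` are made up): parameters whose names and shapes match the checkpoint are copied, while the newly added stage keeps its fresh initialization.

```python
import numpy as np

def load_matching_params(model_params, saved_params):
    """Copy saved parameters into model_params where name and shape match.

    Illustrative stand-in for a shape-tolerant network load: parameters of a
    newly added stage are absent from the checkpoint and keep their fresh
    initialization.
    """
    loaded, skipped = [], []
    for name, value in saved_params.items():
        if name in model_params and model_params[name].shape == value.shape:
            model_params[name] = value.copy()
            loaded.append(name)
        else:
            skipped.append(name)
    return loaded, skipped

# Stage-2 model has extra parameters; the stage-1 checkpoint covers only stage 1.
model = {"s1_W": np.zeros((2, 2)), "s2_W": np.ones((2, 2))}
ckpt = {"s1_W": np.full((2, 2), 5.0)}
loaded, skipped = load_matching_params(model, ckpt)
```

Under this sketch, stage 1 is restored from the checkpoint and stage 2 is left as initialized, which is exactly the useful thing to do when growing the network by one stage.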

stone226 commented 6 years ago

Thanks for your work! I am not familiar with Theano. I want to know: when I train the second stage and training.loadNetwork initializes it from the first model, will the parameters of the first model continue to learn?

MarekKowalski commented 6 years ago

Hi,

Whether the parameters of the first stage continue to learn is specified by the stagesToTrain parameter of the FaceAlignmentTraining class. If you want both the first and the second stages to learn, you have to specify stagesToTrain=[0, 1].

Thanks,

Marek
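The effect of stagesToTrain can be illustrated with a toy sketch (not the repo's code; the class and gradients here are stand-ins): only the stages listed receive gradient updates, while the rest stay frozen.

```python
import numpy as np

class ToyStagedModel:
    """Toy stand-in for a multi-stage network: each stage is one weight matrix.

    Only the stages listed in stages_to_train receive gradient updates;
    the others stay frozen, mirroring the stagesToTrain idea.
    """
    def __init__(self, n_stages=2, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = [rng.normal(size=(dim, dim)) for _ in range(n_stages)]

    def train_step(self, stages_to_train, lr=0.1):
        for i, w in enumerate(self.weights):
            if i in stages_to_train:
                grad = np.ones_like(w)  # placeholder gradient for illustration
                self.weights[i] = w - lr * grad

model = ToyStagedModel()
before = [w.copy() for w in model.weights]
model.train_step(stages_to_train=[1])  # like stagesToTrain=[1]
```

After the step, stage 0 is unchanged and only stage 1 has moved, which is the behavior stagesToTrain=[1] selects; passing [0, 1] would update both.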

zhaobr340104 commented 5 years ago

@MarekKowalski Hi, if I want to train a two-stage model, should I use FaceAlignmentTraining(2, [0]) and then (2, [1])? Is that correct?

MarekKowalski commented 5 years ago

Hi,

You should use FaceAlignmentTraining(1, [0]) for the first stage and FaceAlignmentTraining(2, [1]) for the second stage.

Thanks,

Marek

zhaobr340104 commented 5 years ago

@MarekKowalski Thanks for your answer! I thought the message "...trying to do something useful" was a warning of a real problem, so I stopped the script quickly. Also, is there a difference between (2, [0]) and (1, [0])? I trained a model using (2, [0]) and (2, [1]), and the error is a little higher than the one in your paper.

MarekKowalski commented 5 years ago

With (2, [0]) the loss is calculated on the second stage while only the first-stage weights are updated. That does not seem like a good idea, since the second stage has random weights, which probably interfere with learning. With (1, [0]) the loss is calculated on the first stage and the first-stage weights are updated, which makes more sense.

The above is probably one of the reasons why you don't get a low error. Another reason might be that I used early stopping for the first stage; unfortunately, I am not sure how many iterations I ran before stopping.
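Since early stopping comes up here, a generic sketch of validation-based early stopping may help (this is not the repo's code; `train_step` and `val_error` are hypothetical callbacks): training stops once the validation error has not improved for a fixed number of iterations.

```python
def train_with_early_stopping(train_step, val_error, max_iters=1000, patience=10):
    """Run train_step each iteration; stop when val_error has not
    improved for `patience` consecutive iterations."""
    best_err = float("inf")
    best_iter = 0
    for it in range(max_iters):
        train_step(it)
        err = val_error(it)
        if err < best_err:
            best_err, best_iter = err, it
        elif it - best_iter >= patience:
            break  # no improvement for `patience` iterations
    return best_iter, best_err
```

The iteration count at which to stop then falls out of the validation curve rather than being fixed in advance, which matches why the exact number of first-stage iterations is hard to reproduce.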