Cherry2410 closed this issue 6 years ago
Hello. It should not turn into NaN if you are running the hyperparameters assigned in the training file. Can you tell us more about what the MSE value is? Did you create the training data with the provided script?
A small thing to consider is using Arch4 from the architectures we have designed. That is the architecture we used for our released files.
@A2Zadeh, here are some training results:
Landmark 8
Train on 1800225 samples, validate on 200070 samples
Epoch 1/100
Can you give me some suggestions? thx.
From the results, it's overfitting. The AFW, Helen, iBUG, LFPW, and 300W datasets are used.
Thanks @LingQiu. Is this on architecture 4?
@A2Zadeh, we tried both the arch4 model and model_half. Perhaps every dataset has its own characteristics.
@LingQiu if you are training on our data you may get NaN values, but you will recover from them if you continue the training. We basically optimize MSE and use corr as a monitoring measure rather than optimizing it directly. Are you able to continue your training and see if corr recovers?
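For anyone wondering why corr can become NaN while MSE stays finite, here is a minimal sketch. It assumes corr is a Pearson correlation between the target and predicted response maps; the thread does not spell out its exact definition. If the network outputs a constant map, the prediction has zero variance and the correlation becomes 0/0. The function names and the `eps` guard are illustrative and are not taken from the training script.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error; finite for any finite inputs."""
    return float(np.mean((y_true - y_pred) ** 2))

def pearson_corr(y_true, y_pred):
    """Plain Pearson correlation; NaN when either input is constant."""
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    denom = np.sqrt((yt ** 2).sum() * (yp ** 2).sum())
    # Silence the 0/0 warning so the NaN is returned quietly.
    with np.errstate(invalid="ignore", divide="ignore"):
        return float((yt * yp).sum() / denom)

def safe_corr(y_true, y_pred, eps=1e-8):
    """Hypothetical guarded variant: reports 0 instead of NaN for constant input."""
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    denom = np.sqrt((yt ** 2).sum() * (yp ** 2).sum())
    return float((yt * yp).sum() / (denom + eps))

# Toy target map and a collapsed (all-zero) prediction.
target = np.array([0.0, 0.1, 0.9, 0.2, 0.0])
collapsed = np.zeros_like(target)

print("MSE:", mse(target, collapsed))                 # finite
print("corr:", pearson_corr(target, collapsed))       # nan
print("guarded corr:", safe_corr(target, collapsed))  # 0.0
```

Under this assumption, a NaN corr means the predictions have collapsed to a constant for that batch or epoch. The MSE loss itself is still well defined, which would be consistent with corr recovering once the outputs start to vary again.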
We have continued training, and the MSE value has improved. Although corr does recover from NaN, its value is very small. Is this normal?
@LingQiu depending on the landmark number, yes quite possible. Anything higher than 0 is good for some very hard landmarks such as markers around the face. They are hard to detect and disambiguate.
OK, thanks, @A2Zadeh.
Hello @LingQiu @A2Zadeh. While training CEN, I also got NaN for the corr value. My dataset is different from the authors' (all near-infrared images). My training parameters are num_epochs 100 (corr is generally NaN after about epoch 20) and minibatch_size 512, with the arch4 architecture. How can I adjust the parameters? Can you give any suggestions? When I increased num_epochs to 200, the corr value recovered at around epoch 120 and then kept increasing until the end.
Hi. When training CEN, the corr value turns into NaN. The Helen and LFPW datasets are used. I do not understand this behavior. Can you give some advice?