reshow / PRNet-PyTorch

The training and evaluation code for PRNet ("Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network")

for a good result #5

Open mjanddy opened 4 years ago

mjanddy commented 4 years ago

Hi, would you mind telling me how your model was trained? I couldn't reproduce your model's results using the code.

reshow commented 4 years ago

If you run the code directly and correctly, the result will be slightly worse than mine (2D landmark NME is about 3.30±0.03), since the number of parameters is smaller than in PRN's paper. To achieve good performance, I employ a number of data augmentation methods that differ from PRN's, such as random erasing, Gaussian blur, etc. These methods are somewhat arbitrary, so I removed them from my code. Another way is to increase the number of parameters in the network. Here I use exactly the same network structure as PRN's given model; the model size is 52MB, while the model size in their paper is more than 150MB. I'm not sure about this part.
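For reference, here is a minimal sketch of the kind of augmentation pipeline described above (random erasing, Gaussian blur) using torchvision transforms. The specific transforms and parameter values are placeholders, not the settings actually used for training:

import torchvision.transforms as T

# Illustrative only: these parameters are placeholders, not the training settings.
augment = T.Compose([
    T.ToTensor(),                                     # HWC uint8 image -> CHW float tensor
    T.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),  # random Gaussian blur
    T.RandomErasing(p=0.5, scale=(0.02, 0.2)),        # random erasing on the tensor
])

Note that these are photometric augmentations applied to the input image only; any geometric augmentation would also have to be applied consistently to the position-map label.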

mjanddy commented 4 years ago

I trained with your code for 30 epochs, but only got a 2D NME of 3.8.

reshow commented 4 years ago

How about the NME on the training data?

mjanddy commented 4 years ago

I didn't test on the training data.

reshow commented 4 years ago

Sorry, I mean the printed 'metrics0' of the training dataset and the evaluation dataset.

mjanddy commented 4 years ago

I'm sorry, I didn't record it.

mjanddy commented 4 years ago

I used the dataset produced by the official generation method, not yours. Does this affect the results?

mjanddy commented 4 years ago

I reloaded the model and got this result:

[epoch:0, iter:111/7653, time:51] Loss: 0.1049 Metrics0: 0.0379

reshow commented 4 years ago

I didn't try it. There are some differences between our generation code, but I don't think they will affect the performance.

The metrics0 should reach 0.03 in less than 10 epochs.

Try to use my generation code.

And try changing line 96 in torchmodel.py as below, and remember to record metrics0:

scheduler_exp = optim.lr_scheduler.ExponentialLR(self.optimizer, 0.9)
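For context, here is a rough sketch of how such an exponential decay scheduler is typically stepped once per epoch in a PyTorch training loop; model, loader, criterion, and num_epochs are placeholders, not the actual code in torchmodel.py:

import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-4)           # placeholder optimizer and lr
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, 0.9)  # lr is multiplied by 0.9 per step()

for epoch in range(num_epochs):
    for images, pos_maps in loader:                           # placeholder data loader
        optimizer.zero_grad()
        loss = criterion(model(images), pos_maps)             # placeholder loss
        loss.backward()
        optimizer.step()
    scheduler.step()                                          # decay once per epoch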

mjanddy commented 4 years ago

OK, I will try it. Thanks a lot.

mjanddy commented 4 years ago

I followed all of your code, but the result is still not good.

This is the result:

[epoch:29, iter:7654/7653, time:1802] Loss: 0.0329 Metrics0: 0.0130

nme2d 0.04015569452557179 nme3d 0.054406630244023056 landmark2d 0.043106316771823916 landmark3d 0.05833802395872772

Looking forward to your reply.

reshow commented 4 years ago

The result on the training set is good, even better than mine, but the evaluation result is bad. I guess this is because I removed some augmentation code. Please give me an email address and I'll send it to you. I'll update it right now.

reshow commented 4 years ago

> It is my email: mjanddyy@gmail.com

I've updated it. Sorry for the trouble.

mjanddy commented 4 years ago

thanks

mjanddy commented 4 years ago

I'm sorry to bother you again. I used your augmentation code and trained for about 45 epochs, but only got nme2d 0.03363224604973234 and nme3d 0.04689772832815957, and the loss no longer decreases. Is this normal?

reshow commented 4 years ago

I trained it myself again and got nme3d 0.0445 in 30 epochs. I don't know what causes this difference. You can try another learning rate scheduler in the code:

self.scheduler = optim.lr_scheduler.StepLR(self.optimizer, step_size=5, gamma=0.5)

and set the learning rate to 2.5e-5.

I used this scheduler a long time ago; it takes more epochs.
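As a rough reference for what that setting produces, here is a small sketch of the resulting schedule, assuming the suggested starting rate of 2.5e-5 (the parameter and optimizer are dummies):

import torch
import torch.optim as optim

params = [torch.zeros(1, requires_grad=True)]   # dummy parameter
optimizer = optim.Adam(params, lr=2.5e-5)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

for epoch in range(20):
    # ... one epoch of training would go here ...
    scheduler.step()
    # lr is halved every 5 epochs: 2.5e-5 -> 1.25e-5 -> 6.25e-6 -> ...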

mjanddy commented 4 years ago

To get nme2d=0.031, how many epochs did you train?

reshow commented 4 years ago

I suggest you adjust the learning rate by increasing or decreasing it tenfold before changing the scheduler, to see if the result improves.

reshow commented 4 years ago

> To get nme2d=0.031, how many epochs did you train?

I don't remember, but 45 epochs is enough.

mjanddy commented 4 years ago

I first trained 30 epochs using lr=2e-4 and got nme2d 0.345, then decreased the lr to 2e-5, retrained for 45 epochs, and got nme2d 0.336.

reshow commented 4 years ago

It's strange... Could you use an even smaller learning rate (lr=8e-6) to train it from the beginning? I intuitively think it will help.

mjanddy commented 4 years ago

OK, I will try it.

mjanddy commented 4 years ago

Excuse me again: if I use RandomColor from your augmentation code, the NME stays around 0.04 and can't drop to 0.03. Is this normal?

mjanddy commented 4 years ago

And if I use a smaller learning rate (lr=8e-6) to train from the beginning, the NME drops more slowly than before (lr=1e-4).

reshow commented 4 years ago

I don't use the RandomColor function in practice, so forget it. If you use a smaller learning rate, does it finally reach a good result? And if the speed is unbearable, you may try strategies such as warm-up (I don't really use that).
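In case it helps, here is a minimal sketch of one common warm-up strategy (linear warm-up via LambdaLR); this is a generic example, and model, the base lr, and warmup_epochs are placeholders rather than anything used in this repository:

import torch.optim as optim

warmup_epochs = 5                                    # placeholder value

def lr_lambda(epoch):
    # Ramp the learning rate up linearly over the first few epochs,
    # then keep the base rate afterwards.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return 1.0

optimizer = optim.Adam(model.parameters(), lr=2e-4)  # placeholder model and base lr
scheduler = optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
# call scheduler.step() once per epoch after training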

mjanddy commented 4 years ago

I didn't get a good result with a smaller learning rate or with optim.lr_scheduler.StepLR. The best result is nme2d 0.336.