baomingwang / MTCNN-Tensorflow

mtcnn-tf
MIT License
162 stars 96 forks

Why not make a sum loss of (cls, box, pts) ? #2

Open lzwhard opened 6 years ago

lzwhard commented 6 years ago

Hi wangbm,

Thank you for your good work, it helped me a lot! I have a question, as in the title: I noticed that you train the three losses in a random mode.

Thanks again.

baomingwang commented 6 years ago

Sorry for the late reply. The classification and regression data, namely the positive, negative, and part samples, are generated from WIDER FACE according to different IoU thresholds, and WIDER FACE has no landmark annotations. So feeding these samples into the network produces no landmark loss. Likewise, positive and negative samples are used for the classification loss, while positive and part samples are used for the bounding-box regression loss.

As a result, a negative sample contributes only to the classification loss, and a part sample only to the bounding-box regression loss. The one exception is a positive sample, which could contribute to both the classification and the regression loss. You can try using a single positive sample to generate both losses, but since all the training data are cropped randomly, it is preferable to use different sets of positive samples for classification and for regression. In the end, each training sample serves exactly one loss, which is why the training looks like it 'trains the three losses in a random mode'.
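To illustrate the scheme described above, here is a minimal NumPy sketch of how per-sample losses can be masked by sample type. The label codes (1 = positive, 0 = negative, -1 = part) and the dummy loss values are illustrative assumptions, not the repo's actual encoding:

```python
import numpy as np

# Hypothetical label codes: 1 = positive, 0 = negative, -1 = part sample.
labels = np.array([1, 0, -1, 0, 1])

# Classification loss is computed over positives and negatives only.
cls_mask = (labels == 1) | (labels == 0)
# Bounding-box regression loss is computed over positives and parts only.
box_mask = (labels == 1) | (labels == -1)

# Dummy per-sample loss values, just to show the masking.
cls_loss_all = np.array([0.2, 0.5, 0.9, 0.4, 0.1])
box_loss_all = np.array([0.3, 0.8, 0.6, 0.7, 0.2])

# Each sample only contributes to the loss its type qualifies for.
cls_loss = cls_loss_all[cls_mask].mean()  # averages over 4 samples
box_loss = box_loss_all[box_mask].mean()  # averages over 3 samples
```

The part sample (label -1) is excluded from `cls_loss`, and the negative samples (label 0) are excluded from `box_loss`, matching the per-type behavior described above.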

Zepyhrus commented 5 years ago

Hi there, I got the configuration of the random training process:

```python
train_Onet(training_data=training_data, base_lr=0.0001,
           loss_weight=[1.0, 0.5, 1.0], train_mode=2,
           num_epochs=[100, None, None], load_model=False,
           load_filename=load_filename, save_model=True,
           save_filename=save_filename, num_iter_to_save=50000,
           device=device, gpu_memory_fraction=0.6)
```

The `train_mode=2` seems to mean no training on the pts (landmark) loss. Is this by design, or did I miss the pts training part? Do I have to change `train_mode` during the R-Net and O-Net training process?

17759205390 commented 5 years ago

> Hi there, I got the configuration of random training process: train_Onet(training_data=training_data, base_lr=0.0001, loss_weight=[1.0, 0.5, 1.0], train_mode=2, num_epochs=[100, None, None], load_model=False, load_filename=load_filename, save_model=True, save_filename=save_filename, num_iter_to_save=50000, device=device, gpu_memory_fraction=0.6)
>
> The train_mode=2 refers no training over the pts, is this designed or I missed the pts training part? Do I have to modify the train_mode during the R and O net training process?

I think this version doesn't need landmark data, and neither does its training code. Also, I want to ask: what does `num_epochs=[100, None, None]` mean?

xqjiang423 commented 3 years ago

The paper also gates the loss terms based on the type of each sample; that is the beta indicator in its loss formula. So it is correct not to simply sum (cls, box, pts) for every sample.