the loss converge so slow

balancap / SSD-Tensorflow

Single Shot MultiBox Detector in TensorFlow


the loss converge so slow #127

Open seasonyang opened 7 years ago

seasonyang commented 7 years ago

I am training on my own data, starting from the pre-trained checkpoint vgg16.chkp. My dataset has 1500 pictures and two classes (0: background and 1: myObject), and my batch size is 50. Why does the loss converge so slowly?!

`INFO:tensorflow:Recording summary at step 266003.

INFO:tensorflow:global step 266010: loss = 16.8371 (1.140 sec/step)

INFO:tensorflow:global step 266020: loss = 16.5885 (1.081 sec/step)

INFO:tensorflow:global step 266030: loss = 18.6807 (1.163 sec/step)

INFO:tensorflow:global step 266040: loss = 18.6046 (1.085 sec/step)

INFO:tensorflow:global step 266050: loss = 18.5053 (1.079 sec/step)`
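The loss printed every 10 steps in a log like the one above jumps around from line to line, so it can be easier to judge convergence from a smoothed value. The following is only a minimal sketch: the helper names `parse_losses` and `moving_average` are hypothetical (they are not part of SSD-Tensorflow), and the window of 3 is an arbitrary assumption chosen to suit this short sample.

```python
import re

# Matches lines such as:
# INFO:tensorflow:global step 266010: loss = 16.8371 (1.140 sec/step)
LOSS_LINE = re.compile(r"global step (\d+): loss = ([0-9.]+)")


def parse_losses(log_text):
    """Return a list of (step, loss) pairs found in the log text.

    Hypothetical helper, not part of the repository.
    """
    return [(int(step), float(loss)) for step, loss in LOSS_LINE.findall(log_text)]


def moving_average(values, window):
    """Trailing moving average; early entries average over fewer values."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Sample taken from the log posted in this issue.
log_text = """
INFO:tensorflow:Recording summary at step 266003.
INFO:tensorflow:global step 266010: loss = 16.8371 (1.140 sec/step)
INFO:tensorflow:global step 266020: loss = 16.5885 (1.081 sec/step)
INFO:tensorflow:global step 266030: loss = 18.6807 (1.163 sec/step)
INFO:tensorflow:global step 266040: loss = 18.6046 (1.085 sec/step)
INFO:tensorflow:global step 266050: loss = 18.5053 (1.079 sec/step)
"""

pairs = parse_losses(log_text)
smoothed = moving_average([loss for _, loss in pairs], window=3)  # window is an assumption
for (step, loss), avg in zip(pairs, smoothed):
    print(f"step {step}: loss={loss:.4f} smoothed={avg:.4f}")
```

On a full training log a much larger window (hundreds of entries) would probably be more informative than the value used here.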

oowe commented 7 years ago

Does anybody know how to make it converge faster?

oowe commented 7 years ago

I would like to know too, please.

seasonyang commented 7 years ago

Could it be that the features in your own data are not distinctive? When I train on data whose features are distinctive, it converges quickly.

You can also change your batch size when training.

This may not be the correct answer, but it is effective.

congjianting commented 7 years ago

@seasonyang I face the same problem: the training loss is very large. Do you know why, and can you tell me how to fix it?

CODEJY commented 7 years ago

Have you solved the problem? Can you show me how to change this model to output just two classes?

RoseLii commented 7 years ago

I face the same problem.

INFO:tensorflow:global step 194010: loss = 3.4237 (0.767 sec/step)

INFO:tensorflow:global step 194020: loss = 2.6979 (0.746 sec/step)

INFO:tensorflow:global step 194030: loss = 2.4754 (0.759 sec/step)

INFO:tensorflow:global step 194040: loss = 5.0463 (0.757 sec/step)

INFO:tensorflow:Recording summary at step 194042.

INFO:tensorflow:global step 194050: loss = 5.1746 (0.729 sec/step)

INFO:tensorflow:global step 194060: loss = 1.6664 (0.775 sec/step)

INFO:tensorflow:global step 194070: loss = 3.5295 (0.766 sec/step)

INFO:tensorflow:global step 194080: loss = 2.6343 (0.764 sec/step)

INFO:tensorflow:global step 194090: loss = 3.4903 (0.766 sec/step)

How can I solve this problem?

Janezzliu commented 6 years ago

I think you guys are getting good results. Many other people's losses are around 40, yet they still reach a high mAP in evaluation. So I don't think a high loss matters as long as the evaluation step gives a good mAP.

davinca commented 6 years ago

@RoseLii My loss is down to around 25 after 100k training steps. Your loss is very low; did you fix the code? The matching strategy differs from the one in the original paper, but I do not know how to fix it.

qianweilzh commented 3 years ago

> I think you guys are getting good results. Many other people's losses are around 40, yet they still reach a high mAP in evaluation. So I don't think a high loss matters as long as the evaluation step gives a good mAP.

@Janezzliu That raises the question: how can we improve the mAP?