zylo117 / Yet-Another-EfficientDet-Pytorch

A PyTorch re-implementation of the official EfficientDet, with SOTA real-time performance and pretrained weights.
GNU Lesser General Public License v3.0

Question : how many epochs to get an ok result? #64

Open pfeatherstone opened 4 years ago

pfeatherstone commented 4 years ago

How many epochs are required to get an ok result? I'm finding it takes the age of the universe to train EfficientDet-D0.

pfeatherstone commented 4 years ago

Could we improve convergence using CIoU loss?

zylo117 commented 4 years ago

It does take ages, and there's no trick other than buying more GPUs.

pfeatherstone commented 4 years ago

I've tried CIoU loss and it does seem to help, but I need to tune the loss gains, i.e. loss = a * cls_loss + b * iou_loss, where a and b both need tuning.
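
A minimal sketch of that weighting, assuming `cls_loss` and `ciou_loss` are whatever per-batch classification and box-regression terms the training loop already computes (the function name and default gains here are just placeholders to tune):

```python
import torch

# Hedged sketch: cls_loss and ciou_loss stand in for the focal/CIoU terms
# produced by the training loop; a and b are the gains being tuned.
def combined_loss(cls_loss: torch.Tensor, ciou_loss: torch.Tensor,
                  a: float = 1.0, b: float = 1.0) -> torch.Tensor:
    # loss = a * cls_loss + b * iou_loss, as described above
    return a * cls_loss + b * ciou_loss
```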

zylo117 commented 4 years ago

Please pull the latest code; the previous loss function was bugged.

pfeatherstone commented 4 years ago

Just noticed. Thanks

DecentMakeover commented 4 years ago

@pfeatherstone my regression loss falls far more quickly than my classification loss. How could I tune the CIoU loss, loss = a * cls_loss + b * iou_loss, to balance that?

Thanks

DecentMakeover commented 4 years ago

'mean' to 'sum', did it help?

pfeatherstone commented 4 years ago

For starters, I switched all the loss reductions from 'mean' to 'sum'. From memory, that will scale your cls loss a lot more.
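
As a toy illustration (not the repo's actual focal loss), the reason 'sum' tilts the balance is that the classification term is summed over every anchor × class entry, while the box term only covers the positive anchors' four coordinates; the shapes below are made-up numbers just to show the effect:

```python
import torch
import torch.nn.functional as F

# With reduction='mean' the two terms sit on a similar scale; with
# reduction='sum' the classification term grows with num_anchors * num_classes
# while the box term only grows with num_positive_anchors * 4.
num_anchors, num_classes, num_pos = 10000, 90, 50

cls_logits = torch.randn(num_anchors, num_classes)
cls_targets = torch.zeros(num_anchors, num_classes)
box_preds = torch.randn(num_pos, 4)
box_targets = torch.randn(num_pos, 4)

for reduction in ("mean", "sum"):
    cls_loss = F.binary_cross_entropy_with_logits(cls_logits, cls_targets, reduction=reduction)
    box_loss = F.smooth_l1_loss(box_preds, box_targets, reduction=reduction)
    print(f"{reduction}: cls={cls_loss.item():.1f}, box={box_loss.item():.1f}")
```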

pfeatherstone commented 4 years ago

'mean' to 'sum', did it help? Well, it will scale your cls loss a lot more.

DecentMakeover commented 4 years ago

Alright, thanks, I'll try that.

For now I'll skip the CIoU loss, as I'm not able to understand it very well. Thanks for the inputs.

pfeatherstone commented 4 years ago

To be honest, I got bored of messing around with it. It works, and it takes at least 100 epochs to converge. You can also mess around with optimisers and schedulers. There are hundreds of ways you could spend time fine-tuning stuff.

pfeatherstone commented 4 years ago

I pinched the CIoU loss code from the ultralytics yolov3 repo. It's very good.
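
For reference, a minimal CIoU-loss sketch in the spirit of that implementation (not copied from it, and not necessarily what the repo ships); boxes here are assumed to be (x1, y1, x2, y2) tensors of shape (N, 4):

```python
import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # Intersection area
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    # Union area and plain IoU
    w1, h1 = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w2, h2 = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # Squared centre distance over the squared diagonal of the smallest
    # enclosing box (the DIoU penalty term).
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) ** 2 +
            (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) ** 2) / 4

    # Aspect-ratio consistency term from the CIoU paper.
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    ciou = iou - rho2 / c2 - alpha * v
    return (1.0 - ciou).sum()  # 'sum' reduction, per the discussion above
```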

DecentMakeover commented 4 years ago

Thanks, I'll check. I wanted to know your thoughts on the number of epochs to train; is 500 epochs overkill for a dataset of about 1200 images?

pfeatherstone commented 4 years ago

The architecture of EfficientDet is very deep, with a lot of connections, which could be why it takes longer to train. The model may be small, but training-wise there are a lot of activations held in memory. I can only use a batch size of 8 on my GPU, which doesn't help. I don't have a fancy rig.

pfeatherstone commented 4 years ago

In my opinion if you need 500 epochs to converge then it’s no good

pfeatherstone commented 4 years ago

It’s likely going to be overfitted

DecentMakeover commented 4 years ago

oh

DecentMakeover commented 4 years ago

Yeah, even I thought it was overkill; 100-150 seems more reasonable.

pfeatherstone commented 4 years ago

Hence why I was messing around with other losses to help with convergence. The fewer epochs required, the better.

pfeatherstone commented 4 years ago

Yep, I agree. I never go past 100 epochs.

DecentMakeover commented 4 years ago

Hmmm, alright let me see

pfeatherstone commented 4 years ago

You could mess around with higher learning rates using SGD. I would recommend reading the papers on super-convergence using the OneCycle policy.
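
For example, a hedged sketch of SGD plus PyTorch's built-in OneCycleLR scheduler; the model, dataloader, and hyperparameters below are stand-ins for whatever the training script actually builds:

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the real detector
train_loader = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(100)]

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1, epochs=50, steps_per_epoch=len(train_loader))

for epoch in range(50):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(images), targets)
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycleLR is stepped per batch, not per epoch
```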

DecentMakeover commented 4 years ago

Sure, I'll check those out.

pfeatherstone commented 4 years ago

I'm waiting for the day some academic spews out an optimal black box for training nets.

DecentMakeover commented 4 years ago

I'll join that club.