pfeatherstone opened 4 years ago
Could we improve convergence using CIoU loss?
It does take ages, and there's no trick but to buy more GPUs.
I've tried CIoU loss. It does seem to help. But I need to tune some loss gains, i.e.
loss = a * cls_loss + b * iou_loss
I need to tune a and b
Please pull the latest code; the previous loss function was bugged.
Just noticed. Thanks
@pfeatherstone my regression loss decreases far more quickly than my classification loss; how could I tune the CIoU loss to balance that? loss = a * cls_loss + b * iou_loss
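A minimal sketch of the weighted combination being discussed. The gain values used below are illustrative starting points, not values from this thread; in practice you tune `a` and `b` until both terms contribute at a similar scale.

```python
# Sketch of combining classification and IoU losses with scalar gains.
# The default gains here are hypothetical; tune them for your dataset.

def combined_loss(cls_loss: float, iou_loss: float,
                  a: float = 1.0, b: float = 1.0) -> float:
    """loss = a * cls_loss + b * iou_loss, as in the thread above."""
    return a * cls_loss + b * iou_loss

# If the regression (IoU) loss shrinks much faster than the classification
# loss, increase b (or decrease a) so the optimiser keeps pressure on the
# box regression term.
print(combined_loss(2.0, 0.1, a=1.0, b=5.0))  # 2.0 + 0.5 = 2.5
```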
Thanks
For starters, I switched all the loss reductions from 'mean' to 'sum'. From memory, that will scale your cls loss a lot more.
'mean' to 'sum', did it help? Well, it will scale your cls loss a lot more.
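A small sketch of why the reduction choice re-weights the classification term: the cls loss is computed over roughly batch × num_classes elements while the box loss covers batch × 4, so 'sum' multiplies each term by its own element count instead of averaging it away. The batch size and class count below are illustrative.

```python
# 'mean' vs 'sum' reduction over per-element losses (illustrative sizes).

def reduce_loss(values, reduction="mean"):
    total = sum(values)
    return total / len(values) if reduction == "mean" else total

per_element_cls = [0.5] * (8 * 80)  # batch of 8, 80 classes
per_element_box = [0.5] * (8 * 4)   # 4 box coordinates per sample

# With 'mean', both terms look the same size regardless of element count:
print(reduce_loss(per_element_cls, "mean"), reduce_loss(per_element_box, "mean"))  # 0.5 0.5
# With 'sum', the cls term is scaled by its much larger element count:
print(reduce_loss(per_element_cls, "sum"), reduce_loss(per_element_box, "sum"))    # 320.0 16.0
```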
Alright, thanks, I'll try that.
For now I'll skip the CIoU loss; I'm not able to understand it very well. Thanks for the input.
To be honest, I got bored of messing around with it. It works, but it takes at least 100 epochs to converge. You can also mess around with optimisers and schedulers. There are hundreds of ways you can spend time fine-tuning stuff.
I pinched the CIoU loss code from the ultralytics yolov3 repo. It's very good.
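For reference, a minimal single-pair sketch of the CIoU formulation (IoU minus a centre-distance penalty minus an aspect-ratio consistency term); the ultralytics repo implements a batched tensor version of the same idea. Boxes here are assumed to be in (x1, y1, x2, y2) format.

```python
import math

def ciou_loss(box1, box2, eps=1e-9):
    """CIoU loss = 1 - (IoU - rho^2/c^2 - alpha * v) for two boxes."""
    x1, y1, x2, y2 = box1
    X1, Y1, X2, Y2 = box2

    # Intersection over union
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (X2 - X1) * (Y2 - Y1) - inter + eps
    iou = inter / union

    # Squared centre distance over squared diagonal of the enclosing box
    cw = max(x2, X2) - min(x1, X1)
    ch = max(y2, Y2) - min(y1, Y1)
    c2 = cw * cw + ch * ch + eps
    rho2 = ((x1 + x2 - X1 - X2) ** 2 + (y1 + y2 - Y1 - Y2) ** 2) / 4.0

    # Aspect-ratio consistency term
    v = (4.0 / math.pi ** 2) * (
        math.atan((X2 - X1) / (Y2 - Y1 + eps))
        - math.atan((x2 - x1) / (y2 - y1 + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)

print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # ~0.0 for identical boxes
```

Unlike plain IoU loss, the distance term still gives a gradient when the boxes don't overlap at all, which is part of why it can help convergence.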
Thanks, I'll check. I wanted your thoughts on the number of epochs to train: is 500 epochs overkill for a dataset of about 1200 images?
The architecture of EfficientDet is very deep and has so many connections, which could be why it takes longer to train. The model may be small, but training-wise there are a lot of activations held in memory. I can only use a batch size of 8 on my GPU, which also doesn't help. I don't have a fancy rig.
In my opinion, if you need 500 epochs to converge then it's no good.
It’s likely going to be overfitted
oh
Yeah, even I thought it was overkill; 100-150 seems more reasonable.
That's why I was messing around with other losses to help with convergence. The fewer epochs required, the better.
Yep I agree. I never go past 100 epochs
Hmmm, alright let me see
You could mess around with higher learning rates using SGD. I would recommend reading the papers on super-convergence using the OneCycle policy.
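A minimal sketch of the OneCycle learning-rate shape (ramp up to a peak, then anneal well below the starting rate), to show what the policy does. PyTorch ships a full implementation as `torch.optim.lr_scheduler.OneCycleLR`; the constants below (max_lr, div factors, step counts) are illustrative, not recommendations.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=0.1, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Cosine-shaped one-cycle schedule: warm up to max_lr, anneal to ~0."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        # Cosine ramp from initial_lr up to max_lr
        t = step / max(1, warmup_steps)
        return max_lr + (initial_lr - max_lr) * (1 + math.cos(math.pi * t)) / 2
    # Cosine anneal from max_lr down to min_lr
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * (1 + math.cos(math.pi * t)) / 2

lrs = [one_cycle_lr(s, 1000) for s in range(1000)]
print(lrs[0], max(lrs), lrs[-1])  # starts low, peaks at max_lr, ends near zero
```

The super-convergence idea is that the brief excursion to a high learning rate acts as a regulariser and lets you cut the total epoch count substantially.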
Sure, I'll check those out.
I'm waiting for the day some academic spews out an optimal black box for training nets.
I'll join that club.
How many epochs are required to get an OK result? I'm finding it takes the age of the universe to train EfficientDet-D0.