koyeongmin / PINet_new

MIT License
167 stars 44 forks

distributed training? #16

Closed Jiayi719 closed 3 years ago

Jiayi719 commented 3 years ago

It seems that your code doesn't support distributed training. I'm trying to port your code to mmdetection, and I have some questions:

  1. How much does the hard sampling improve the final performance?
  2. The scheduler in your code is quite strange. Can I use another one such as CosineAnnealing or MultiStep?

Looking forward to your reply. Thanks!

koyeongmin commented 3 years ago
  1. The hard sampling module is not a major factor, but it brings a larger improvement on the CULane dataset than on TuSimple.
  2. Yes, my scheduler is not great; you can use another one! Thank you!
Jiayi719 commented 3 years ago

It seems that the learning rate never changes during training in your code. I tried CosineAnnealingLR, but the performance degrades a lot.

[Screenshot: training loss curves, 2020-10-30]
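For reference, hooking a standard PyTorch scheduler into a training loop looks roughly like this; the model, optimizer, and hyperparameters here are placeholders, not PINet's actual settings:

```python
import torch

# Placeholder model and optimizer; PINet's real network and settings differ.
model = torch.nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=100, eta_min=1e-5)

for epoch in range(100):
    # ... forward pass and loss.backward() would go here ...
    opt.step()
    sched.step()  # anneal the learning rate once per epoch
```

After `T_max` epochs the learning rate has decayed from `1e-3` down to `eta_min`; a constant learning rate simply corresponds to never calling `sched.step()`.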

The confidence loss and the attention loss look strange. Do you have any idea why? Thank you!

Jiayi719 commented 3 years ago

Can you show me your result on the TuSimple dataset without hard sampling? I got the following result with mmdetection:

Accuracy: 0.9388447417068856, FP: 0.09802043750641848, FN: 0.062065660196501254

koyeongmin commented 3 years ago

https://github.com/koyeongmin/PINet This is the previous version of PINet, and that model reached 96.64% accuracy without hard sampling. Unfortunately, I no longer have the result of the current version without hard sampling, but I remember the performance being similar to the previous version. Thank you!

Jiayi719 commented 3 years ago

Thank you so much. I will try to tune some other parameters. By the way, do you remember the result on the TuSimple dataset without data augmentation?

koyeongmin commented 3 years ago

Sorry, I don't remember, because I have always used data augmentation after the first few trials.

Jiayi719 commented 3 years ago

OK. Do you have any idea about the existence (confidence) loss? I see you comment out `confidences = torch.sigmoid(confidences)`. I tried applying sigmoid to the confidence; the loss curve looks perfect, but the accuracy drops a bit.
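For context, the two variants being compared can be sketched like this; the tensor shapes and names are illustrative, not the actual ones from PINet:

```python
import torch
import torch.nn.functional as F

# Hypothetical raw confidence logits and binary existence ground truth.
confidences = torch.randn(2, 1, 32, 64)               # raw network output
exist_gt = (torch.rand(2, 1, 32, 64) > 0.9).float()   # sparse positive cells

# Variant resembling the released code (sigmoid line commented out):
# squared error directly on the raw scores.
mse_loss = ((confidences - exist_gt) ** 2).mean()

# Variant discussed here: sigmoid + binary cross-entropy, computed in a
# numerically stable way via the logits form.
bce_loss = F.binary_cross_entropy_with_logits(confidences, exist_gt)
```

Whether the extra sigmoid helps depends on how the rest of the pipeline thresholds the confidence map, which may explain the mismatch between a "perfect" loss curve and a small accuracy drop.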

koyeongmin commented 3 years ago

I also tried other losses, such as sigmoid-based and cross-entropy losses, but the performance was not good...

Jiayi719 commented 3 years ago

Thank you so much for your patience and your nice work.

Jiayi719 commented 3 years ago

@koyeongmin Have you ever tried dice loss and Lovász hinge loss? I tried these losses on the confidence branch with `thresh_point=0.5`. The performance improved a little. I think dice loss suits the TuSimple evaluation and Lovász hinge loss suits the CULane evaluation (mIoU metric).

koyeongmin commented 3 years ago

Thank you! I have not tried these losses. I will try them in future work. Thank you!

Jiayi719 commented 3 years ago

Hello, have you ever studied self-attention distillation (https://github.com/cardwing/Codes-for-Lane-Detection)? I think your attention loss has the same intuition. But how does it work when there is only one hourglass?

mvish7 commented 3 years ago

@Jiayi719 Did you succeed in distributed training of PINet? And did you use PyTorch DDP for it?

Jiayi719 commented 3 years ago

@mvish7 Yes, I used DistributedDataParallel to implement it.
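A minimal single-machine DDP sketch, for anyone following along; the model and loss here are stand-ins, and the real PINet loop, dataset, and `DistributedSampler` are omitted:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def ddp_train_step(rank, world_size):
    """One DDP training step; a minimal sketch, not the actual PINet loop."""
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(8, 1)  # stand-in for the PINet network
    # find_unused_parameters=True avoids DDP errors when some model
    # outputs (e.g. auxiliary heads) do not contribute to the loss.
    ddp_model = DDP(model, find_unused_parameters=True)
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    x = torch.randn(4, 8)
    loss = ddp_model(x).pow(2).mean()  # dummy loss
    opt.zero_grad()
    loss.backward()  # gradients are all-reduced across ranks here
    opt.step()

    dist.destroy_process_group()
    return loss.item()
```

In practice this runs one process per GPU via `torchrun` or `torch.multiprocessing.spawn`, with `backend="nccl"` on GPUs and a `DistributedSampler` on the dataloader so each rank sees a disjoint shard of the dataset.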

Wolfwjs commented 2 years ago

Hello, could you please share your code, or at least the core code, of PINet based on mmdetection? Thanks!

mvish7 commented 2 years ago

Hi @Wolfwjs, unfortunately I can't share the code. I used torch DDP. The problem I had was related to not using all the outputs of the model, i.e. I used some `detach` calls in the wrong places.

Wolfwjs commented 2 years ago

Thank you for your reply! @mvish7

Wolfwjs commented 2 years ago

Hello, have you successfully run PINet based on mmdetection? @Jiayi719