Closed mengxia1994 closed 1 year ago
@mengxia1994 It seems the 0 loss does not appear right after validation? Does this happen every time?
Yes. But after the first validation, the aux0 loss is always 0.
That is interesting. Do you see similar behavior with other algorithms (other than LSTR)?
@mengxia1994 I think this could be a bug, because I override eval() at line 109 of the LSTR code. I don't have my laptop with me. Could you try adding a corresponding train() method that sets everything back to True?
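The suspected mechanism can be reproduced in miniature. This is only a sketch: `Module` below is a tiny stand-in that mimics how torch.nn.Module switches modes (eval() is routed through train(False)), and `OneWayModel` is a hypothetical model, not the actual LSTR code. It shows that a flag turned off in an overridden eval() stays off after calling train() again.

```python
class Module:
    """Minimal stand-in for the train/eval switch of torch.nn.Module (assumption)."""

    def __init__(self):
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        # Mirrors torch.nn.Module, where eval() is train(False)
        return self.train(False)


class OneWayModel(Module):
    """Hypothetical model whose eval() turns a flag off but never back on."""

    def __init__(self):
        super().__init__()
        self.aux_loss = True

    def eval(self):
        super().eval()
        self.aux_loss = False  # disabled for inference
        return self


model = OneWayModel()
model.eval()   # validation between epochs
model.train()  # back to training: nothing re-enables aux_loss
print(model.training, model.aux_loss)  # prints: True False
```

If the auxiliary outputs are no longer produced, an auxiliary loss logged as 0 would be consistent with this state.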
Not yet. After a quick trial of all the algorithms, I chose LSTR for more digging and fine-tuning. I added the code above because the model is overfitting and I'm not used to TensorBoard, nor is it convenient for me. Are you suggesting that, at first glance, this is more likely caused by LSTR than by some torch setting?
In which file?
Found it, I will give it a try.
@mengxia1994 I have verified this bug, it affects LSTR, BézierLaneNet & RepVGG, and this should fix it. Thanks a lot for pointing this out!
Thank you for the quick fix, I will add the new code~ Actually, I had already tried overriding train() as you suggested (lol). The training is still running, but the printed losses and the validation results suggest that it works. If you have time, please check whether my modification makes sense. I added a train function in lstr.py after eval:
def eval(self):
    super().eval()
    self.aux_loss = False
    self.transformer.decoder.return_intermediate = False

def train(self, mode=True):
    super().train(mode)
    self.aux_loss = True
    self.transformer.decoder.return_intermediate = True
Yes, this mod should bring the same correct behavior for LSTR.
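One caveat with an override that always sets the flags to True: a direct call to train(False) would also enable them. A possible variant ties the flags to `mode`, so that a single train() override covers both directions. This is a sketch under assumptions: `Module`, `Decoder` and `ToyLSTR` are hypothetical stand-ins that mimic the torch.nn.Module mode switch, and only the `aux_loss` and `return_intermediate` names come from this thread.

```python
class Module:
    """Minimal stand-in for the train/eval switch of torch.nn.Module (assumption)."""

    def __init__(self):
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        # Mirrors torch.nn.Module, where eval() is train(False)
        return self.train(False)


class Decoder(Module):
    """Hypothetical decoder holding the return_intermediate flag."""

    def __init__(self):
        super().__init__()
        self.return_intermediate = True


class ToyLSTR(Module):
    """Hypothetical model; the auxiliary flags simply follow the mode."""

    def __init__(self):
        super().__init__()
        self.aux_loss = True
        self.decoder = Decoder()

    def train(self, mode=True):
        super().train(mode)
        # Auxiliary outputs are wanted in training mode only
        self.aux_loss = mode
        self.decoder.return_intermediate = mode
        return self


toy = ToyLSTR()
toy.eval()   # goes through train(False), so the flags are switched off
flags_in_eval = (toy.aux_loss, toy.decoder.return_intermediate)
toy.train()  # flags are switched back on
flags_in_train = (toy.aux_loss, toy.decoder.return_intermediate)
```

Because eval() goes through train(False) in this stand-in, no separate eval() override is needed.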
This issue seems to be addressed, so I'll close it for now. Feel free to keep commenting to reopen it, or open a new one.
I made a small change to LaneDetTrainer and base.py: I commented out the fast_evaluate code ('Only segmentation based methods can be fast evaluated!') and call the LaneDetTester test_one_set function instead, as shown below. The goal is to run validation between epochs.
if self._cfg['validation']:
As you can see, I added self.model.train() after the test, like the fast_evaluate code does. However, the aux0 losses become 0 after the test part, as shown below:
[2, 980] training loss: 29.0062
[2, 980] loss label: 0.6740
[2, 980] loss curve: 2.1232
[2, 980] loss upper: 0.4012
[2, 980] loss lower: 0.6994
[2, 980] training loss aux0: 14.1670
[2, 980] loss label aux0: 0.7017
[2, 980] loss curve aux0: 2.0173
[2, 980] loss upper aux0: 0.2924
[2, 980] loss lower aux0: 0.6955
start validation on epoch: 1
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 227/227 [00:02<00:00, 96.23it/s]
[{"name":"Accuracy","value":0.016865079365079364,"order":"desc"},{"name":"FP","value":0.019089574155653453,"order":"asc"},{"name":"FN","value":0.993208516886931,"order":"asc"}]
Epoch time: 38.95s
[3, 80] training loss: 17.1586
[3, 80] loss label: 0.6709
[3, 80] loss curve: 2.0960
[3, 80] loss upper: 0.4002
[3, 80] loss lower: 0.7055
[3, 80] training loss aux0: 2.4547
[3, 80] loss label aux0: 0.1318
[3, 80] loss curve aux0: 0.3381
[3, 80] loss upper aux0: 0.0546
[3, 80] loss lower aux0: 0.1298
[3, 179] training loss: 13.8403
[3, 179] loss label: 0.6764
[3, 179] loss curve: 1.9323
[3, 179] loss upper: 0.3870
[3, 179] loss lower: 0.6878
[3, 179] training loss aux0: 0.0000
[3, 179] loss label aux0: 0.0000
[3, 179] loss curve aux0: 0.0000
[3, 179] loss upper aux0: 0.0000
[3, 179] loss lower aux0: 0.0000
[3, 278] training loss: 14.7566
[3, 278] loss label: 0.6881
[3, 278] loss curve: 2.1140
[3, 278] loss upper: 0.3753
[3, 278] loss lower: 0.6859
[3, 278] training loss aux0: 0.0000
[3, 278] loss label aux0: 0.0000
[3, 278] loss curve aux0: 0.0000
[3, 278] loss upper aux0: 0.0000
[3, 278] loss lower aux0: 0.0000
If I use LaneDetTester after the whole training has finished, it works fine. Need some advice~
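For running validation between epochs in general, one possible pattern is a small context manager that remembers the current mode, switches to eval, and restores the previous mode afterwards. This is a sketch, not code from the repository: `evaluating` and `ToyModel` are hypothetical names, and the pattern only helps if the model's train() really restores every flag that eval() changed.

```python
from contextlib import contextmanager


@contextmanager
def evaluating(model):
    """Switch to eval mode, then restore whatever mode was active before."""
    was_training = model.training
    model.eval()
    try:
        yield model
    finally:
        model.train(was_training)


class ToyModel:
    """Hypothetical model exposing the same training/train()/eval() interface."""

    def __init__(self):
        self.training = True
        self.aux_loss = True

    def train(self, mode=True):
        self.training = mode
        self.aux_loss = mode  # flags follow the mode in both directions
        return self

    def eval(self):
        return self.train(False)


model = ToyModel()
modes = []
for epoch in range(2):
    modes.append(model.training)      # training steps would run here
    with evaluating(model):
        modes.append(model.training)  # validation would run here
modes.append(model.training)          # mode after the last validation
```

The try/finally means the training mode is restored even if validation raises an exception.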