Possible reasons for loss function not going down

sarimmehdi commented 4 years ago

Hello. I am using your network for self-supervised representation learning on the KITTI dataset. Even after 22 epochs, the loss and top1 accuracy barely change. What could explain this?

TengdaHan commented 4 years ago

What downstream task are you trying to solve on KITTI?

sarimmehdi commented 4 years ago

My idea is to extract features from the first N frames, pass them through pooling and a fully connected layer, and concatenate the result with the corresponding encoded bounding-box features. (The bounding-box features come from a traditional RNN encoder, not from your network.) The concatenated features are then fed to a decoder at each time step to predict bounding-box coordinates.

sarimmehdi commented 4 years ago

I use batch size of 32.
The final feature size is 5 × 4 × 4 × 256. I am trying to capture the motion of vehicles from an egocentric perspective (KITTI data is recorded from the driver's seat), so vehicles can appear a bit small when they are reasonably far away, but they almost always come closer and are then usually big enough.
One more thing I noticed: right after training began, the loss dropped from 16 to 4.5 (within the first 5 iterations of the first epoch) and stayed there until the 22nd epoch.

sarimmehdi commented 4 years ago

I tried changing many hyperparameters, such as the learning rate and weight decay (I both increased and decreased them). My training set has 5400 images in total and the validation set has 2160. With a batch size of 16, that is not many image sequences to train on, especially with num_seq set to 8 and seq_len set to 5 (I also tried 3).
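To put rough numbers on "not many image sequences", here is a minimal sketch of the arithmetic. It rests on assumptions that are not stated in this thread: that all 5400 training images form one continuous clip, that each sample uses num_seq × seq_len consecutive frames with no frame downsampling, and that samples do not overlap unless a stride is given. The helper names `count_samples` and `iterations_per_epoch` are hypothetical and are not part of the DPC repository.

```python
def count_samples(num_frames, num_seq, seq_len, stride=None):
    """Count samples that fit in one continuous clip (hypothetical helper).

    Assumes each sample needs num_seq * seq_len consecutive frames and
    that consecutive samples start `stride` frames apart. With
    stride=None the samples do not overlap.
    """
    frames_per_sample = num_seq * seq_len
    if num_frames < frames_per_sample:
        return 0
    if stride is None:
        stride = frames_per_sample
    return (num_frames - frames_per_sample) // stride + 1


def iterations_per_epoch(num_samples, batch_size, drop_last=True):
    """Number of batches per epoch, optionally dropping the last partial batch."""
    if drop_last:
        return num_samples // batch_size
    return -(-num_samples // batch_size)  # ceiling division


# Numbers from the comment above: 5400 training images, batch size 16.
for seq_len in (5, 3):
    samples = count_samples(5400, num_seq=8, seq_len=seq_len)
    print(seq_len, samples, iterations_per_epoch(samples, batch_size=16))
```

Under these assumptions, 8 blocks of 5 frames leave only 135 non-overlapping samples, which is about 8 full batches of 16 per epoch. KITTI is split into separate drives, so the real count depends on how the custom dataset class cuts the sequences.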

No matter how I change the hyperparameters, the loss reaches 3.8 and the top1 accuracy reaches 0.220 within the first 10 epochs, and after that there is little to no change. I even waited for 300 epochs, but the loss and top1 accuracy stayed the same until the end.
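One way to read a plateau like this is to compare it with the chance level. The sketch below assumes the loss is a softmax cross-entropy over K candidates, in which case a model that predicts uniformly scores ln(K) in loss and about 1/K in top1 accuracy. The value of K used here is purely hypothetical; the real number of candidates depends on the batch size and the training configuration and is not given in this thread. The function names are also hypothetical.

```python
import math


def chance_level_loss(num_candidates):
    """Cross-entropy of a uniform prediction over num_candidates classes."""
    return math.log(num_candidates)


def chance_level_top1(num_candidates):
    """Expected top1 accuracy of a uniformly random guess."""
    return 1.0 / num_candidates


def equivalent_candidates(loss):
    """Number of equally likely candidates that would give this loss."""
    return math.exp(loss)


# Hypothetical number of candidates, for illustration only.
k = 768
print("chance loss:", chance_level_loss(k))
print("chance top1:", chance_level_top1(k))

# Plateau values reported in the comments above.
for observed in (4.5, 3.8):
    print(observed, "is like guessing among", equivalent_candidates(observed))
```

If the plateau sits well below ln(K) and the top1 accuracy sits well above 1/K, the model has learned something and then stalled, which is a different situation from a model that never left chance level.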

I haven't made any changes to your architecture (I just cloned your repository and ran the code; I only had to write a different CustomDataset class to load the KITTI images). Is it possible that your self-supervision doesn't work on datasets like KITTI? I am quite new to all this (I started with neural networks two months ago), so I can't work out the real reason. Maybe you have a better idea of what could be going wrong?

Thanks

TengdaHan commented 4 years ago

Hi. Several reasons.

The motion you want to encode (a change of a few pixels) may be too small for our DPC. DPC aims to learn high-level representations, at the level of action classes, and we deliberately avoid learning low-level features such as appearance and texture. Self-supervised tracking, however, probably relies more on these lower-level features.
For video object tracking, I recommend checking out CorrFlow and its paper, which could be more relevant to your task.

sarimmehdi commented 4 years ago

Thank you very much for your help. I will definitely take a look.

TengdaHan / DPC

Possible reasons for loss function not going down #7