Closed sarimmehdi closed 4 years ago
What downstream task are you trying to do on KITTI?
My idea is to extract features from the first N frames, pass them through pooling and a fully connected layer, and concatenate them with the corresponding encoded bounding box features. (The bounding box features are extracted with a traditional encoder RNN framework, not with your network.) The concatenated features are then fed to a decoder at each time step to predict bounding box coordinates.
I tried changing many hyperparameters, such as the learning rate and weight decay (I both increased and decreased each of them). My training set has 5400 images and my validation set has 2160. With a batch size of 16, that does not give many image sequences to train with, especially with num_seq set to 8 and seq_len set to 5 (I also tried seq_len set to 3).
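For a rough sense of scale, here is a back-of-the-envelope sketch of how many training samples these settings yield. It assumes that every sample consumes num_seq * seq_len consecutive frames, that windows do not overlap, and that all frames come from one continuous clip. These are assumptions; the actual dataset class may downsample or use overlapping windows. `count_samples` is a hypothetical helper, not part of the repository.

```python
def count_samples(num_frames, num_seq, seq_len, stride=None):
    """Count windows of num_seq * seq_len consecutive frames.

    Hypothetical helper: assumes one continuous clip and, by default,
    non-overlapping windows (stride equal to the window length).
    """
    frames_per_sample = num_seq * seq_len
    if stride is None:
        stride = frames_per_sample
    if num_frames < frames_per_sample:
        return 0
    return (num_frames - frames_per_sample) // stride + 1


batch_size = 16

# Training set: 5400 images, num_seq=8, seq_len=5
train_samples = count_samples(5400, num_seq=8, seq_len=5)
# Full batches per epoch if incomplete batches are dropped
train_batches = train_samples // batch_size

# Validation set: 2160 images, same settings
val_samples = count_samples(2160, num_seq=8, seq_len=5)

# Same training set with seq_len=3
train_samples_len3 = count_samples(5400, num_seq=8, seq_len=3)

print(train_samples, train_batches, val_samples, train_samples_len3)
```

Under these assumptions, each epoch contains only a handful of batches, so overlapping windows (a smaller `stride`) would be one way to get more samples out of the same frames.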
I noticed that no matter how much I change the hyperparameters, the loss reaches about 3.8 and the top1 accuracy reaches about 0.220 within the first 10 epochs, and after that there is little to no change. In fact, I let it run for 300 epochs, but the loss and top1 accuracy stayed the same right to the end.
I haven't made any changes to your architecture (I just cloned your repository and ran the code, although of course I had to write a different CustomDataset class to load the KITTI images). Is it possible that your self-supervision doesn't work on datasets like KITTI? I am honestly quite new to all this (I started with neural networks two months ago), so I can't work out the real reason. Maybe you have a better idea of what could be going wrong here?
Thanks
Hi. Several reasons.
Thank you very much for your help. I will definitely take a look at this.
Hello. I am using your network for self-supervised representation learning on the KITTI dataset. Even after 22 epochs, the loss and top1 accuracy barely change. What could explain this?