Closed saivineethkumar closed 4 years ago
If the folder structure is up to you, I'd suggest following the same one DHF1K uses. That way you can reuse the same data loader and everything will be much more straightforward. In summary: there should be two superfolders, "frames" and "maps". One holds a folder of frames for each video and the other holds the matching folders of saliency maps. Each video is a numbered folder (1-700), and each folder contains numbered frames (0001.png, 0002.png, etc.). So:

frames -> 1 -> 0001.png, 0002.png, etc.

maps -> 1 -> 0001.png, 0002.png, etc.

You might want to check DHF1K in case I'm missing something: https://github.com/wenguanwang/DHF1K
After that you just have to set dataset=DHF1K or dataset=other (I think the only difference between the two is that "other" does not require the folders to be numbered).
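Before training, it can help to sanity-check that a custom dataset matches the frames/maps layout described above. The following is a minimal sketch, not part of the repository: the function name `check_layout` is hypothetical, and it assumes every frame should have a saliency map with the same file name in the matching "maps" folder.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of problems found in a frames/maps dataset layout.

    Assumption: each video folder under "frames" has a folder of the same
    name under "maps", holding one .png map per .png frame.
    """
    root = Path(root)
    frames_root = root / "frames"
    maps_root = root / "maps"

    problems = []
    for folder in (frames_root, maps_root):
        if not folder.is_dir():
            problems.append(f"missing superfolder: {folder.name}")
    if problems:
        return problems

    for video_dir in sorted(p for p in frames_root.iterdir() if p.is_dir()):
        map_dir = maps_root / video_dir.name
        if not map_dir.is_dir():
            problems.append(f"video {video_dir.name}: no matching maps folder")
            continue
        frame_names = {p.name for p in video_dir.glob("*.png")}
        map_names = {p.name for p in map_dir.glob("*.png")}
        if not frame_names:
            problems.append(f"video {video_dir.name}: no frames found")
        # Report frames that have no saliency map with the same file name.
        for name in sorted(frame_names - map_names):
            problems.append(
                f"video {video_dir.name}: frame {name} has no saliency map"
            )
    return problems
```

An empty list means the layout looks consistent; otherwise each entry names the video folder and the file that is missing.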
As for the training hyperparameters:

frame_size = (192, 256)

learning_rate = 0.000001

decay_rate = 0.1

momentum = 0.9

weight_decay = 1e-4

clip_length = 10

epochs = 7

batch_size = 1
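To keep these values in one place when adapting the training script, they could be collected in a small config object. This is only a sketch: the class name `TrainConfig` is hypothetical and not taken from the repository, and the field values are copied from the list above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters as quoted in this thread (names are illustrative)."""
    frame_size: tuple = (192, 256)
    learning_rate: float = 0.000001
    decay_rate: float = 0.1
    momentum: float = 0.9
    weight_decay: float = 1e-4
    clip_length: int = 10
    epochs: int = 7
    batch_size: int = 1


config = TrainConfig()

# With batch_size = 1 and clip_length = 10, each training step sees 10 frames.
frames_per_step = config.batch_size * config.clip_length

# asdict() gives a plain dict, convenient for logging the run settings.
settings = asdict(config)
```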
As for alpha, we found that it can be made a learnable parameter, which achieves performance just as good as tuning it as a hyperparameter and saves you the trouble. Note that it has to be trained with a much higher learning rate than the rest of the model (lr=0.1 in this case). Just pass alpha=None and alpha will be learned this way. However, the pretrained model uploaded on GitHub had alpha set to 0.1.
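Training alpha with its own, higher learning rate is usually done by splitting the parameters into groups. The sketch below is an illustration under stated assumptions rather than the repository's code: the helper `build_param_groups` is hypothetical, and it assumes the learnable parameter is registered under the name "alpha". It takes (name, parameter) pairs, such as those yielded by a PyTorch module's `named_parameters()`, and returns a list of dicts in the per-parameter-options format that PyTorch optimizers accept.

```python
def build_param_groups(named_params, base_lr=1e-6, alpha_lr=0.1, alpha_name="alpha"):
    """Split parameters into a base group and an alpha group with its own lr.

    named_params: iterable of (name, parameter) pairs.
    alpha_name: assumed name of the learnable alpha parameter (hypothetical).
    """
    alpha_params, other_params = [], []
    for name, param in named_params:
        # Match a top-level "alpha" or a nested one such as "rnn.alpha".
        if name == alpha_name or name.endswith("." + alpha_name):
            alpha_params.append(param)
        else:
            other_params.append(param)

    groups = [{"params": other_params, "lr": base_lr}]
    if alpha_params:
        # Only add the second group when alpha is actually learnable.
        groups.append({"params": alpha_params, "lr": alpha_lr})
    return groups
```

With PyTorch installed, the result could be passed to an optimizer, for example `torch.optim.SGD(build_param_groups(model.named_parameters()), lr=1e-6, momentum=0.9, weight_decay=1e-4)`. The choice of SGD here is an assumption based on the momentum value quoted above.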
@Linardos Hi, I want to train the model on a custom dataset. Could you describe the steps involved, along with details such as the structure and format of the training input and annotation data and the training parameters you used?
Thank you.