Open nisargshah1999 opened 2 years ago
The reason I am asking is that, on my dataset, I get around 88% test-phase accuracy using ResNet, whereas with TeCNO (and Trans-SVNet too) the accuracy falls to 35%. It would be great if you could advise me on this. Thanks
You should first re-implement TeCNO, or another temporal method, and get good results with it, so that you have usable temporal embeddings for the Transformer.
OK, cool, thank you very much for your help. Could you tell me the training time of your TeCNO code? It would also be great if you could give the training input shape for the TeCNO part; for me, it is (batch_size, 2048, number_of_frames_videos). I am not sure how the sequence parameter used in generate_lfb.py comes into play here.
Training both TeCNO and Trans-SVNet is very fast given the spatial embeddings, maybe several minutes. For convenience, the input of TeCNO during training is a whole video, as it does not use future information. I only used generate_lfb.py to generate spatial embeddings, although it could also generate spatio-temporal embeddings.
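To illustrate why a whole video can be fed in at once without leaking future frames, here is a rough pure-Python sketch. It is not the code from this repository. It assumes the "no future information" property comes from causal (left-padded) dilated temporal convolutions, and the function name `causal_dilated_conv`, the toy feature size of 4 (standing in for the 2048-dim spatial embeddings), and the random weights are all made up for the example.

```python
import random


def causal_dilated_conv(frames, kernel, dilation=1):
    """Toy causal temporal convolution (hypothetical helper, for illustration only).

    frames: list of T per-frame feature vectors, each of length C.
    kernel: list of K taps; kernel[k] is an OUT x C weight matrix applied to
            frame t - k * dilation, so only the current and past frames are read.
    """
    num_out = len(kernel[0])
    outputs = []
    for t in range(len(frames)):
        out = [0.0] * num_out
        for k, weight in enumerate(kernel):
            src = t - k * dilation
            if src < 0:
                continue  # before the start of the video: acts as left zero-padding
            for o in range(num_out):
                out[o] += sum(w * x for w, x in zip(weight[o], frames[src]))
        outputs.append(out)
    return outputs


random.seed(0)
C, T, OUT, K = 4, 12, 3, 3  # toy sizes; the thread uses C = 2048 and T = frames in the video

# One whole video of per-frame embeddings, and one random layer.
video = [[random.random() for _ in range(C)] for _ in range(T)]
kernel = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(OUT)] for _ in range(K)]

full = causal_dilated_conv(video, kernel, dilation=2)

# Replace every frame from index 8 onward and run the same layer again.
changed = [frame[:] for frame in video]
for t in range(8, T):
    changed[t] = [random.random() for _ in range(C)]
partial = causal_dilated_conv(changed, kernel, dilation=2)

# Outputs for the first 8 frames are identical: they never looked at later frames.
print(full[:8] == partial[:8])
```

Under this assumption, processing the full video in one pass gives the same per-frame outputs as processing it frame by frame online, which is why no fixed sequence length is needed at this stage.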
Hi, thanks for helping me with my previous question. I was able to run the code up to generate_lfb.py. After that, with a sequence length of 10, each epoch of tecno.py and the transformer code takes only 20 seconds on a single GPU. I am not sure whether this is expected, and my accuracy also drops compared to using only the ResNet code. I would be obliged if you could advise on that. Thanks