Open August-en opened 1 year ago
Would this be all right?
ref_pos_embed_list = torch.chunk(lvl_pos_embed_flatten, self.num_ref_frames+1, dim=0)
cur_pos_embed = lvl_pos_embed_flatten[0]
ref_pos_embed = torch.cat(ref_pos_embed_list[1:], 1)
ref_memory = ref_memory + ref_pos_embed
I get this error as well when the batch size is greater than one. When the batch size is one, the error does not appear. The code in the repo does not currently appear to work for an arbitrary batch size.
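For anyone debugging the shapes, here is a minimal sketch of what the `torch.chunk` / `torch.cat` calls in the snippet above produce. It is not the repo's code: the layout (current frame first, then reference frames, stacked along dim 0) and the sizes are assumptions made for illustration, and `split_pos_embed` is a hypothetical helper name.

```python
import torch


def split_pos_embed(lvl_pos_embed_flatten, num_ref_frames):
    """Split stacked positional embeddings into a current part and a reference part.

    Assumption (not confirmed by the repo): dim 0 holds num_ref_frames + 1
    equally sized groups, the first belonging to the current frame.
    """
    chunks = torch.chunk(lvl_pos_embed_flatten, num_ref_frames + 1, dim=0)
    cur_pos_embed = chunks[0]                  # keeps dim 0
    ref_pos_embed = torch.cat(chunks[1:], 1)   # concatenates reference chunks along dim 1
    return cur_pos_embed, ref_pos_embed


# Hypothetical sizes, chosen only to make the shapes easy to read.
num_ref_frames, seq_len, d_model = 2, 5, 8

for batch_size in (1, 2):
    x = torch.randn(batch_size * (num_ref_frames + 1), seq_len, d_model)
    cur, ref = split_pos_embed(x, num_ref_frames)
    # x[0] drops dim 0, so it is one slice regardless of batch size.
    print(batch_size, tuple(cur.shape), tuple(ref.shape), tuple(x[0].shape))
```

Note that indexing with `[0]` always returns a single `(seq_len, d_model)` slice, while the first chunk keeps one entry per group, so the two only describe the same data when dim 0 has exactly `num_ref_frames + 1` entries. Whether the first chunk really corresponds to the current frames for larger batches depends on how the frames are ordered in the batch, which this sketch does not verify.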
Try PR #13 for batch size > 1. I tested it and it works, though I would not recommend it unless you have GPUs with more than 32 GB of memory.
Thank you so much. I will try it as soon as possible.
Have you used the TDTE module in your experiments? I found that the default setting for TDTD in this repo is False (https://github.com/SJTU-LuHe/TransVOD/issues/27#issue-1451857301). Does it make a big difference whether or not it is used?
Thank you again if you can share your experience :)
Yes, I ran experiments with and without TDTE on my own dataset, and the performance was almost the same. What increased performance in my case was adding illumination-variation augmentation and class weights in the loss. I also reproduced the results on ImageNet VID from this page with TDTD set to False, so I don't know whether it is worth adding; it may depend on your dataset.
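To illustrate what "class weights in the loss" can mean, here is a minimal pure-Python sketch of a class-weighted cross-entropy. This is only one common form and an assumption on my part: the comment above does not say which loss or which weights were used, and `weighted_cross_entropy` and the example weights are hypothetical.

```python
import math


def weighted_cross_entropy(logits, targets, class_weights):
    """Class-weighted cross-entropy, averaged by the sum of the applied weights.

    logits: list of per-sample score lists, one score per class
    targets: list of integer class indices
    class_weights: list with one non-negative weight per class
    """
    total, weight_sum = 0.0, 0.0
    for row, target in zip(logits, targets):
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[target]          # negative log-likelihood of the true class
        total += class_weights[target] * nll
        weight_sum += class_weights[target]
    return total / weight_sum


# Hypothetical example: both samples are scored as class 0, but the second is class 1.
logits = [[2.0, 0.0], [2.0, 0.0]]
targets = [0, 1]
print(weighted_cross_entropy(logits, targets, [1.0, 1.0]))
print(weighted_cross_entropy(logits, targets, [1.0, 3.0]))  # rare class 1 weighted up
```

Raising the weight of a class makes mistakes on that class count more toward the loss, which is the usual motivation when a dataset is imbalanced.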
https://github.com/SJTU-LuHe/TransVOD/blob/ef864f81036562799ad9c29440200d9b70165a90/models/deformable_transformer_multi.py#L226-L229