Epiphqny / VisTR

[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers
https://arxiv.org/abs/2011.14503
Apache License 2.0
738 stars, 96 forks

Num_queries hard coded #39

Open anirudh-chakravarthy opened 3 years ago

anirudh-chakravarthy commented 3 years ago

Hi,

Thank you for sharing your work! I'm trying to replicate results on a 12 GB GPU by reducing the num_frames and num_queries parameters. However, I came across the following error:

outputs_seg_masks = outputs_seg_masks.reshape(1,360,outputs_seg_masks.size(-2),outputs_seg_masks.size(-1))
RuntimeError: shape '[1, 360, 75, 76]' is invalid for input of size 1710000

I pinpointed the issue to Line 126, where I think 360 should be replaced with self.vistr.num_queries. Could you correct this in your release?
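
The numbers in the error message are consistent with this reading: 1710000 elements split into 75 x 76 masks gives 300 masks rather than 360, so the tensor presumably holds one mask per query for the reduced setting. Below is a minimal sketch of that arithmetic in plain Python. The helper name `leading_dim` is made up for illustration and is not part of VisTR, and the assumption that num_queries was set to 300 in this run is inferred from the error, not stated in the report.

```python
# Sanity check of the reshape arithmetic from the traceback above.
# Pure Python; `leading_dim` is a hypothetical helper, not VisTR code.

def leading_dim(numel: int, height: int, width: int) -> int:
    """Return how many (height x width) masks fit in `numel` elements."""
    per_mask = height * width
    if numel % per_mask != 0:
        raise ValueError(
            f"{numel} elements do not divide into {height}x{width} masks"
        )
    return numel // per_mask

# Values taken from the error message.
numel, h, w = 1710000, 75, 76

actual_queries = leading_dim(numel, h, w)
expected_by_hardcode = 1 * 360 * h * w

print(actual_queries)        # 300
print(expected_by_hardcode)  # 2052000, what reshape(1, 360, h, w) requires
```

Since 2052000 does not equal 1710000, the hard-coded reshape fails. Replacing 360 with self.vistr.num_queries, as proposed, would make the requested shape follow the configured value.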

Also, can you explain what 24 denotes in Line 115?

Thanks!

elvindp commented 2 years ago

Were you able to get any results after correcting the hard-coded value? I keep getting wrong class and bbox predictions. I would appreciate it if you could help me. Thanks.

Epiphqny commented 1 year ago

@anirudh-chakravarthy Thanks very much for your advice! A long time has passed since the release and I have no spare time to review the code right now, so I will correct it in the future.