wanghao9610 / TMANet

Official implementation of TMANet.
Apache License 2.0

How about the first frame in a video? #23

Closed dreamer121121 closed 2 years ago

dreamer121121 commented 2 years ago

How do I predict the first frame in a video?

wanghao9610 commented 2 years ago

This is not an issue on the Cityscapes and CamVid datasets because they have no labels on the first few frames. If you want to infer the first frame, for simplicity you can copy it multiple times (as many times as the sequence length, i.e. 2 or 4).
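As a rough illustration of the copy-the-first-frame suggestion, here is a minimal sketch that picks the indices of the preceding frames and clamps them at 0, so that frame 0 gets `seq_len` copies of itself. The function name `reference_indices` and the `step` parameter are hypothetical and not part of the TMANet code base; clamping for frames other than the first is an extension of the advice above, and the repository's own data pipeline may handle this differently.

```python
def reference_indices(target_idx, seq_len, step=1):
    """Return indices of the seq_len frames preceding target_idx, oldest first.

    Indices that would fall before the start of the video are clamped to 0,
    which amounts to repeating the first frame. Hypothetical helper, not
    taken from the TMANet repository.
    """
    if target_idx < 0 or seq_len < 1 or step < 1:
        raise ValueError("target_idx must be >= 0, seq_len and step must be >= 1")
    return [max(target_idx - k * step, 0) for k in range(seq_len, 0, -1)]


# First frame: every preceding frame is a copy of frame 0.
print(reference_indices(0, 4))
# A frame far enough into the video uses real preceding frames.
print(reference_indices(5, 2))
```

The frames at these indices can then be loaded and passed as the sequence input in place of real preceding frames.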

dreamer121121 commented 2 years ago

Thanks a lot. I have another question about inference speed: I tested it on a 1080Ti and got about 1 fps. Is that right? And why is the speed so slow?

wanghao9610 commented 2 years ago

I tested an image of shape (769, 769) on Cityscapes with a TITAN XP and averaged over 100 runs: the time per image is about 400~500 ms, so the fps should be about 2. TMANet is a temporal-spatial attention method whose complexity is O(TN^2), so the inference speed is slow as a result. We will try to improve the inference speed in our future work.
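To get a feel for the numbers in this reply, here is a small back-of-the-envelope sketch. It assumes T is the sequence length and N is the number of spatial positions in the feature map, and it assumes the backbone downsamples by a factor of 8; that stride is an assumption for illustration only, and the real feature-map size depends on the model config. The function names are hypothetical.

```python
import math


def attention_pairs(height, width, seq_len, stride=8):
    """Count pairwise affinities for an attention cost of O(T * N^2).

    stride=8 is an assumed downsampling factor, not a value confirmed
    by the TMANet repository.
    """
    # N: number of spatial positions in the downsampled feature map.
    n = math.ceil(height / stride) * math.ceil(width / stride)
    return seq_len * n * n


def fps_from_ms(ms_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


for seq_len in (2, 4):
    print(seq_len, attention_pairs(769, 769, seq_len))

# 400~500 ms per image corresponds to 2.5~2.0 fps.
print(fps_from_ms(400), fps_from_ms(500))
```

Under these assumptions a 769x769 input gives N = 97 * 97 = 9409 positions, so the pairwise term alone is on the order of 10^8 entries per forward pass, and it grows linearly with the sequence length.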