Processing images from video

facebookresearch / vggsfm

[CVPR 2024 Highlight] VGGSfM Visual Geometry Grounded Deep Structure From Motion

Processing images from video #9

Open bhack opened 1 month ago

bhack commented 1 month ago

Is there any specific setting to take advantage of images from video sequences?

jytime commented 1 month ago

Hi @bhack, we are working on this, and it should be available in the next version.

bhack commented 1 month ago

Is that just another planned version, or a new paper as well?

jytime commented 1 month ago

Yes, we are planning to release a new version with (a) a Hugging Face demo that can run on arbitrary input videos and (b) a training script. The new version will also reduce GPU usage by half.

bhack commented 1 month ago

Will you also have an option to mask non-rigid pixels in video mode?

bhack commented 1 month ago

E.g. like the clustering in https://littlepure2333.github.io/GFlow/ or https://www.cis.upenn.edu/~leijh/projects/mosca/

bhack commented 4 days ago

@jytime So is there something available in the new V2?

bhack commented 1 day ago

If we currently cannot automatically cluster/classify rigid and non-rigid points, could you at least support an initial mask over the dynamic parts in the tracker?

jytime commented 1 day ago

Hey,

Currently, we do not have an update on the video processing feature because we are focusing on improving memory efficiency; the memory problem seems to affect many people.

Regarding non-rigid pixels, you have two options:

(a) Avoid using non-rigid points as query points. You can achieve this by passing a mask to the get_query_points() function: https://github.com/facebookresearch/vggsfm/blob/8c47df83eb61071a09b029c784e88431d6ea400e/demo.py#L453

(b) Set all non-rigid points as non-visible, as is done at: https://github.com/facebookresearch/vggsfm/blob/8c47df83eb61071a09b029c784e88431d6ea400e/demo.py#L236

All non-visible points will be ignored during triangulation.
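For readers who want to see the idea behind the two options in isolation, here is a minimal sketch. It is not the vggsfm code and does not call its API: the helper names (`is_dynamic`, `filter_query_points`, `hide_dynamic_tracks`), the data layout, and the convention that a mask value of `True` marks a dynamic (non-rigid) pixel are all assumptions made for illustration. It uses plain Python lists to stay dependency-free; in the real pipeline these would presumably be tensors, and the exact mask argument expected by get_query_points() should be checked in the linked demo.py.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]      # (x, y) in pixel coordinates
Mask = Sequence[Sequence[bool]]  # mask[y][x] is True on dynamic (non-rigid) pixels


def is_dynamic(mask: Mask, point: Point) -> bool:
    """Return True if the point lands on a dynamic pixel.

    Points outside the image are treated as static (an assumption of this sketch).
    """
    x, y = int(round(point[0])), int(round(point[1]))
    if 0 <= y < len(mask) and 0 <= x < len(mask[0]):
        return bool(mask[y][x])
    return False


def filter_query_points(points: Sequence[Point], dynamic_mask: Mask) -> List[Point]:
    """Option (a): drop candidate query points that lie on dynamic pixels."""
    return [p for p in points if not is_dynamic(dynamic_mask, p)]


def hide_dynamic_tracks(
    tracks: Sequence[Sequence[Point]],
    visibility: Sequence[Sequence[bool]],
    dynamic_masks: Sequence[Mask],
) -> List[List[bool]]:
    """Option (b): mark tracked points on dynamic pixels as non-visible.

    tracks[f][n] is the position of point n in frame f, visibility[f][n] is its
    visibility flag, and dynamic_masks[f] is the mask for frame f (hypothetical layout).
    A new visibility table is returned; the inputs are left unchanged.
    """
    return [
        [vis and not is_dynamic(mask, pt) for pt, vis in zip(frame_tracks, frame_vis)]
        for frame_tracks, frame_vis, mask in zip(tracks, visibility, dynamic_masks)
    ]
```

Option (a) acts before tracking, so dynamic regions never seed a track. Option (b) acts after tracking and also catches tracks that start on static pixels and later drift onto a moving object, provided a mask is available for every frame.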