Closed: shanxiaojun closed this issue 1 year ago
Are you talking about the sky masks? In some scenarios they provide a slightly cleaner dynamic/background decomposition, but they're not necessary: none of the results in the paper use them. I only included that code for reference in case people were curious to experiment.
Maybe I misunderstood the code. Could you please explain in detail how you distinguish dynamic objects, like moving cars and people, from the background, so that I can check whether my understanding is correct? Many thanks.
SUDS renders the world using separate branches for static, dynamic, and far-field content (e.g., the sky). The dynamic branch is keyed on position, time, and video id, so content that isn't consistent across time and videos will naturally gravitate towards that branch. We also have some losses that discourage the model from explaining everything with the dynamic branch, or from splitting the same content partially across different branches.
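To make the idea above concrete, here is a minimal NumPy sketch of how a three-branch composition along a single ray could look. It is not the SUDS code: the function names (`composite_ray`, `dynamic_regularizers`), the density-weighted blending, and the exact form of the two regularizers (a binary-entropy term and a dynamic-weight term) are assumptions chosen for illustration.

```python
import numpy as np

def composite_ray(sigma_s, rgb_s, sigma_d, rgb_d, rgb_far, deltas):
    """Blend static and dynamic samples along one ray (hypothetical sketch).

    sigma_s, sigma_d: (N,) densities from the static / dynamic branch
    rgb_s, rgb_d:     (N, 3) colors from the static / dynamic branch
    rgb_far:          (3,) far-field color (e.g. sky) for this ray
    deltas:           (N,) distances between consecutive samples
    """
    sigma = sigma_s + sigma_d
    # Fraction of each sample's density that comes from the dynamic branch.
    w_d = sigma_d / np.maximum(sigma, 1e-8)
    rgb = (1.0 - w_d)[:, None] * rgb_s + w_d[:, None] * rgb_d

    # Standard volume-rendering weights.
    alpha = 1.0 - np.exp(-sigma * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha]))  # length N + 1
    weights = trans[:-1] * alpha

    # Whatever transmittance is left after the last sample goes to the far field.
    color = (weights[:, None] * rgb).sum(axis=0) + trans[-1] * rgb_far
    return color, w_d, weights

def dynamic_regularizers(w_d, weights, eps=1e-8):
    """Two assumed penalties on the static/dynamic split.

    entropy_loss is low when each sample is fully static or fully dynamic;
    dynamic_loss is low when little of the rendered ray is dynamic.
    """
    entropy = -(w_d * np.log(w_d + eps) + (1.0 - w_d) * np.log(1.0 - w_d + eps))
    entropy_loss = float((weights * entropy).sum())
    dynamic_loss = float((weights * w_d).sum())
    return entropy_loss, dynamic_loss
```

In this sketch nothing labels a sample as dynamic. A moving car ends up in the dynamic branch only because that branch also receives time and video id as inputs and can therefore fit content that changes, while the two penalties push everything else back towards the static branch.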
Thanks for your great work! I wonder how you decompose dynamic objects and backgrounds. You claim that you use no GT, but it seems you use GT masks to distinguish dynamic objects from the background.