When you load data, you create a mask per image in the video_collate_fn. It is unclear to me what the purpose of the mask is and what exactly it is used for. Could you clarify that?
IIRC this is related to padding: the mask records which spatial regions of each image are padding and which are real content. I'm not sure it is actually used in practice in the experiments, since they were run with 1 video per batch.
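To illustrate the idea, here is a minimal sketch of how a collate step can pad images to a common size and build such a mask. This is not the repository's video_collate_fn: the helper name `pad_and_mask`, the (C, H, W) layout, zero padding in the bottom/right, and the convention that True marks padded pixels are all assumptions made for the example.

```python
import torch


def pad_and_mask(images):
    """Hypothetical helper: pad (C, H, W) images to a common size.

    Returns the padded batch and a boolean mask per image.
    Assumed convention: True = padded pixel, False = real pixel.
    """
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    channels = images[0].shape[0]

    # Start from an all-zero batch and an all-True (fully padded) mask.
    batch = torch.zeros(len(images), channels, max_h, max_w, dtype=images[0].dtype)
    mask = torch.ones(len(images), max_h, max_w, dtype=torch.bool)

    for i, img in enumerate(images):
        _, h, w = img.shape
        batch[i, :, :h, :w] = img   # copy real pixels into the top-left corner
        mask[i, :h, :w] = False     # mark that region as real content

    return batch, mask


# Two images of different spatial sizes end up in one (2, 3, 5, 6) batch.
images = [torch.rand(3, 4, 6), torch.rand(3, 5, 3)]
batch, mask = pad_and_mask(images)
```

Under these assumptions, the mask only contains True entries when images of different sizes are batched together. If every image in the batch already has the same size, the mask is all False, which would be consistent with it having little effect when there is a single video per batch (assuming all frames of that video share one resolution).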