Yes, it is 10 Hz for the Gen4 (1mpx) dataset. There are two reasons why I chose 10 Hz:
1. The ground truth is sometimes provided at 60 Hz and sometimes at a lower rate, for example 30 Hz. I wanted ground truth at regular intervals. Since inference runs at 20 Hz (one prediction per 50 ms window), 10 Hz was the natural choice: 30 Hz labels cannot be subsampled evenly to 20 Hz, whereas 10 Hz divides 20, 30, and 60 Hz.
2. Ground truth at 10 Hz is already fairly dense and largely redundant. A higher rate might help during training, but it would also slow training down, since some parts of the loss calculation cannot be parallelized.
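The divisibility argument in the first point can be checked with a few lines of arithmetic. This is only an illustrative sketch, not code from the RVT repository; the helper name `highest_common_rate` is hypothetical, and it assumes that "regular intervals" means the label rate must evenly divide the inference rate and every native ground-truth rate.

```python
from functools import reduce
from math import gcd


def highest_common_rate(rates_hz):
    """Largest integer rate (Hz) that evenly divides every rate in rates_hz."""
    return reduce(gcd, rates_hz)


inference_hz = 20           # one prediction per 50 ms window
ground_truth_hz = [60, 30]  # native label rates mentioned in the reply

# 30 Hz labels do not line up with 20 Hz windows: the remainder is non-zero.
print(30 % inference_hz)  # 10

# The highest rate compatible with 20, 30 and 60 Hz at the same time.
label_hz = highest_common_rate([inference_hz, *ground_truth_hz])
print(label_hz)  # 10
```

Under that assumption, 10 Hz is the highest label rate that stays aligned with both the 20 Hz windows and the 30 Hz or 60 Hz annotations.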
Hi @magehrig, thanks for the great work and open-sourcing the code. I had a couple of doubts regarding the preprocessing.
https://github.com/uzh-rpg/RVT/blob/b80f5683a6e2d5de65d4bde8105d796ccb50dbb1/scripts/genx/preprocess_dataset.py#L302
From the above line, I assume the annotation frequency chosen is 10 Hz, right?
Also, your paper mentions:
"To construct event representations, we consider 50 ms time windows that are discretized into T = 10 bins."
Would this effectively mean that each event frame is formed from the events of the previous 5 ms? And is there a reason why the annotation frequency is 10 Hz when the time window rate is 20 Hz?
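For reference, the numbers in the quoted sentence work out as follows. This is a minimal sketch of the arithmetic only: the constants and the `bin_index` helper are assumptions made for illustration, and it does not reproduce how RVT actually accumulates events into its representation.

```python
WINDOW_US = 50_000  # one 50 ms window, in microseconds
NUM_BINS = 10       # T = 10 temporal bins per window

BIN_US = WINDOW_US // NUM_BINS  # duration of one temporal bin


def bin_index(t_us, window_start_us):
    """Temporal bin (0 .. NUM_BINS - 1) of an event inside its window."""
    offset = t_us - window_start_us
    if not 0 <= offset < WINDOW_US:
        raise ValueError("event lies outside the window")
    return offset // BIN_US


print(BIN_US / 1000)          # 5.0 (ms per bin)
print(1_000_000 / WINDOW_US)  # 20.0 (windows per second)
print(bin_index(12_345, 0))   # 2
print(bin_index(49_999, 0))   # 9
```

So each of the 10 bins spans 5 ms, while the representation as a whole spans the full 50 ms window, which corresponds to the 20 Hz rate.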