Lqm26 / BMCNet-ESR

Bilateral Event Mining and Complementary for Event Stream Super-Resolution

Confusion about the proposed event super resolution. #2

Open rfww opened 1 month ago

rfww commented 1 month ago

Thanks for your work. As shown in Sec. 3.1, event data super-resolution includes three steps: (1) event representation, (2) representation super-resolution, and (3) event stream recovery. However, Sec. 3.4 adopts an MSE loss function to measure the difference between two event count images. How do you convert this event representation back into an event stream? If you only upsample the event representation, what is the difference between event super-resolution and image super-resolution?
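To make the quantity being compared concrete, here is a minimal sketch of an event count image and the MSE between two of them. It assumes events are `(x, y, t, p)` tuples with polarity `p` of +1 or -1 and a two-channel (positive/negative) layout; the names `event_count_image` and `mse` are illustrative and are not taken from the BMCNet-ESR code.

```python
def event_count_image(events, height, width):
    """Accumulate events into a 2-channel count image (assumed layout).

    Channel 0 counts positive events, channel 1 counts negative events.
    Timestamps are ignored, so this step discards them.
    """
    image = [[[0] * width for _ in range(height)] for _ in range(2)]
    for x, y, _t, p in events:
        channel = 0 if p > 0 else 1
        image[channel][y][x] += 1
    return image


def mse(image_a, image_b):
    """Mean squared error over all channels and pixels of two count images."""
    total, n = 0.0, 0
    for chan_a, chan_b in zip(image_a, image_b):
        for row_a, row_b in zip(chan_a, chan_b):
            for a, b in zip(row_a, row_b):
                total += (a - b) ** 2
                n += 1
    return total / n


# Hypothetical toy events: (x, y, t, p)
events_a = [(0, 0, 0.001, 1), (1, 0, 0.002, -1), (1, 1, 0.003, 1)]
events_b = [(0, 0, 0.009, 1), (1, 0, 0.004, -1), (0, 1, 0.005, 1)]

img_a = event_count_image(events_a, height=2, width=2)
img_b = event_count_image(events_b, height=2, width=2)
loss = mse(img_a, img_b)
```

Note that the first event in each toy stream has a different timestamp but lands in the same pixel and channel, so it contributes nothing to the loss: the comparison only sees per-pixel counts.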

Lqm26 commented 1 month ago

As in previous work, we can use random resampling to obtain the event stream, but the timestamps of the events cannot be fully recovered. The difference between event super-resolution and image super-resolution can be summarized in three points: (1) The data format is different. Event count images are sparse, which makes traditional image super-resolution methods, such as SRFBN, RLSP, and RSTT used in our comparison, potentially ineffective. (2) Event streams include both positive and negative events, each with unique distribution characteristics. This necessitates the design of specialized modules and networks. (3) Event streams contain noise, and it is essential to mitigate the impact of this noise during super-resolution.

rfww commented 1 month ago

Yeah, the format difference is a significant obstacle to directly adopting any image-based techniques to process event streams. However, if you first convert the streams into frame-like representations and then only upsample those representations, the three points above are not reasonable. My main concern is the lack of the event stream recovery step that you describe in Sec. 3.1.

Lqm26 commented 1 month ago

The resampled event streams cannot be fairly compared due to the loss of the timestamps, so we chose to compare them using event count images. Another method of comparison is to evaluate their performance in downstream tasks such as object recognition and video reconstruction (Sec. 4.4).

rfww commented 1 month ago

Exactly, the lost timestamps cannot be recovered once the event stream has been converted into a representation. That's why I keep asking you about the third step described in Sec. 3.1.

Lqm26 commented 1 month ago

The timestamps can be partially recovered. For example, we can assign a time interval to each event count image and randomly resample events within this interval. In this way, we can generate a timestamp for each event. Although these timestamps are random within an event count image, they preserve the order between event count images.
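The resampling idea can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the paper: count images are nested lists with channel 0 for positive and channel 1 for negative events, every image covers a window of equal length, timestamps are drawn uniformly, and the function name `resample_events` is hypothetical.

```python
import random


def resample_events(count_images, interval, seed=0):
    """Turn a sequence of 2-channel count images back into an event list.

    The k-th image is assigned the window [k * interval, (k + 1) * interval);
    each counted event gets a uniformly random timestamp inside that window.
    """
    rng = random.Random(seed)
    events = []
    for k, image in enumerate(count_images):
        t_start = k * interval
        window = []
        for channel, polarity in ((0, 1), (1, -1)):
            for y, row in enumerate(image[channel]):
                for x, count in enumerate(row):
                    for _ in range(count):
                        t = t_start + rng.random() * interval
                        window.append((x, y, t, polarity))
        window.sort(key=lambda e: e[2])  # order events inside the window
        events.extend(window)
    return events


# Two hypothetical 2x2 count images, indexed as [channel][y][x]
frames = [
    [[[2, 0], [0, 1]], [[0, 1], [0, 0]]],  # window 0: 3 positive, 1 negative
    [[[0, 1], [0, 0]], [[1, 0], [0, 0]]],  # window 1: 1 positive, 1 negative
]
stream = resample_events(frames, interval=0.01)
```

Every event from the first image precedes every event from the second, but the order of events inside one window is arbitrary.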

rfww commented 1 month ago

These recovered timestamps are fake counterparts, which would cause catastrophic results when adopting EST as the event representation method. The upsampled event stream is still not generated.

Lqm26 commented 1 month ago

In practice, you can mitigate this impact by using a small time interval (e.g., 10 ms) and limiting the number of events in each event count image. Additionally, this super-resolution method may be more suitable for tasks that use event count images, such as object recognition, classification, and tracking.
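One way to see why a shorter interval helps: a resampled timestamp falls in the same window as the true one, so the two differ by less than the window length. The sketch below assumes uniform resampling inside fixed-length windows and uses synthetic timestamps; `max_timestamp_error` is a hypothetical helper, not part of the released code.

```python
import random


def max_timestamp_error(true_times, interval, seed=0):
    """Largest |resampled - true| when each event is redrawn uniformly
    inside the fixed-length window that contains its true timestamp."""
    rng = random.Random(seed)
    worst = 0.0
    for t in true_times:
        window_start = (t // interval) * interval
        resampled = window_start + rng.random() * interval
        worst = max(worst, abs(resampled - t))
    return worst


# Hypothetical true timestamps spread over 100 ms
rng = random.Random(1)
true_times = [rng.random() * 0.1 for _ in range(1000)]

error_10ms = max_timestamp_error(true_times, interval=0.010)
error_1ms = max_timestamp_error(true_times, interval=0.001)
```

The error stays below the interval length in both cases, so shrinking the window tightens the bound, at the cost of fewer events per count image.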

rfww commented 1 month ago

Thanks for your quick reply. I think you already understand my main concern from the comments above. The issue is that the problem formulation differs from your implementation; it is not about the applications. In image super-resolution, no one would only upsample the converted tensors without rendering the images, and the same holds for event streams.

Lqm26 commented 1 month ago

Thanks for your question. We attempted to visualize the high-resolution resampled event stream. However, the event stream was too dense to discern the detailed differences. As a result, we opted to use the event count images for quantitative and qualitative comparisons. Therefore, we believe that the formulation in our paper is appropriate.

rfww commented 1 month ago

In your paper, the event count image is not merely the quantitative medium; it is the original input and output sample itself.

Lqm26 commented 1 month ago

The event count images serve as the quantitative medium (see Sec. 3.4), and they are also used as the input and output samples.