uzh-rpg / rpg_e2vid

Code for the paper "High Speed and High Dynamic Range Video with an Event Camera" (T-PAMI, 2019).
http://rpg.ifi.uzh.ch/E2VID.html
GNU General Public License v3.0

Event Camera Dataset Evaluation #19

Open chensong1995 opened 3 years ago

chensong1995 commented 3 years ago

Hello there,

Thank you for your great work! This is a follow-up question to #17. I would really appreciate it if you could provide some further clarification.

  1. I tried to cut the Event Camera Dataset sequences using the timestamps given in that issue. I get:
dynamic_6dof count: 319
boxes_6dof count: 326
poster_6dof count: 341
shapes_6dof count: 340
office_zigzag count: 134
slider_depth count: 39
calibration count: 357

In total, this gives 1856 frames, which differs slightly from the expected 1670 frames. Here is how I count the frames:

import numpy as np
import os
import pandas as pd

seq_config = [
    {"name": "dynamic_6dof",  "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "boxes_6dof",    "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "poster_6dof",   "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "shapes_6dof",   "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "office_zigzag", "config": {"start_time_s": 5.0, "stop_time_s": 12.0}},
    {"name": "slider_depth",  "config": {"start_time_s": 1.0, "stop_time_s": 2.5}},
    {"name": "calibration",   "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
]

for seq in seq_config:
    data_dir = os.path.join('data', 'EventCamera', 'test', seq['name'])

    # images.txt lists one frame per line: "<timestamp> <filename>".
    target_list_name = os.path.join(data_dir, 'images.txt')
    with open(target_list_name) as f:
        target_list = [(float(t), name) for t, name in (line.split() for line in f)]
    target_list.sort()  # It should already be sorted; just to be safe.

    # Only the first event timestamp is needed, so read a single row.
    event_name = os.path.join(data_dir, 'events.txt')
    first_event = pd.read_csv(event_name, delim_whitespace=True, header=None,
                              names=['t', 'x', 'y', 'pol'],
                              dtype={'t': np.float64, 'x': np.int16, 'y': np.int16, 'pol': np.int16},
                              engine='c', nrows=1)
    event_start = first_event['t'][0]

    # Count the frames whose timestamps fall within
    # [start_time_s, stop_time_s], measured relative to the first event.
    count = 0
    for target_t, target_name in target_list:
        if target_t < event_start + seq['config']['start_time_s']:
            continue
        elif target_t > event_start + seq['config']['stop_time_s']:
            break
        count += 1
    print('{} count: {}'.format(seq['name'], count))
  2. What is the expected normalization before computing the MSE? I was planning to scale the pixel values to [0, 1], but would like to confirm that here (see the sketch after this list).
  3. The definition of SSIM has several parameters. I used the defaults from skimage. Is that what you did as well?
  4. For LPIPS, my understanding is that the metric value depends on the pretrained network weights. I used the vgg network from this link. It would be helpful if you could let me know which network you used.
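
To make the questions concrete, here is a minimal sketch of how I am currently computing the three metrics on one pair of frames. The [0, 1] scaling for MSE, the skimage defaults for SSIM, and the 'vgg' backbone for LPIPS are all my assumptions, not confirmed settings from the paper:

import lpips
import numpy as np
import torch
from skimage.metrics import structural_similarity

# Assumed backbone, not confirmed by the authors ('alex' is the library default).
lpips_fn = lpips.LPIPS(net='vgg')

def to_lpips_tensor(img):
    # LPIPS expects N x 3 x H x W tensors in [-1, 1]; a grayscale frame is
    # repeated across the three channels.
    t = torch.from_numpy(img).float() * 2.0 - 1.0
    return t.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1)

def evaluate_pair(recon_u8, gt_u8):
    # Assumed normalization: scale 8-bit grayscale frames to [0, 1] before MSE.
    recon = recon_u8.astype(np.float64) / 255.0
    gt = gt_u8.astype(np.float64) / 255.0
    mse = np.mean((recon - gt) ** 2)
    # skimage SSIM with default parameters; data_range set for [0, 1] images.
    ssim = structural_similarity(recon, gt, data_range=1.0)
    with torch.no_grad():
        lp = lpips_fn(to_lpips_tensor(recon), to_lpips_tensor(gt)).item()
    return mse, ssim, lp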

Thanks in advance!

midofalasol commented 2 years ago

@chensong1995 Hello, I ran into the same problem when computing the evaluation metrics.

When I calculated SSIM the way you described, my results were far from the numbers in the paper. Did you manage to get results similar to those in the paper?

chensong1995 commented 2 years ago

@midofalasol Negative. I think there is definitely some data normalization issue going on here.
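
One hypothetical way to probe this is to match the reconstruction's global mean and standard deviation to the ground truth before computing SSIM; if the gap to the paper's numbers shrinks substantially, a global intensity mismatch is the likely culprit. The helper below is my own sketch (match_stats is a name I made up, not anything from the paper's evaluation code):

import numpy as np

def match_stats(recon, gt, eps=1e-8):
    # Shift and scale the reconstruction so its mean and standard deviation
    # match the ground truth's. This is a crude diagnostic for a global
    # intensity mismatch, not the paper's evaluation protocol.
    recon = (recon - recon.mean()) / (recon.std() + eps)
    return recon * gt.std() + gt.mean()

Comparing SSIM before and after match_stats should show whether normalization alone explains the discrepancy.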

ercanburak commented 1 year ago

For anyone seeking quantitative evaluation: you can use EVREAL, our library for evaluating and analyzing PyTorch-based event-based video reconstruction methods (including E2VID).