KugaMaxx / cuke-emlb

A benchmark for event-based denoising.
MIT License

question about the results of the ESR experiment #5

Closed Bin1119 closed 7 months ago

Bin1119 commented 7 months ago

Hello, I have a question about the results of the ESR experiment. Most MESR results in Table II of your paper are less than 1, but the ND1-ND64 results produced by your benchmark code (for example, with the ynoise method) are almost all greater than 1. What causes this difference in the experimental results?

Bin1119 commented 7 months ago

[Images: 1, 2]

The figures show the average ESR results for ND00-ND64 that I obtained by running ynoise on the EMLB benchmark; they differ substantially from the experimental results in the paper. What went wrong? I want to use your EMLB dataset as a benchmark, but my results are inconsistent with your paper, which confuses me. I look forward to your answer.

KugaMaxx commented 7 months ago

Hi, @Bin1119! Thank you for supporting and using my repository. I have fixed this issue in the latest update. The discrepancy arose because the paper used an initial version of ESR, whereas this repository used a modified version (not yet published), which led to differences in the computed results. I have now replaced ESR in the repository with the version used in the paper. If you still want to try the modified version, you can call it via EventStructuralRatioV2().

However, I would like to emphasize that each scenario in the paper was manually fine-tuned to achieve the best denoising effect. You may therefore be unable to reproduce the paper's ESR results exactly, but in principle your test results should be close to them.

If you have any further questions, please feel free to contact me.

Bin1119 commented 7 months ago

[Images: dwf, knoise, raw, ts, ynoise]

Hi, @KugaMaxx! I used your updated EventStructuralRatio function to recalculate the average ESR on the benchmark, but the results are still quite different from those in the paper. Even when I calculate ESR on the raw data without any denoiser, the result differs from your paper. This suggests that the problem lies not in the denoiser parameters but in the EventStructuralRatio function itself. Below is my eval_benchmark code; you can run it on your computer and check the results. I hope you can take another look at where the metric code goes wrong, so that the EventStructuralRatio function gives the same mean ESR for raw data as in the paper, even without any denoiser.

import argparse
import os
import os.path as osp
from tqdm import tqdm
from tabulate import tabulate
from datetime import timedelta
import numpy as np

# Import dv related package
import dv_processing as dv
import dv_toolkit as kit

# Import project config
from configs import Dataset, Denoisor

# Import evaluation metric
from python.src.utils.metric import EventStructuralRatio, EventStructuralRatioV2

if __name__ == '__main__':
    # Arguments settings
    parser = argparse.ArgumentParser(description='Run E-MLB benchmark.')
    parser.add_argument('-i', '--input_path',  type=str, default='/mnt/d/datasets/EMLB', help='path to load dataset')
    parser.add_argument('-o', '--output_path', type=str, default='./results/ynoise', help='path to output dataset')
    parser.add_argument('--denoisor', type=str, default='dwf', help='choose a denoisor')
    parser.add_argument('--store_result', action='store_true', help='whether to store denoising result')
    parser.add_argument('--store_score', action='store_true', help='whether to store evaluation score')
    args = parser.parse_args()

    # Recursively load dataset
    datasets = Dataset(args.input_path)
    for i, dataset in enumerate(datasets):

        # Initialize on-screen info
        pbar = tqdm(dataset)
        table_header, table_data = ['Sequence', 'ESR Score'], list()

        class_scores = {}

        for sequence in pbar:
            # Parse sequence info
            fpath, fclass, fname = sequence
            fname, fext = osp.splitext(fname)
            # Class label: second-to-last '-'-separated token of the file name
            cla = fname.split('.')[0].split('-')[-2]
            fdata = fpath.split('/')[-1]

            # Print progress bar info
            pbar.set_description(f"#Denoisor: {args.denoisor:>7s},  " +
                                 f"#Dataset: {fdata:>10s} ({i+1}/{len(datasets)}),  " +
                                 f"#Sequence: {fname:>10s}")

            # Load noisy file
            reader = kit.io.MonoCameraReader(f"{fpath}/{fclass}/{fname}{fext}")

            # Get Offline data
            data = reader.loadData()

            # Get resolution
            resolution = reader.getResolution("events")

            # Register event structural ratio
            metric = EventStructuralRatio(resolution)

            # Register denoisor
            model = Denoisor(args.denoisor, resolution)

            # Receive noise sequence
            model.accept(data["events"])

            # Perform event denoising
            data["events"] = model.generateEvents()

            # Store denoising results
            if args.store_result:
                output_path = f"{args.output_path}/{args.denoisor}/{fdata}/{fclass}"
                output_file = f"{output_path}/{fname}{fext}"
                if not osp.exists(output_path): os.makedirs(output_path)

                # Writing
                writer = kit.io.MonoCameraWriter(output_file, resolution)
                writer.writeData(data)

            # Evaluate ESR once and reuse it for the table and the class average
            score = metric.evalEventStorePerNumber(data["events"].toEventStore())
            mean_score = np.nanmean(score)

            # Store evaluation metric
            if args.store_score and not np.isnan(mean_score):
                table_data.append((fname, mean_score))

            # Accumulate per-class scores (NaN results are skipped)
            if not np.isnan(mean_score):
                class_scores.setdefault(cla, []).append(mean_score)

        # Print ESR score
        if len(table_data) != 0:
            print(tabulate(table_data, headers=table_header, tablefmt='grid'))

        for cla, scores in class_scores.items():
            if len(scores) > 0:
                class_mean = sum(scores) / len(scores)
                print(f"{cla} Mean Score: {class_mean}")
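
The per-class averaging at the end of the script can be checked in isolation, without the dataset or the dv packages. The following is a minimal sketch using only the standard library; the sequence names and scores are hypothetical and chosen purely for illustration, and `class_of` and `per_class_mean` are helper names introduced here, not functions from the repository.

```python
import math
from collections import defaultdict


def class_of(fname):
    """Return the second-to-last '-'-separated token of the file stem."""
    stem = fname.split('.')[0]
    return stem.split('-')[-2]


def per_class_mean(scores):
    """Average per-sequence scores by class, skipping NaN entries."""
    grouped = defaultdict(list)
    for fname, score in scores:
        if not math.isnan(score):
            grouped[class_of(fname)].append(score)
    return {cla: sum(vals) / len(vals) for cla, vals in grouped.items()}


# Hypothetical sequence names and scores, for illustration only.
example = [
    ("Scene-ND00-1.aedat4", 1.0),
    ("Scene-ND00-2.aedat4", 2.0),
    ("Scene-ND64-1.aedat4", 0.5),
    ("Scene-ND64-2.aedat4", float("nan")),
]
result = per_class_mean(example)
for cla, mean in result.items():
    print(f"{cla} Mean Score: {mean}")
```

Because NaN scores are dropped before averaging, a class whose sequences all yield NaN does not appear in the output at all, which matches the behaviour of the script above.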

Looking forward to your reply, thank you very much~

KugaMaxx commented 6 months ago

Hi, @Bin1119! I checked and found that the low ESR result was caused by a median filter that I had added to suppress hot pixels; it has now been removed. Please try again. Good luck.

[Image: screen_shot]
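
To see why such a filter can shift a count-based score, here is a minimal sketch. It assumes the filter mentioned above is a standard 3x3 median filter applied to a per-pixel event-count image; that is an assumption, and this is not the repository's implementation. The function name `median_filter_3x3` and the toy count map are hypothetical.

```python
from statistics import median


def median_filter_3x3(grid):
    """Apply a 3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                grid[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


# Toy per-pixel event counts with one hot pixel (hypothetical values).
counts = [
    [1, 1, 1, 1],
    [1, 90, 1, 1],
    [1, 1, 1, 1],
]
filtered = median_filter_3x3(counts)

total_before = sum(map(sum, counts))
total_after = sum(map(sum, filtered))
print(total_before, total_after)
```

In this toy case the single hot pixel carries most of the events, so removing it changes the total count dramatically. Any statistic computed from the count image would change accordingly, which is consistent with raw-data scores differing while the filter was in place.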