TonyHongtaoWu / RainMamba

[ACM MM'24 Oral] RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining
MIT License

Validation Process #5

Closed: xtares13 closed this issue 4 days ago

xtares13 commented 1 week ago

Hello, thank you for your awesome work! When I was training RainMamba, I found that there is no validation process and no validation-set settings in the config file. I would like to see how PSNR and SSIM change during training. How should I modify the code?

TonyHongtaoWu commented 1 week ago

Thank you for following our work. I did not run validation during training, but you can add the following code to the config file to enable it. The code below is adapted from mmediting:

  1. Add the following code between the train and test entries of the data dictionary.
val=dict(
        type='SRFolderMultipleGTDataset',
        lq_folder="../data/VRDS/test/lq",
        gt_folder="../data/VRDS/test/gt",
        pipeline=[
            dict(
                type='GenerateSegmentIndices',
                start_idx=0,
                interval_list=[1],
                filename_tmpl='{:08d}.png'),
            dict(
                type='LoadImageFromFileList',
                io_backend='disk',
                key='lq',
                channel_order='rgb'),
            dict(
                type='LoadImageFromFileList',
                io_backend='disk',
                key='gt',
                channel_order='rgb'),
            dict(type='RescaleToZeroOne', keys=['lq', 'gt']),
            dict(type='FramesToTensor', keys=['lq', 'gt']),
            dict(
                type='Collect',
                keys=['lq', 'gt'],
                meta_keys=['lq_path', 'gt_path', 'key'])
        ],
        scale=1,
        test_mode=True),
  2. You also need to add
evaluation = dict(interval=5000, save_image=False, gpu_collect=True)

after checkpoint_config = dict(interval=5000, save_optimizer=True, by_epoch=False)
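
For reference, here is a rough sketch of how these pieces typically sit alongside the dataloader settings in an mmediting 0.x style config. The val_dataloader / test_dataloader key names follow the standard BasicVSR-style configs, and the batch-size numbers are placeholders, so keep the values already in the RainMamba config:

data = dict(
    workers_per_gpu=4,                       # placeholder: keep the existing value
    train_dataloader=dict(samples_per_gpu=2, drop_last=True),  # placeholder
    val_dataloader=dict(samples_per_gpu=1),  # batch size 1 keeps validation memory low
    test_dataloader=dict(samples_per_gpu=1),
    # train=dict(...), val=dict(...), test=dict(...) stay as already defined,
    # with the val=dict(...) block from step 1 inserted between train and test
)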

xtares13 commented 1 week ago

Thank you. I selected 10 videos from the VRDS test set as the validation set and added them according to your instructions. After 5000 training iterations the validation process started, but the progress bar got stuck here and memory usage was very high. Do you know what's going on? I used two 24GB RTX 3090 GPUs.

TonyHongtaoWu commented 1 week ago

Sorry, I've been a bit busy lately. I will debug this issue in a few days; for now, you can evaluate the released checkpoints to obtain validation metrics. You can also reach me at hwu375@connect.hkust-gz.edu.cn for further discussion.

xtares13 commented 5 days ago

When I tested on VRDS, the process's memory usage was normal at first, but as more video clips were processed, memory usage kept increasing. By the time about 30 videos' results had been collected, memory was exhausted and the process was killed, so I could not see my metric results. After debugging, I found that memory usage keeps increasing while the test_clip function is running. Is this normal? I did not change samples_per_gpu or workers_per_gpu in VRDS.py.
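
For anyone who wants to reproduce this, an illustrative snippet like the one below (not part of the repo) can log host RSS and GPU memory after each processed clip:

import psutil
import torch

proc = psutil.Process()

def log_memory(clip_idx):
    # host (CPU) resident set size and currently allocated CUDA memory, in GB
    rss_gb = proc.memory_info().rss / 1024 ** 3
    gpu_gb = torch.cuda.memory_allocated() / 1024 ** 3
    print(f'clip {clip_idx}: host RSS {rss_gb:.2f} GB, GPU allocated {gpu_gb:.2f} GB')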

TonyHongtaoWu commented 5 days ago

This may be because some inference-related components were accidentally deleted while organizing the code; I will fix them later. The issue does not affect the output results, and you can use this code to test the metrics.
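
In the meantime, if the growth comes from per-clip outputs being kept on the GPU or accumulated with autograd state (an assumption about the likely cause, not a confirmed diagnosis), a generic pattern like the sketch below, which is not the repository's actual test_clip implementation, usually keeps clip-wise inference memory flat:

import torch

@torch.no_grad()
def derain_clips(model, clips):
    """clips: iterable of (T, C, H, W) CPU tensors; yields derained CPU tensors."""
    model.eval()
    for lq in clips:
        out = model(lq.cuda(non_blocking=True))  # hypothetical forward call
        yield out.detach().cpu()                 # move results off the GPU right away
        del out
        torch.cuda.empty_cache()                 # optional: release cached blocks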

xtares13 commented 4 days ago

OK, thank you so much!