AndersonCotrim / SBFBurst

SBFBurst: This is the official implementation of VISAPP 2024 "Simple Base Frame Guided Residual Network for RAW Burst Image Super-Resolution".

Your code crashes in synthetic training #1

Open nonick2k23 opened 3 weeks ago

nonick2k23 commented 3 weeks ago

You have fallback code that retrains an epoch if any batch fails due to NaNs in the PSNR.

However, it still doesn't work properly (setting aside that this fallback is not an optimal fix).
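A more robust alternative to the epoch-level fallback would be a per-batch guard: check the loss for NaN/Inf before backpropagating and skip the optimizer step for that batch. This is a minimal, framework-agnostic sketch of the idea; the `compute_loss` and `step_fn` callables are hypothetical stand-ins, not functions from the SBFBurst code:

```python
import math

def train_epoch(batches, compute_loss, step_fn):
    """Run one epoch, skipping any batch whose loss is non-finite.

    compute_loss(batch) -> float loss value (hypothetical)
    step_fn(loss)       -> backward pass + optimizer step (hypothetical)
    Returns the number of skipped batches.
    """
    skipped = 0
    for batch in batches:
        loss = compute_loss(batch)
        if not math.isfinite(loss):  # guard against NaN/Inf losses
            skipped += 1
            continue                 # no backward/step for this batch
        step_fn(loss)
    return skipped
```

In a PyTorch loop the same check would be `torch.isfinite(loss)` on the loss tensor before calling `loss.backward()`, so a single bad batch is dropped instead of restarting the whole epoch.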

This is what I receive during training synthetic phase:

[train: 2, 550 / 1000] FPS: 23.2 (23.5) , Loss/total: nan , Loss/rgb: nan , Loss/raw/rgb: nan , Stat/psnr: 31.68984

This seems to bypass your fallback somehow, which means training cannot proceed.

Could you take a look or provide a solution for this issue?

Thank you

AndersonCotrim commented 3 weeks ago

Hi, you're right: the fallback is useless in this scenario, so you can disable it. I've noticed this issue occasionally in my own experiments, and I suspect it's caused by unstable predictions from Spynet in the first epochs. Did you modify any parameters in the training script, or the synthetic generation code? Using a smaller initial learning rate might help resolve the problem.
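On the smaller-initial-learning-rate suggestion: a common related trick is a short linear warmup, so the optical-flow estimator (Spynet) sees small updates until its predictions stabilize. A minimal sketch; the base rate of 1e-4 and the 500-step warmup are illustrative assumptions, not values from the SBFBurst config:

```python
def warmup_lr(step, base_lr, warmup_steps):
    """Linearly scale the learning rate from 0 up to base_lr over warmup_steps,
    then hold it constant at base_lr."""
    return base_lr * min(1.0, step / warmup_steps)

# Illustrative usage: at step 250 of a 500-step warmup with base_lr=1e-4,
# the effective learning rate is half the base rate.
lr = warmup_lr(250, 1e-4, 500)
```

In PyTorch this can be wired up with `torch.optim.lr_scheduler.LambdaLR`, passing a lambda that returns the warmup factor for the current step.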