Short answer: Simply because our dataset was in uint8 format.
Long answer: This essentially boils down to training on LDR (uint8) or HDR (float16/32) data. Even though HDR data is objectively better due to its higher precision and dynamic range, it may actually perform worse in training because of the much larger domain the network needs to cover, especially with regard to generalization. For that reason, HDR input is usually mapped to uint8 (tonemapped) before applying the model, and the output is later re-mapped to HDR using the inverse mapping. So the model basically works with LDR data anyway, no matter the input data type. If you want to experiment with float32 data, however, go for it!
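To make the tonemap/inverse-tonemap round trip concrete, here is a minimal sketch (not part of this repo) using a simple log curve with NumPy; the function names and the `peak` normalization are just illustrative assumptions:

```python
import numpy as np

def log_tonemap(hdr, eps=1e-6):
    """Map linear HDR radiance to [0, 1] with a simple log curve."""
    peak = hdr.max() + eps
    return np.log1p(hdr) / np.log1p(peak), peak

def inverse_log_tonemap(ldr, peak):
    """Undo the log mapping to recover HDR radiance."""
    return np.expm1(ldr * np.log1p(peak))

# Placeholder HDR frame in linear radiance (float32), e.g. loaded from an EXR
hdr = np.random.rand(256, 256, 3).astype(np.float32) * 50.0

ldr, peak = log_tonemap(hdr)                      # float in [0, 1]
ldr_uint8 = (ldr * 255.0 + 0.5).astype(np.uint8)  # what the LDR model sees

# ... run the uint8/LDR model here, then map its output back to HDR ...
restored = inverse_log_tonemap(ldr_uint8.astype(np.float32) / 255.0, peak)
```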
You're welcome! In my experiments, float16 data (also supported by hdf5) is the best middle ground for HDR data regarding storage space/training time vs. dataset quality. I cannot tell you whether HDR training is necessary for your use-case, though. If it is, I would recommend adding some sort of tonemapping to map your HDR input to [0,1] and re-mapping the output back to HDR afterwards, for example a simple log mapping. That is currently not supported in the code and you will need to add it yourself, since we worked with LDR data only. It probably also helps with general quality and image sharpness, especially when using L2 losses, since the model tends to resort to blur if it can't fully resolve the noise. Experimenting with different loss functions may help, too. Best of luck with your project!
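For the storage side, a rough sketch of keeping an HDR dataset as float16 in HDF5 with h5py; the file and dataset names here are hypothetical, and the chunk/compression settings are only one reasonable choice:

```python
import h5py
import numpy as np

# Placeholder HDR frames (float32 linear radiance); replace with your own data.
frames = np.random.rand(16, 256, 256, 3).astype(np.float32) * 50.0

with h5py.File("hdr_dataset.h5", "w") as f:
    # Storing as float16 roughly halves the size compared to float32 while
    # keeping enough dynamic range for most HDR training data.
    f.create_dataset("frames", data=frames.astype(np.float16),
                     chunks=(1, 256, 256, 3), compression="gzip")

with h5py.File("hdr_dataset.h5", "r") as f:
    # Cast back to float32 before tonemapping / feeding the network.
    batch = f["frames"][:4].astype(np.float32)
```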
Thank you so much for your suggestions! I really admire your work and have read many of your articles, including the latest ones. I will do my best, and thank you again.
Why does the model's dataset use the uint8 type instead of high-precision float32 data for training? Wouldn't training on high-dynamic-range data produce results closer to the real world?