Short answer: Simply because our dataset was in uint8 format.
Long answer: This essentially boils down to training on LDR (uint8) or HDR (float16/32) data. Even though HDR data is objectively better due to its higher precision and dynamic range, it may actually perform worse in training because of the much larger domain the network needs to cover, especially with regard to generalization. For that reason, HDR input is usually mapped to uint8 (tonemapped) before applying the model, and the output is later re-mapped to HDR using the inverse mapping. So the model basically works with LDR data anyway, no matter the input data type. If you want to experiment with float32 data, however, go for it!
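To make the tonemap/inverse-tonemap round trip concrete, here is a minimal sketch (not part of this repo) using a simple log curve with NumPy; the function names and the `peak` normalization are just illustrative assumptions:

```python
import numpy as np

def log_tonemap(hdr, eps=1e-6):
    """Map linear HDR radiance to [0, 1] with a simple log curve."""
    peak = hdr.max() + eps
    return np.log1p(hdr) / np.log1p(peak), peak

def inverse_log_tonemap(ldr, peak):
    """Undo the log mapping to recover HDR radiance."""
    return np.expm1(ldr * np.log1p(peak))

# Placeholder HDR frame in linear radiance (float32), e.g. loaded from an EXR
hdr = np.random.rand(256, 256, 3).astype(np.float32) * 50.0

ldr, peak = log_tonemap(hdr)                      # float in [0, 1]
ldr_uint8 = (ldr * 255.0 + 0.5).astype(np.uint8)  # what the LDR model sees

# ... run the uint8/LDR model here, then map its output back to HDR ...
restored = inverse_log_tonemap(ldr_uint8.astype(np.float32) / 255.0, peak)
```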
You're welcome! In my experiments, float16 data (also supported by hdf5) is the best middle ground for HDR data regarding storage space/training time vs. dataset quality. I cannot tell you whether HDR training is necessary for your use-case, though. If it is, I would recommend adding some sort of tonemapping to map your HDR input to [0,1] and re-mapping the output back to HDR afterwards, for example a simple log mapping. That is currently not supported in the code and you will need to add it yourself, since we worked with LDR data only. It probably also helps with general quality and image sharpness, especially when using L2 losses, since the model tends to resort to blur if it can't fully resolve the noise. Experimenting with different loss functions may help, too. Best of luck with your project!
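For the storage side, a rough sketch of keeping an HDR dataset as float16 in HDF5 with h5py; the file and dataset names here are hypothetical, and the chunk/compression settings are only one reasonable choice:

```python
import h5py
import numpy as np

# Placeholder HDR frames (float32 linear radiance); replace with your own data.
frames = np.random.rand(16, 256, 256, 3).astype(np.float32) * 50.0

with h5py.File("hdr_dataset.h5", "w") as f:
    # Storing as float16 roughly halves the size compared to float32 while
    # keeping enough dynamic range for most HDR training data.
    f.create_dataset("frames", data=frames.astype(np.float16),
                     chunks=(1, 256, 256, 3), compression="gzip")

with h5py.File("hdr_dataset.h5", "r") as f:
    # Cast back to float32 before tonemapping / feeding the network.
    batch = f["frames"][:4].astype(np.float32)
```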
Thank you so much for your suggestions! I really admire your work and have read many of your articles, including the latest ones. I will do my best, and thank you again.
Why does the model's dataset use the uint8 type instead of high-precision float32 data for training? Wouldn't training on high-dynamic-range data produce results closer to the real world?