junjun-jiang / BaMBNet

The official repository for BaMBNet

About first step COCMap training? #2

Closed pingjun18-li closed 3 years ago

pingjun18-li commented 3 years ago

Thanks again for the project! I used the config you provided to train the COC map for 20 epochs on the Canon dataset, preprocessed with 'image_to_patch_filter.py'. The inference results are shown in Figure 1: every pixel looks black, with values > 1000. I have several questions:

1. The pixel values of the image are uint16, and the sub-region effect described in the paper is not visible. Is my training wrong, or do you get the same result?
2. Why are there three channels rather than a single channel, and what do the pixel values in each result map mean?

In addition, I modified the config to "niter: 500000 epoch: 300 #20". After training to about epoch 45 (70,000 iterations), the unsupervised loss became 0. The test results are shown in Figure 2. Is that normal?

junjun-jiang commented 3 years ago

Thanks for your attention. Please follow the settings in our paper to get normal results. The COC map values are limited to the range -25 to 25 during training (see the paper for more details).

I don't understand your question; please describe it in more detail.
The COC maps show the blur amount of the image. (You can use a single channel for smoother results; we chose three channels because R, G and B have different wavelengths.)

I think these results are far from what is expected.
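
For readers who hit the all-black output from the first comment: a signed COC map usually has to be rescaled before an image viewer shows anything useful. The sketch below is a minimal illustration, not code from the repository. It assumes the map has already been loaded as floats in the [-25, 25] training range mentioned above (how the repository stores it on disk is not stated in this thread), and the names `coc_to_gray` and `coc_map_to_gray` are hypothetical. It averages the three channels into one and maps the result to 8-bit gray. Plain lists are used so that it runs without dependencies.

```python
COC_LIMIT = 25.0  # training range stated in the thread: COC values lie in [-25, 25]


def coc_to_gray(value, limit=COC_LIMIT):
    """Map a signed COC value in [-limit, limit] to an 8-bit gray level (0..255)."""
    clipped = max(-limit, min(limit, value))  # guard against out-of-range values
    return round((clipped + limit) / (2 * limit) * 255)


def coc_map_to_gray(coc_map, limit=COC_LIMIT):
    """Convert an H x W x 3 nested list of COC values to an H x W gray image.

    The three channels are averaged into one value per pixel, then rescaled
    so that -limit becomes black and +limit becomes white.
    """
    return [
        [coc_to_gray(sum(pixel) / len(pixel), limit) for pixel in row]
        for row in coc_map
    ]


# Hypothetical 1 x 3 map: strongly negative, zero, and strongly positive COC.
demo = [[(-25.0, -25.0, -25.0), (0.0, 0.0, 0.0), (25.0, 24.0, 26.0)]]
gray = coc_map_to_gray(demo)
print(gray)
```

With this mapping, a zero COC value lands at mid-gray, so in-focus and defocused regions become distinguishable by eye.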

yarqian commented 3 years ago

@pingjun18-li I got the same result as in Figure 1. Have you solved the problem?

junjun-jiang commented 3 years ago

Please attach the training logs (e.g. the TensorBoard log and the training log) so that I can better understand your problem.

yarqian commented 3 years ago

train_COCMAP_PARAM_TRAIN_210706-115411.log Here is the training log. I just followed the original settings.

junjun-jiang commented 3 years ago

Your training log is quite different from mine at the beginning. Normally, the initial loss is approximately 10.

models/kernel_de_bparam_net.py Did you uncomment it?

yarqian commented 3 years ago

Yes, but the original setting is from -25 to 25. And I will retrain the model soon.

junjun-jiang commented 3 years ago

The setting (-25, 25) can cause training to crash in some cases. Changing it to (-24, 24) is a simple workaround. Alternatively, you can modify the loss function code to avoid the boundary exception.
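
The thread does not show the loss code or say what exactly the boundary exception is, so the following is only a generic guard of the kind such a change might use, not the authors' fix. It keeps each predicted value strictly inside the open interval (-25, 25) before it reaches the loss. The function name `clamp_open` and the `margin` value are assumptions made for illustration.

```python
def clamp_open(value, low=-25.0, high=25.0, margin=1e-3):
    """Clamp value into [low + margin, high - margin].

    The result is always strictly inside (low, high), so code that
    misbehaves exactly at the endpoints never sees them.
    """
    return max(low + margin, min(high - margin, value))


# Hypothetical raw COC predictions, including endpoint and out-of-range values.
preds = [-30.0, -25.0, 0.5, 25.0, 40.0]
safe = [clamp_open(p) for p in preds]
print(safe)
```

In a PyTorch training loop the same idea can be expressed on whole tensors with `torch.clamp(x, min=..., max=...)`.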

yarqian commented 3 years ago

Changing it to (-24, 24) doesn't work.

junjun-jiang commented 3 years ago

If you still get a very poor initial loss value at the beginning, I suggest running the repo on a different device (e.g. an NVIDIA 3090 GPU).

yarqian commented 3 years ago

Do you mean the loss is correlated with the type of device?

junjun-jiang commented 3 years ago

What I mean is that the initial parameters of your network are poor, so the network's initial solution is too poor to be optimized from the start. In my experimental setup, the GPU we used was an NVIDIA 3090.

yarqian commented 3 years ago

Would you mind uploading the pretrained model for COC map estimation?

junjun-jiang commented 3 years ago

Leave your email, I will provide you with an initial model with proper parameters.

yarqian commented 3 years ago

Many thanks and my mail address is yarqian24@gmail.com.