MiZhenxing / GBi-Net

Codes for GBi-Net (CVPR2022)
https://mizhenxing.github.io/gbinet
MIT License

"forward_one_depth" and "forward_all_depth" #13

Open ly27253 opened 11 months ago

ly27253 commented 11 months ago

First of all, thank you very much for your selfless sharing, this is an outstanding work.

Could you please explain in detail the difference between "forward_one_depth" and "forward_all_depth"? And if I want to use "forward_all_depth", what do I need to change in the code? In other words, is it possible to do it in the config file?

I'd appreciate it if you could reply!

MiZhenxing commented 11 months ago

Hi, thank you for your question. forward_one_depth computes the depth map of only one binary tree depth. forward_all_depth computes the depth maps of all binary tree depths and produces the final depth map. The choice between forward_one_depth and forward_all_depth is controlled by the mode argument of the forward function of GBiNet at this line: https://github.com/MiZhenxing/GBi-Net/blob/28a109a3a569311bd4e653000f08ccb64e925b0e/models/gbinet.py#L230

If you want to use forward_one_depth, set mode to "one"; otherwise, set it to "all".
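As an aside, the dispatch described above can be sketched like this. This is a hypothetical toy, not the actual GBi-Net code: the class, return keys, and string payloads are illustrative assumptions; only the idea of branching on `mode` comes from the discussion.

```python
# Hypothetical sketch (NOT the real GBi-Net model): a forward method
# that dispatches on `mode` between single-depth and all-depth inference.

class ToyGBiNet:
    def forward_one_depth(self, data, tree_depth):
        # Compute the depth map for a single binary tree depth.
        return {"pred_depth": f"depth map at tree depth {tree_depth}"}

    def forward_all_depth(self, data, max_tree_depth):
        # Iterate over all binary tree depths and keep the final result.
        out = None
        for d in range(1, max_tree_depth + 1):
            out = self.forward_one_depth(data, d)
        return {"final_depth": out["pred_depth"]}

    def forward(self, data, max_tree_depth, mode="one", tree_depth=1):
        if mode == "one":
            return self.forward_one_depth(data, tree_depth)
        return self.forward_all_depth(data, max_tree_depth)
```

Note that in this sketch the two branches return dicts with different keys, which foreshadows the KeyError discussed later in this thread.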

We use mode="one" in training: https://github.com/MiZhenxing/GBi-Net/blob/28a109a3a569311bd4e653000f08ccb64e925b0e/train_gbinet.py#L66

And we use mode="all" in testing: https://github.com/MiZhenxing/GBi-Net/blob/28a109a3a569311bd4e653000f08ccb64e925b0e/test_gbinet.py#L75 These are hard-coded in our code. If you want to control the mode through the config, you will need to modify the code to pass the argument in.
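One minimal way to replace the hard-coded value with a config lookup could look like the following. The `model.forward_mode` key and the `get_forward_mode` helper are assumptions made up for illustration; GBi-Net's shipped YAML configs do not define such a key.

```python
# Sketch: read the forward mode from a config dict with a fallback,
# instead of hard-coding mode="one" / mode="all" in the scripts.
# The "forward_mode" key is a hypothetical addition, not part of GBi-Net.

def get_forward_mode(cfg, default):
    return cfg.get("model", {}).get("forward_mode", default)

train_cfg = {"model": {"forward_mode": "one"}}
test_cfg = {}  # no key set: falls back to the script's default

# Usage: mode = get_forward_mode(cfg, "one"); outputs = model(..., mode=mode)
```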

ly27253 commented 11 months ago

Thank you very much for your professional and patient responses. I have studied your code carefully and made changes in the places you mentioned and the related positions. In the file `train_gbinet.py`, line 65, I changed the call to:

```python
outputs = model(data_batch=sample_cuda, max_tree_depth=max_tree_depth, mode="all", stage_id=stage_id)  # ly27253
```

But with mode="all" the program reports the following error:

```
-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/ly/anaconda3/envs/gbinet/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/GBi-Net/train_gbinet.py", line 734, in train
    tensorboard_logger=tensorboard_logger
  File "/GBi-Net/train_gbinet.py", line 82, in train_model_stage
    preds = outputs["pred_feature"]
KeyError: 'pred_feature'
```

I studied the program carefully and found that when mode="all", i.e. when forward_all_depth is used, the return value does not have the same structure as that of forward_one_depth. I am stuck and do not know how to modify this correctly. Could you check whether you can reproduce this error when you have time?

I am very sorry to bother you again, thank you again for your selflessness and eagerly await your reply!

MiZhenxing commented 11 months ago

Hi, in our code we actually only support mode = "one" in training and mode = "all" in testing. This is related to "Memory-efficient Training" in Section 3.5 and Table 6 of the paper. That is, during training we backpropagate the gradient at each binary tree depth separately to save memory.
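The per-depth backpropagation idea can be sketched with toy tensors. This is an illustration of the general pattern (backward after each step, then detach so the next step starts a fresh graph), under the assumption of a stand-in linear model and loss; it is not the actual GBi-Net training loop.

```python
# Sketch of memory-efficient per-depth training (toy model, not GBi-Net):
# instead of building one autograd graph across all binary tree depths,
# run one depth at a time, backpropagate, and detach before the next depth.
import torch

model = torch.nn.Linear(4, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(2, 4)
hidden = x
for tree_depth in range(4):
    opt.zero_grad()
    out = model(hidden)        # forward pass for this depth only
    loss = out.pow(2).mean()   # stand-in for the per-depth loss
    loss.backward()            # frees this depth's graph after the call
    opt.step()
    hidden = out.detach()      # cut the graph before the next depth
```

Because each `backward()` only covers one depth's graph, peak memory stays roughly constant in the number of tree depths instead of growing with it.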

ly27253 commented 11 months ago

Thank you very much for your professional and patient reply.

But I have one more question and hope to continue the discussion with you. Your project is fascinating, and I would like to understand your innovations in depth.

By debugging the code, I observed that among the features extracted by DeformStageNet, only the 1/8-scale feature map is used; the three larger scales (1/4, 1/2, 1/1) are not. Also, the ground truth (gt_label) used in the loss computation is only at the 1/8 scale. Could you explain how the multi-stage recursion shown in Figure 2 of the paper ("The multi-stage framework of our GBi-Net.") is reflected in the code?

I am very sorry to bother you again and look forward to hearing from you if possible. Finally, I wish you all the best.

MiZhenxing commented 11 months ago

Hi, sorry for the late reply. In Figure 2 you can see an upsample operation from Stage 2 to Stage 3. (Note the terminology: "stage" in the paper refers to the binary tree depth in the code, while "stage" in the code refers to the resolution scale.) As you can see in the config file:

[screenshot of the config file showing the stage/scale settings]

Every two binary tree depths correspond to one "stage" (scale) of the image. Therefore, the first two tree depths use the feature map at scale 1/8, and the remaining tree depths use the scales from 1/4 to 1/1, as defined in the config.
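The mapping just described (two tree depths per feature scale) can be written as a small helper. This is an illustrative sketch under the stated "two depths per stage" grouping; the actual grouping in GBi-Net is set per-depth in the config file, so the function name and parameters here are assumptions.

```python
# Sketch of the depth-to-stage mapping described above: every two
# binary tree depths share one feature scale. The exact grouping in
# GBi-Net comes from the config; this helper is only illustrative.

def stage_for_tree_depth(tree_depth, depths_per_stage=2, num_stages=4):
    # tree_depth is 1-based; stages 0..3 correspond to scales
    # 1/8, 1/4, 1/2, 1/1 respectively.
    stage = (tree_depth - 1) // depths_per_stage
    return min(stage, num_stages - 1)

scales = ["1/8", "1/4", "1/2", "1/1"]
# depths 1-2 -> scale 1/8, depths 3-4 -> 1/4, depths 5-6 -> 1/2, ...
```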

ly27253 commented 11 months ago

Very happy to hear back from you again. Thank you very much for your professionalism and patience in answering my questions. I truly appreciate your selflessness and generosity, and wish you all the best in your future research endeavors!