MingSun-Tse / ASSL

[NeurIPS'21 Spotlight] Aligned Structured Sparsity Learning for Efficient Image Super-Resolution (PyTorch)

Questions about implementation detail #3

yilawu closed this issue 2 years ago

yilawu commented 2 years ago

Hello, I have some questions about the implementation details.

The HR-LR data pairs were generated with the down-sampling code provided in BasicSR. The training data was DF2K (900 DIV2K + 2650 Flickr2K images), and the test data was Set5.

I ran this command to prune the EDSR_16_256 model down to EDSR_16_48. Compared to the command provided by the authors, only the pruning ratio and the save path were modified.

Prune from 256 to 48, pr=0.8125, x2, ASSL

```
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
  --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
  --method ASSL --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
  --same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
  --update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
  --same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_ASSL_0.8125_RGP0.0001_RUL0.5_Pretrain_06011101
```

Results:

- model just finished pruning ---> 33.739 dB
- fine-tuning after one epoch ---> 37.781 dB
- fine-tuning after 756 epochs ---> 37.940 dB
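As a quick sanity check on the numbers in that command (plain arithmetic only; my reading of the regularization flags is an assumption about the schedule, not the repo's documented behavior):

```python
# Pruning ratio: 256 feats pruned down to 48 kept feats.
n_feats, n_kept = 256, 48
pr = 1 - n_kept / n_feats  # = 0.8125, matching --stage_pr [0-1000:0.8125]

# My (possibly wrong) reading of the schedule flags: the penalty grows by
# --reg_granularity_prune every --update_reg_interval iterations until it
# reaches --reg_upper_limit.
reg_granularity_prune = 1e-4
update_reg_interval = 20
reg_upper_limit = 0.5
iters_to_reach_cap = (reg_upper_limit / reg_granularity_prune) * update_reg_interval
print(pr, iters_to_reach_cap)  # 0.8125 100000.0
```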

The result I obtained with the official code (37.940 dB) still shows a gap to the result reported in the paper (38.12 dB). I must have overlooked some details.

I also compared the L1-norm method provided in the code.

Prune from 256 to 48, pr=0.8125, x2, L1

```
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
  --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
  --method L1 --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
  --same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
  --update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
  --same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_L1_0.8125_06011101
```

Results:

- model just finished pruning ---> 13.427 dB
- fine-tuning after one epoch ---> 33.202 dB
- fine-tuning after 756 epochs ---> 37.933 dB

The difference between the results of the L1-norm method and those of ASSL seems negligible at this pruning ratio (256 -> 48).
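(For context, the L1 criterion compared above just ranks filters by weight magnitude. A minimal generic sketch, assuming standard magnitude-based selection; the function name and exact selection logic are mine, not this repo's code:)

```python
import torch

def l1_keep_indices(conv_weight: torch.Tensor, pr: float) -> torch.Tensor:
    """Score each output filter by the L1 norm of its weights and keep the
    top (1 - pr) fraction (generic sketch, not this repo's implementation)."""
    n_filters = conv_weight.shape[0]
    n_keep = int(round(n_filters * (1 - pr)))      # 256 -> 48 at pr = 0.8125
    scores = conv_weight.abs().sum(dim=(1, 2, 3))  # L1 norm per output filter
    return torch.topk(scores, n_keep).indices.sort().values

# Example: a 256-filter 3x3 conv pruned to 48 filters.
w = torch.randn(256, 256, 3, 3)
kept = l1_keep_indices(w, pr=0.8125)
assert kept.numel() == 48
```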

Is there something I missed? Looking forward to your reply! >-<

yilawu commented 2 years ago

I got some guidance from the author on the details, and I will go on with the experiment. Thanks very much!

YouCaiJun98 commented 2 years ago

> I got some guidance from the author on the details, and I will go on with the experiment. Thanks very much!

Hello, I ran into the same issue: the results of ASSL and L1 pruning are quite similar. I wonder how you solved this and whether the ASSL result eventually distinguished itself. Thanks in advance!

yumath commented 1 year ago

> > I got some guidance from the author on the details, and I will go on with the experiment. Thanks very much!
>
> Hello, I ran into the same issue: the results of ASSL and L1 pruning are quite similar. I wonder how you solved this and whether the ASSL result eventually distinguished itself. Thanks in advance!

@wurongyuan @MingSun-Tse Hello, I have the same question. I also find that the results of ASSL and L1 pruning are similar to an EDSR_16_48 model trained from scratch. Is there something wrong in the initialization or the parameter copy?
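For reference, if the small model is initialized from the pruned big model rather than from scratch, the parameter copy should boil down to selecting the surviving channels. A minimal sketch under that assumption (hypothetical helper, not code from this repo):

```python
import torch
import torch.nn as nn

def copy_pruned_conv(big: nn.Conv2d, small: nn.Conv2d,
                     out_idx: torch.Tensor, in_idx: torch.Tensor) -> None:
    """Copy the surviving weights of `big` into `small`.
    out_idx / in_idx are the kept output/input channel indices
    (hypothetical helper, not from this repo)."""
    with torch.no_grad():
        small.weight.copy_(big.weight[out_idx][:, in_idx])
        if big.bias is not None:
            small.bias.copy_(big.bias[out_idx])

# Example: 256 -> 48 channels on a 3x3 conv.
big = nn.Conv2d(256, 256, 3)
small = nn.Conv2d(48, 48, 3)
idx = torch.arange(48)  # placeholder for the indices actually kept by pruning
copy_pruned_conv(big, small, idx, idx)
```

If the fine-tuned result matches a scratch-trained EDSR_16_48, one thing worth verifying is that a copy like this actually runs, i.e. the kept indices from pruning are used for initialization rather than a fresh init.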