Open · kdg1993 opened this issue 1 year ago
I truly agree that lr=0.01 gives better results than lr=0.1. In particular, Swin Transformer did not converge when I used lr=0.1, whereas it did converge with lr=0.01.
However, before making the change, we should reach a conclusion about the purpose of the base config. Our default configs follow each component's general usage, such as Adam(lr=0.0001). From this perspective, I agree with changing the default PESG learning rate to 0.01.
By the way, we set the default learning rate to 0.0001 in config/config.yaml. Do you want to change this as well?
Given the high validation scores when the learning rate was set in the 0.1 to 0.01 range with AUCM x PESG, it is more effective to use 0.01 as the default learning rate instead of the lr=0.0001 used for Adam. Personally, I have found that the appropriate learning rate differs for each optimizer. So I agree with changing the default PESG learning rate to 0.01.👍
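To make the per-optimizer point concrete, here is a minimal sketch of keeping optimizer-specific learning-rate defaults in one place. The helper and key names are hypothetical (not the repo's actual code); only the Adam and PESG values come from this thread:

```python
from typing import Optional

import torch

# Hypothetical per-optimizer defaults; only the Adam (0.0001) and proposed
# PESG (0.01) values come from this discussion.
DEFAULT_LR = {
    "adam": 1e-4,  # general-usage default kept in config/config.yaml
    "pesg": 1e-2,  # proposed default for the AUCM x PESG combination
}

def resolve_lr(optimizer_name: str, override: Optional[float] = None) -> float:
    """Return the optimizer-specific default unless the config overrides it."""
    return DEFAULT_LR[optimizer_name] if override is None else override

# Usage: Adam keeps its usual default, ...
model = torch.nn.Linear(8, 1)
adam = torch.optim.Adam(model.parameters(), lr=resolve_lr("adam"))
# ... while a PESG run (constructed via LibAUC elsewhere) would get 0.01.
pesg_lr = resolve_lr("pesg")
```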
Thanks to both @seoulsky-field and @Hoon-Hoon-Tiger for the fast and valuable feedback.
To be clear, I think leaving the default config's value unchanged is fine unless we select AUCM and PESG as our default loss and optimizer. However, for our common experimental setting when using AUCM x PESG, it would be better to fix lr=0.01 as a shared factor, both to keep the experimental focus consistent and because of the performance issue (plus the Swin convergence issue).
Thus, in conclusion, I suggest running the experiments we report with lr=0.01 when using AUCM x PESG.
@kdg1993 I agree with your opinion. However, there is a problem. We have already run experiments not only on CheXpert but also on MIMIC-CXR, and many of them used AUCM x PESG with lr=0.1.
So we should run some experiments again. Could you re-run them with me?
Absolutely! I will find a free slot and run them.
Thanks. Let's do this after the other experiments (CheXpert, MIMIC, and BRAX) are done. I'll apply lr=0.01 after this discussion.
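As a rough sketch of that re-run plan (the `train.py` override syntax below is hypothetical, not the repo's actual CLI), the experiment grid would look like:

```python
from itertools import product

# Hypothetical experiment grid following the discussion: every dataset that
# was trained with AUCM x PESG at lr=0.1 gets re-run at lr=0.01.
datasets = ["CheXpert", "MIMIC-CXR", "BRAX"]
image_sizes = [224, 512]  # the two resolutions compared in Fig. 2 and Fig. 3
lr = 0.01                 # agreed default for AUCM x PESG

for dataset, size in product(datasets, image_sizes):
    # Print the runs rather than launch them; the flag names are made up.
    print(f"python train.py dataset={dataset} img_size={size} "
          f"loss=AUCM optimizer=PESG optimizer.lr={lr}")
```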
What
Change the default learning rate of the AUCM x PESG combination to 0.01; in detail, the learning rate of PESG.
Why
Experiment result : https://wandb.ai/snuh_interns/kdg_aucm_pesg_lr_test_w_img_size/table?workspace=user-snuh_interns
Fig. 1: Experiments and best validation scores (descending order)
Fig. 2: Validation loss & best validation score curves of the two learning rates, 512-pixel images
Fig. 3: Validation loss & best validation score curves of the two learning rates, 224-pixel images
How
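The issue did not spell out concrete steps here. As one possible sketch, assuming the repo reads its defaults from config/config.yaml (the key names below are hypothetical), the change amounts to:

```python
import yaml

# Sketch only: the actual structure of config/config.yaml may differ.
proposed = yaml.safe_load("""
optimizer:
  name: pesg   # paired with the AUCM loss
  lr: 0.01     # changed from 0.1; Adam keeps its own default of 0.0001
""")
assert proposed["optimizer"]["lr"] == 0.01
```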