idstcv / ZenNAS


Is it only suitable for VCNN network? Does the residual network fit? #25

Closed qq272574497 closed 1 year ago

MingLin-home commented 2 years ago

It works with residual networks. However, only VCNNs come with a strong theoretical guarantee. That is, you can apply the Zen-score to any network, even non-VCNN ones, but you lose the theoretical guarantee, so be careful to verify the results.

qq272574497 commented 2 years ago

> It works with residual networks. However, only VCNNs come with a strong theoretical guarantee. That is, you can apply the Zen-score to any network, even non-VCNN ones, but you lose the theoretical guarantee, so be careful to verify the results.

What are the steps to run my own network structure in your code? For example:

```
simpleNet(
  (layer1): Sequential(
    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): ReLU()
  )
  (layer2): Sequential(
    (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): ReLU()
  )
  (layer3): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): ReLU()
  )
  (dropout): Dropout(p=0.5, inplace=False)
  (fc): Linear(in_features=1024, out_features=10, bias=True)
  (out): Linear(in_features=10, out_features=10, bias=True)
)
```
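For reference, the printed structure above corresponds to a module definition roughly like the following. This is a hypothetical reconstruction from the printout (the class and layer names are taken from it; the forward order is my assumption):

```python
import torch
from torch import nn

class SimpleNet(nn.Module):
    """Reconstruction of the printed simpleNet structure (32x32 RGB input)."""
    def __init__(self, num_classes=10):
        super().__init__()
        def block(c_in, c_out):
            # conv -> BN -> 2x2 max-pool -> ReLU, as in the printout
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm2d(c_out),
                nn.MaxPool2d(kernel_size=2, stride=2),
                nn.ReLU(),
            )
        self.layer1 = block(3, 16)
        self.layer2 = block(16, 32)
        self.layer3 = block(32, 64)
        self.dropout = nn.Dropout(p=0.5)
        self.fc = nn.Linear(1024, 10)  # 64 channels * 4 * 4 after three pools
        self.out = nn.Linear(10, num_classes)

    def forward(self, x):
        x = self.layer3(self.layer2(self.layer1(x)))
        x = self.dropout(torch.flatten(x, 1))
        return self.out(self.fc(x))

model = SimpleNet()
y = model(torch.randn(2, 3, 32, 32))
print(y.shape)  # torch.Size([2, 10])
```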

MingLin-home commented 2 years ago

Please check https://github.com/idstcv/ZenNAS/blob/d1d617e0352733d39890fb64ea758f9c85b28c1a/ZeroShotProxy/compute_zen_score.py#L33

for computing the Zen-score for your model.
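For intuition, the Zen-score can be sketched as follows: perturb a Gaussian input, measure the log-norm of the resulting feature-map difference, and add a log term from the BatchNorm running variances. This is a minimal sketch of the idea, not the repository's exact code; all names and defaults here are mine:

```python
import math
import torch
from torch import nn

@torch.no_grad()
def zen_score_sketch(features, in_ch=3, resolution=32, batch_size=16, gamma=1e-2):
    """Minimal Zen-score sketch: log-norm of the pre-GAP feature difference
    under a small input perturbation, plus a BN running-variance term."""
    features.eval()
    x = torch.randn(batch_size, in_ch, resolution, resolution)
    eps = torch.randn_like(x)
    # Expressivity term: how much the features move under the perturbation.
    delta = features(x) - features(x + gamma * eps)
    score = torch.log(delta.norm() + 1e-8).item()
    # BN statistics term, as in the Zen-score definition.
    for m in features.modules():
        if isinstance(m, nn.BatchNorm2d):
            score += math.log(m.running_var.mean().sqrt().item() + 1e-8)
    return score

# Example: score a tiny VCNN-style stem.
stem = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
)
score = zen_score_sketch(stem)
print(score)
```

The repository's implementation differs in details (it runs in train mode, repeats the measurement, and handles the pre-GAP feature extraction explicitly), so treat this only as a reading aid.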

MingLin-home commented 1 year ago

Sorry for the late reply. Please refer to lines 48-67 in compute_zen_score.py for your ResNet model.

On Mon, Oct 10, 2022 at 3:06 AM, qq272574497 wrote:

> Please check https://github.com/idstcv/ZenNAS/blob/d1d617e0352733d39890fb64ea758f9c85b28c1a/ZeroShotProxy/compute_zen_score.py#L33 for computing the Zen-score for your model.

Hi, how do I run this demo (ZenNAS/ZeroShotProxy/compute_zen_score.py)? I think it needs sys.argv arguments, and I want to run a ResNet.


qq272574497 commented 1 year ago

Excuse me, I want to run the demo (https://github.com/idstcv/ZenNAS/blob/main/ZeroShotProxy/compute_zen_score.py). Could you tell me whether the following parameters are correct? Take myresnet50 and CIFAR-10 as examples:

```
--batch_size 64 --input_image_size 32 --repeat_times 32 --mixup_gamma 1e-2 --arch myresnet50 --num_classes 10
```

MingLin-home commented 1 year ago

`--mixup_gamma 1e-2`: I would suggest 1e-4 to 1e-6 to approximate the gradient; 1e-2 might be too large.
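The gamma here acts like a finite-difference step: the perturbed forward pass approximates a directional derivative, and the approximation degrades as gamma grows. A quick illustration on a toy scalar function (not the repository code):

```python
# f(x) = x**3 has exact derivative 3*x**2 at x; compare one-sided
# finite-difference estimates at several step sizes gamma.
f = lambda x: x ** 3
x = 2.0
exact = 3 * x ** 2  # 12.0

errors = {}
for gamma in (1e-2, 1e-4, 1e-6):
    approx = (f(x + gamma) - f(x)) / gamma
    errors[gamma] = abs(approx - exact)
    print(f"gamma={gamma:.0e}  abs error={errors[gamma]:.2e}")
```

The truncation error shrinks roughly linearly with gamma, which is why a smaller step gives a better gradient approximation (until floating-point cancellation eventually dominates at much smaller steps).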

qq272574497 commented 1 year ago

> It works with residual networks. However, only VCNNs come with a strong theoretical guarantee. That is, you can apply the Zen-score to any network, even non-VCNN ones, but you lose the theoretical guarantee, so be careful to verify the results.

In [Zero-Cost Proxies for Lightweight NAS], the residual link is removed during the search stage, and the residual block is restored after the optimal architecture is found. Is that the same in your work?

MingLin-home commented 1 year ago

Yes, they are the same.

qq272574497 commented 1 year ago

I find that pooling has a great impact on the result. Is there any way to reduce pooling's impact on the zero-cost score to zero? The only use of pooling is to reduce computation.

MingLin-home commented 1 year ago

You can use a conv with stride=2 to replace pooling. The saving in computational cost is minor, because there are at most 5 such layers.
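A minimal sketch of that swap (my own example, not from the repository): a 3x3 convolution with stride=2 and padding=1 halves the spatial resolution exactly like a 2x2 max-pool, but is a learnable operation that the Zen-score can see through:

```python
import torch
from torch import nn

x = torch.randn(1, 16, 32, 32)

pool = nn.MaxPool2d(kernel_size=2, stride=2)
# Same downsampling factor: floor((32 + 2*1 - 3) / 2) + 1 = 16.
strided = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)

print(pool(x).shape)     # torch.Size([1, 16, 16, 16])
print(strided(x).shape)  # torch.Size([1, 16, 16, 16])
```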

qq272574497 commented 1 year ago

> You can use a conv with stride=2 to replace pooling. The saving in computational cost is minor, because there are at most 5 such layers.

stride=2 does not work for me, but I agree with you that the saving in computational cost is minor.