idstcv / ZenNAS


evolution search speed #8

Closed lyg95 closed 2 years ago

lyg95 commented 3 years ago

The paper says the search cost is 0.5 GPU day on an NVIDIA V100 GPU with half precision (FP16) and batch size 64. We used the supplied script to search for a network with evolution_max_iter=50000, on a V100 with FP16 and batch size 64, and the whole process takes much longer than that.
For example, searching for the 0.8 ms model with ./Zen_NAS_ImageNet_latency0.8ms..sh took about 29 hours:

loop_count=47000/50000, max_score=308.6, min_score=293.333, time=28.213h
loop_count=48000/50000, max_score=308.6, min_score=294.01, time=28.78h
loop_count=49000/50000, max_score=308.6, min_score=294.837, time=29.3295h

In the paper, the number of evolutionary iterations is 96000, but even with 50000 iterations we cannot finish the search within 0.5 GPU day.

I wonder whether there is a problem in the code or somewhere else.
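As a rough check of the numbers above, the last log line and the paper's budget can be turned into per-iteration costs. This is only arithmetic on the figures quoted in this issue (49000 iterations in 29.3295 h, and 96000 iterations in 0.5 GPU day); the variable names are made up for illustration.

```python
# Figures copied from the log line and the paper's claim quoted above.
logged_iters = 49_000
logged_hours = 29.3295
paper_iters = 96_000
paper_budget_hours = 0.5 * 24  # 0.5 GPU day expressed in hours

# Average wall-clock cost per evolution iteration in the reported run.
sec_per_iter_measured = logged_hours * 3600 / logged_iters

# Per-iteration cost implied by the paper's 0.5 GPU day over 96000 iterations.
sec_per_iter_budget = paper_budget_hours * 3600 / paper_iters

# Time the reported run would need for the paper's 96000 iterations.
projected_hours = sec_per_iter_measured * paper_iters / 3600

print(f"measured:     {sec_per_iter_measured:.2f} s/iter")
print(f"paper budget: {sec_per_iter_budget:.2f} s/iter")
print(f"projected time for {paper_iters} iterations: {projected_hours:.1f} h")
```

The measured run averages about 2.15 s per iteration against a budget of 0.45 s per iteration, so at this rate 96000 iterations would take roughly 57 hours.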

MingLin-home commented 3 years ago

Hi lyg95,

The 0.5 GPU day search cost only measures the Zen-score computation and excludes other costs such as network construction, data copy, latency benchmarking, etc. This is because the Zen-score computation is so fast that even network construction adds a significant delay. We assume that one can fully optimize the pipeline in C++ in order to utilize the GPU at 100% efficiency.
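To see how the wall-clock time splits between these stages in one's own run, a small per-stage timer can be wrapped around each step of the search loop. The following is a minimal sketch, not code from the ZenNAS repository: `StageTimer`, the stage names, and the sleeping placeholder functions are all hypothetical stand-ins.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock seconds per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def shares(self):
        """Fraction of the total measured time spent in each stage."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {name: t / total for name, t in self.totals.items()}


# Placeholders that only sleep. In a real search loop these would be the
# network construction, Zen-score and latency benchmark calls.
def build_network():
    time.sleep(0.001)


def compute_score():
    time.sleep(0.001)


def benchmark_latency():
    time.sleep(0.005)


timer = StageTimer()
for _ in range(5):
    with timer.stage("network_construction"):
        build_network()
    with timer.stage("zen_score"):
        compute_score()
    with timer.stage("latency_benchmark"):
        benchmark_latency()

for name, share in timer.shares().items():
    print(f"{name}: {timer.totals[name]:.4f} s ({share:.0%})")
```

When timing GPU work with PyTorch, note that CUDA kernels are launched asynchronously, so `torch.cuda.synchronize()` should be called before reading the clock at the start and end of a stage; otherwise the time is attributed to whichever later call happens to block.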

For the 0.8 ms Zen-NAS search, most of the time is spent benchmarking the latency. As we said in the paper, we used an in-house latency predictor here, which is not released publicly. The current script benchmarks the latency directly, which accounts for 50%-90% of the search cost. I am afraid you have to build your own latency predictor in order to enjoy the fast search speed of Zen-NAS.
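One common way to build such a predictor is a lookup table: benchmark each distinct block configuration once, cache the result, and estimate a network's latency as the sum of its blocks. The sketch below illustrates that idea only. It is not the in-house predictor mentioned above, and `LookupLatencyPredictor`, `fake_benchmark`, and the `(type, channels, stride)` block encoding are assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, Sequence


class LookupLatencyPredictor:
    """Additive per-block latency table (hypothetical, for illustration)."""

    def __init__(self, benchmark_block: Callable[[Hashable], float]):
        self._benchmark_block = benchmark_block
        self._table: Dict[Hashable, float] = {}
        self.benchmark_calls = 0  # how often the slow benchmark actually ran

    def block_latency(self, block: Hashable) -> float:
        # Benchmark a block configuration only the first time it is seen.
        if block not in self._table:
            self._table[block] = self._benchmark_block(block)
            self.benchmark_calls += 1
        return self._table[block]

    def predict(self, blocks: Sequence[Hashable]) -> float:
        # Assumes block latencies add up; this ignores effects such as
        # operator fusion or memory traffic between blocks.
        return sum(self.block_latency(b) for b in blocks)


def fake_benchmark(block):
    """Stand-in for a real on-device measurement; the numbers are made up."""
    block_type, channels, stride = block
    return 0.001 * channels / stride


predictor = LookupLatencyPredictor(fake_benchmark)
net_a = [("conv", 32, 2), ("conv", 64, 2), ("conv", 64, 1)]
net_b = [("conv", 32, 2), ("conv", 64, 1), ("conv", 64, 1)]

latency_a = predictor.predict(net_a)
latency_b = predictor.predict(net_b)
print(f"net_a: {latency_a:.3f} ms, net_b: {latency_b:.3f} ms")
print(f"benchmark calls: {predictor.benchmark_calls}")
```

Because mutated candidates in an evolutionary search tend to share most of their blocks, the slow benchmark runs only for configurations not seen before. How well an additive table matches real latency depends on the hardware and should be checked against direct measurements.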

