HewlettPackard / dlcookbook-dlbs

Deep Learning Benchmarking Suite
https://www.hpe.com/software/dl-cookbook
Apache License 2.0
130 stars 51 forks

remove deprecated parameter shared_param #14

Open YYStreet opened 5 years ago

YYStreet commented 5 years ago

In the latest version of apex, shared_param is no longer supported as an option of DistributedDataParallel (DDP), so the PyTorch training benchmark fails when it wraps the model.

Error message:

Critical error while running benchmarks (shared_param is no longer supported as an option.  It was misleadingly named from the start.  It turns out overlapping communication with computation should work fine with shared parameters.  If you still wish to delay communication to the end of the backward pass, use delay_allreduce=True|False instead.). See stacktrace below.
Traceback (most recent call last):
  File "dlcookbook-dlbs/python/pytorch_benchmarks/benchmarks.py", line 411, in main
    model_title, times = benchmark(opts)
  File "dlcookbook-dlbs/python/pytorch_benchmarks/benchmarks.py", line 93, in benchmark
    return benchmark_training(model, opts)
  File "dlcookbook-dlbs/python/pytorch_benchmarks/benchmarks.py", line 213, in benchmark_training
    model = DDP(model, shared_param=True)
  File "/opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6.egg/apex/parallel/distributed.py", line 190, in __init__
    raise ValueError("shared_param is no longer supported as an option.  It was misleadingly named from the start.  It turns out overlapping communication with computation should work fine with shared parameters.  If you still wish to delay communication to the end of the backward pass, use delay_allreduce=True|False instead.")
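
A possible fix, going only by the error message above: replace shared_param with delay_allreduce where benchmarks.py constructs DDP. The sketch below is a minimal illustration, not the project's actual patch. migrate_ddp_kwargs is a hypothetical helper (it exists in neither apex nor DLBS), and mapping shared_param=True to delay_allreduce=True is an assumption that the goal is to keep the old "communicate at the end of the backward pass" behaviour.

```python
def migrate_ddp_kwargs(kwargs):
    """Return a copy of kwargs without the removed shared_param option.

    Hypothetical helper for illustration; not part of apex or DLBS.
    """
    migrated = dict(kwargs)  # leave the caller's dict untouched
    if "shared_param" in migrated:
        delay = bool(migrated.pop("shared_param"))
        # Assumption: the old flag maps onto delay_allreduce, as the apex
        # error message suggests. An explicit delay_allreduce from the
        # caller takes precedence over the migrated value.
        migrated.setdefault("delay_allreduce", delay)
    return migrated


if __name__ == "__main__":
    print(migrate_ddp_kwargs({"shared_param": True}))
    # In benchmark_training this could be used roughly as (requires apex):
    #   from apex.parallel import DistributedDataParallel as DDP
    #   model = DDP(model, **migrate_ddp_kwargs({"shared_param": True}))
```

If delaying the allreduce is not actually needed, simply calling DDP(model) without either argument may be enough, since the message says overlapping communication with computation should work fine with shared parameters.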

sergey-serebryakov commented 5 years ago

Thanks for reporting this. I'll test it next week. Sergey.