mindspore-lab / mindone

one for all, Optimal generator with No Exception
Apache License 2.0
330 stars, 63 forks

Fix bug when ms.set_context(max_device_memory="xxGB") #394

Closed. HaoyangLee closed this 3 months ago

HaoyangLee commented 3 months ago

What does this PR do?

Fixes # (issue)

In distributed mode, the MindSpore context parameter max_device_memory cannot be set after init() has been called. Attempting to do so raises the following error:

Traceback (most recent call last):
  File "/home/ma-user/modelarts/user-job-dir/mindone/examples/animatediff/train.py", line 510, in <module>
    main(args)
  File "/home/ma-user/modelarts/user-job-dir/mindone/examples/animatediff/train.py", line 188, in main
    max_device_memory=args.max_device_memory,
  File "/home/ma-user/modelarts/user-job-dir/mindone/examples/animatediff/train.py", line 173, in init_env
    ms.set_context(max_device_memory=max_device_memory)
  File "/home/ma-user/anaconda3/envs/ms_animatediff/lib/python3.7/site-packages/mindspore/_checkparam.py", line 1313, in wrapper
    return func(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/ms_animatediff/lib/python3.7/site-packages/mindspore/context.py", line 1493, in set_context
    ctx.setters[key](ctx, value)
  File "/home/ma-user/anaconda3/envs/ms_animatediff/lib/python3.7/site-packages/mindspore/context.py", line 472, in set_max_device_memory
    self.set_param(ms_ctx_param.max_device_memory, max_device_memory_value)
  File "/home/ma-user/anaconda3/envs/ms_animatediff/lib/python3.7/site-packages/mindspore/context.py", line 175, in set_param
    self._context_handle.set_param(param, value)
TypeError: For 'set_context', the parameter max_device_memory can not be set repeatedly, origin value [1024] has been in effect.

----------------------------------------------------
- C++ Call Stack: (For framework developers)
----------------------------------------------------
mindspore/core/utils/ms_context.cc:477 CheckReadStatus
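
The ordering constraint described above can be sketched as follows. This is only an illustration, not the actual change in this PR: the signature of init_env here is hypothetical (the real init_env in train.py takes different arguments), and FakeContext is a made-up stand-in that mimics the locking behavior seen in the traceback so the example runs without MindSpore installed.

```python
from typing import Any, Callable, List, Optional, Tuple


def init_env(
    set_context: Callable[..., None],
    init: Callable[[], None],
    distributed: bool = False,
    max_device_memory: Optional[str] = None,
) -> None:
    """Hypothetical helper: apply max_device_memory BEFORE distributed init."""
    # Set the memory limit first, while the context is still writable.
    if max_device_memory is not None:
        set_context(max_device_memory=max_device_memory)
    # Only then initialize the communication backend.
    if distributed:
        init()


class FakeContext:
    """Stand-in for the MindSpore context (assumption, for illustration only).

    It imitates the reported behavior: once init() has run, a further
    attempt to set max_device_memory raises TypeError.
    """

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Any]] = []
        self.locked = False

    def set_context(self, **kwargs: Any) -> None:
        if self.locked and "max_device_memory" in kwargs:
            raise TypeError(
                "the parameter max_device_memory can not be set repeatedly"
            )
        self.calls.append(("set_context", kwargs))

    def init(self) -> None:
        self.locked = True
        self.calls.append(("init", None))


ctx = FakeContext()
init_env(ctx.set_context, ctx.init, distributed=True, max_device_memory="30GB")
call_order = [name for name, _ in ctx.calls]
```

With a real MindSpore installation, the same helper would presumably be called as init_env(ms.set_context, mindspore.communication.init, distributed=True, max_device_memory="30GB"); the value "30GB" is a placeholder, not taken from this PR.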