mindspore-ai / mindspore

MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
https://gitee.com/mindspore/mindspore
Apache License 2.0
4.16k stars 686 forks

Pynative Train, OOM! #280

Open Enlion91 opened 2 months ago

Enlion91 commented 2 months ago

Environment

Hardware Environment(Ascend/GPU/CPU): Ascend910A atlas800-9000 (air-cooled)

Describe the current behavior

Training mindyolo in pynative mode runs out of memory (OOM). Switching to graph mode does not have this problem.

Describe the expected behavior

Training should run normally in pynative mode.

Steps to reproduce the issue

  1. Train in pynative mode.

Related log / screenshot

image

Special notes for this issue

Enlion91 commented 2 months ago

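If device memory were simply insufficient, the error should be reported as soon as the first epoch starts. Instead, training continues for several epochs before exiting, so I suspect that pynative mode may have a memory leak.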
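One way to test the leak hypothesis is to record peak device memory once per epoch and check whether it keeps climbing. The following is only a minimal sketch: `memory_growth_per_epoch`, `looks_like_leak`, the `min_growth_mb` threshold, and the sample readings are all hypothetical and not taken from this issue's logs. How the readings are collected on the device is left open.

```python
from statistics import mean


def memory_growth_per_epoch(peaks_mb):
    """Least-squares slope of peak memory (MB) against the epoch index."""
    n = len(peaks_mb)
    if n < 2:
        raise ValueError("need readings from at least two epochs")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(peaks_mb)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, peaks_mb))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def looks_like_leak(peaks_mb, min_growth_mb=1.0):
    """True when peak memory rises by at least min_growth_mb per epoch on average.

    The threshold is an arbitrary illustration value, not a calibrated one.
    """
    return memory_growth_per_epoch(peaks_mb) >= min_growth_mb


# Hypothetical per-epoch peak readings in MB (made up for illustration).
steady = [21000, 21010, 20995, 21005, 21000]
growing = [21000, 22500, 24100, 25400, 27000]

print(looks_like_leak(steady))
print(looks_like_leak(growing))
```

A flat series would point to a plain capacity limit, while a steadily rising one would be consistent with memory not being released between epochs.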

Ash-Lee233 commented 1 month ago

Hello, mindyolo runs in static graph mode by default. Pynative mode has not been fully tested, so some problems may occur. We suggest trying a lower batch size.