yeyupiaoling / PPASR

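End-to-end Chinese speech recognition implemented with PaddlePaddle, from beginner to hands-on practice: very simple introductory examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Conformer, and Squeezeformer models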
Apache License 2.0
797 stars 131 forks

Does multi-GPU training hang? #95

Closed qinwangmm closed 1 year ago

qinwangmm commented 1 year ago

I'm testing two-GPU training on the small data_thchs30 dataset. Why doesn't workerlog.1 print any training logs? This is all it contains:

Input size (MB): 0.55
Forward/backward pass size (MB): 67.88
Params size (MB): 134.09
Estimated Total Size (MB): 202.52
------------------------------------------------------------------------------------------------------

[2022-07-29 16:56:23.756062] 训练数据:13254
[2022-07-29 16:56:25.356384] 成功恢复模型参数和优化方法参数:models_thchs/deepspeech2/last_model

workerlog.0 prints normally:
 ======================================================================
[2022-07-29 17:07:20.092529] Test batch: [0/3], loss: 235.28400, cer: 0.97384
[2022-07-29 17:07:20.964818] Test epoch: 17, time/epoch: 0:01:16.713099, loss: 204.11594, cer: 0.97387
====================================================================== 

[2022-07-29 17:07:24.003397] 已保存模型:models_thchs/deepspeech2/epoch_17
[2022-07-29 17:07:27.692319] Train epoch: [18/63], batch: [0/103], loss: 227.22951, learning rate: 0.00014561, eta: 2:07:36

yeyupiaoling commented 1 year ago

workerlog.1 normally doesn't output training logs, unless the corresponding GPU hits an error.
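
For anyone wondering why that is: in data-parallel training every process runs the same loop, and it is common for only rank 0 to print progress so the logs are not duplicated. Below is a minimal sketch of that pattern, not PPASR's actual code. It assumes the launcher exports each process's rank in the `PADDLE_TRAINER_ID` environment variable; the helper names `get_rank` and `log_on_rank0` are hypothetical.

```python
import os


def get_rank(env=None):
    """Return this process's rank, defaulting to 0 for single-process runs."""
    env = os.environ if env is None else env
    # Assumption: the launcher exports the rank as PADDLE_TRAINER_ID.
    return int(env.get("PADDLE_TRAINER_ID", "0"))


def log_on_rank0(message, rank):
    """Print progress only on rank 0; return True if the message was printed."""
    if rank != 0:
        # Other ranks stay silent, so their worker logs hold no progress lines.
        return False
    print(message)
    return True


if __name__ == "__main__":
    rank = get_rank()
    log_on_rank0(f"training step finished on rank {rank}", rank)
```

With a guard like this, uncaught exceptions on the other ranks would still reach their own worker logs, since only the progress messages are filtered.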

qinwangmm commented 1 year ago

Okay. I ran a test: once workerlog.0 finished, all GPU usage was released as well. I wasn't quite sure, so I just wanted to ask. Thanks.