wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0

[Fix #2506] Specify multiprocessing context in DataLoader #2507

Closed MengqingCao closed 2 months ago

MengqingCao commented 2 months ago

Fix #2506

P.S. The indentation changes in the code are only there to pass the yapf format check.
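For readers unfamiliar with the idea in the PR title, here is a minimal sketch of what "specifying a multiprocessing context" can look like. It is not the code from this PR: the helper name `resolve_mp_context` and the choice of `"spawn"` are assumptions for illustration, since the thread does not show which start method was used. The runnable part uses only the standard library. The commented lines show how such a context could be passed to PyTorch's `DataLoader` through its `multiprocessing_context` argument.

```python
import multiprocessing as mp


def resolve_mp_context(preferred="spawn"):
    """Return a multiprocessing context, falling back to the platform default.

    `preferred` is a hypothetical knob for this sketch; the start method
    chosen by the actual PR is not shown in this thread.
    """
    if preferred in mp.get_all_start_methods():
        return mp.get_context(preferred)
    # Unknown or unsupported start method: use whatever the platform defaults to.
    return mp.get_context()


ctx = resolve_mp_context("spawn")
print(ctx.get_start_method())

# With PyTorch installed, the context could then be handed to the loader:
#   from torch.utils.data import DataLoader
#   loader = DataLoader(dataset, num_workers=4, multiprocessing_context=ctx)
# DataLoader requires num_workers > 0 when multiprocessing_context is set.
```

Making the context explicit at the loader keeps the choice local to data loading instead of changing the process-wide default start method.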

xingchensong commented 2 months ago

thx!

xingchensong commented 1 month ago

Sorry, I have to revert this PR; I cannot launch the DeepSpeed engine after it.

MengqingCao commented 1 month ago

Sorry, I have to revert this PR; I cannot launch the DeepSpeed engine after it.

Hi @xingchensong, could you describe in more detail the error you get when launching DeepSpeed? If it is solvable, I would like to fix it and re-merge this PR.

xingchensong commented 1 month ago

Support for the KUNPENG CPU is not a top priority; you can treat it as a patch and explore the DeepSpeed issue using an Intel CPU.

MengqingCao commented 1 month ago

Support for the KUNPENG CPU is not a top priority; you can treat it as a patch and explore the DeepSpeed issue using an Intel CPU.

I tried running the training pipeline in aishell/whisper/run.sh with DeepSpeed, using mpDataLoader as a patch, and nothing went wrong. More details on how to reproduce the error you hit would help.

The CPU I tested with:

Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz

xingchensong commented 1 month ago

I am using my own DIY code, an online hotfix for transcripts (https://github.com/wenet-e2e/WenetSpeech/discussions/54), and DeepSpeed keeps re-initializing over and over.

https://paste.ubuntu.com/p/4FmD3342Dv/


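From your description, it should run fine once this DIY code is removed. I don't have time to investigate the cause; if you have time, you could try to solve it.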

MengqingCao commented 1 month ago

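From your description, it should run fine once this DIY code is removed. I don't have time to investigate the cause; if you have time, you could try to solve it.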

OK, I'll give it a try.