tensorflow / recommenders-addons

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
Apache License 2.0
593 stars 136 forks

Error when running keras-horovod #464

Open lixiang-repo opened 2 months ago

lixiang-repo commented 2 months ago

System information

Describe the bug
Training runs normally with the following command:
horovodrun -np 1 python test.py --mode="train" --model_dir="./model_dir" --export_dir="./export_dir"
However, changing -np to 2 or more causes an error.

Other info / logs log.txt

MoFHeka commented 2 days ago

Judging from the error message, it is worth checking whether the process ran out of memory and was killed by the system.
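
One way to follow up on this suggestion, as a minimal sketch only: on Linux, the kernel OOM killer usually writes a line containing "Killed process <pid> (<name>)" to the kernel log. The helper below scans log text for such lines. The function name `find_oom_kills` and the regex are illustrative assumptions, not part of Horovod or recommenders-addons, and the exact message wording can differ between kernel versions.

```python
import re

# Assumed pattern: kernel OOM-killer lines usually contain
# "Killed process <pid> (<name>)"; the exact wording varies by kernel version.
OOM_PATTERN = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, process_name) pairs for OOM-kill lines found in log_text."""
    return [(int(pid), name) for pid, name in OOM_PATTERN.findall(log_text)]


if __name__ == "__main__":
    import subprocess

    # Reading the kernel log may require root privileges on some systems.
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    for pid, name in find_oom_kills(result.stdout):
        print(f"OOM killer terminated pid={pid} ({name})")
```

If a `python` process appears in the output around the time of the failure, memory pressure is a plausible cause. Since `horovodrun -np 2` starts two worker processes instead of one, host memory use may be noticeably higher than with `-np 1`.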