bytedance / byteps

A high performance and generic framework for distributed DNN training

A segmentation fault occurs when compressor is used. #373

Open showerage opened 3 years ago

showerage commented 3 years ago

Describe the bug

I tried to run train_gluon_mnist_byteps_gc.py from the examples. It works well when the compressor is none, but a segmentation fault occurs as soon as a compressor is used.

[image: screenshot of the segmentation fault]

To Reproduce Steps to reproduce the behavior:

  1. bpslaunch python ./train_gluon_mnist_byteps_gc.py --compressor onebit

Environment (please complete the following information):

ymjiang commented 3 years ago

@vycezhong : Could you please take a look?

jasperzhong commented 3 years ago

Have you run export BYTEPS_THREADPOOL_SIZE=16?
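
A minimal sketch of that suggestion, assuming the variable only needs to be present in the environment of the launched process. The helper name build_launch and its parameters are hypothetical and are not part of BytePS; the script name, the --compressor flag, and the value 16 are taken from this thread.

```python
import os


def build_launch(compressor="onebit", threadpool_size=16, base_env=None):
    """Return (env, cmd) for launching the example with a compressor.

    Hypothetical helper for illustration; it only assembles the
    environment and command line and does not start anything.
    """
    # Copy the environment so the caller's mapping is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Value suggested in this thread; environment values must be strings.
    env["BYTEPS_THREADPOOL_SIZE"] = str(threadpool_size)
    # Same command as in the reproduction step.
    cmd = [
        "bpslaunch",
        "python",
        "./train_gluon_mnist_byteps_gc.py",
        "--compressor",
        compressor,
    ]
    return env, cmd
```

The returned pair can be passed to subprocess.run(cmd, env=env, check=True) on a machine where BytePS is installed; the shell equivalent is to export the variable before calling bpslaunch.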

showerage commented 3 years ago

> Have you run export BYTEPS_THREADPOOL_SIZE=16?

Thanks for your help. The problem has been solved. By the way, could you explain the cause of the segmentation fault?