jayleicn / ClipBERT

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
https://arxiv.org/abs/2102.06183
MIT License
697 stars 85 forks

Got 'Resource temporarily unavailable' using docker #32

Open ByZ0e opened 2 years ago

ByZ0e commented 2 years ago

Hi, I always get a 'runtime/cgo: pthread_create failed: Resource temporarily unavailable' error when using Docker. The Docker process also cannot stop on its own; I have to kill it with sudo, which is very inconvenient. What's more, I found that saving the code and backing up checkpoints takes a lot of space (~GB), which may be causing the above error. Any suggestions for this error? Thanks a lot!

jayleicn commented 2 years ago

Hi @Zoe-Ziyi,

There is no need to back up checkpoints; our code is only intended to back up the source code. The checkpoint files are rather large (~1-2 GB). You should probably move the checkpoint directory out of the source code directory to prevent it from being backed up. Hope this helps!

Best, Jie

ByZ0e commented 2 years ago

Thanks for your reply. In fact, I found that the likely real cause of this error is that the user's number of processes (it can exceed 30,000) goes over the limit. However, my root user's max process number is unlimited. Setting the dataloader's num_workers to 0 finally solved the problem. Do you have a better solution?
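
For anyone who wants to keep some dataloader workers instead of dropping to 0, here is a minimal sketch of checking the per-user process limit before choosing a worker count. It is not part of ClipBERT. The helper names (`count_user_tasks`, `cap_workers`) and the `tasks_per_worker` estimate are hypothetical, and it assumes a Linux container where `/proc` is readable. It only looks at `RLIMIT_NPROC`; a container-level limit (for example a cgroup pids limit) would not show up here.

```python
import os
import resource


def nproc_soft_limit():
    """Return the soft RLIMIT_NPROC for this process, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return None if soft == resource.RLIM_INFINITY else soft


def count_user_tasks(uid):
    """Count threads owned by `uid` by scanning /proc (Linux only).

    Threads are counted because the kernel counts them toward the limit too.
    """
    if not os.path.isdir("/proc"):
        return 0
    total = 0
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/status") as fh:
                fields = dict(line.split(":", 1) for line in fh if ":" in line)
        except OSError:
            continue  # the process exited while we were reading it
        uid_field = fields.get("Uid", "").split()
        if uid_field and int(uid_field[0]) == uid:
            total += int(fields.get("Threads", "1"))
    return total


def cap_workers(requested, soft_limit, in_use, tasks_per_worker=8):
    """Hypothetical heuristic: fit worker count into the remaining task budget.

    `tasks_per_worker` is a rough guess at how many threads one worker spawns;
    it is an assumption, not a measured value.
    """
    if soft_limit is None:
        return requested
    free = max(soft_limit - in_use, 0)
    return max(0, min(requested, free // tasks_per_worker))


if __name__ == "__main__":
    limit = nproc_soft_limit()
    in_use = count_user_tasks(os.getuid())
    workers = cap_workers(4, limit, in_use)
    print(f"limit={limit} in_use={in_use} num_workers={workers}")
    # The result could then be passed as
    # torch.utils.data.DataLoader(..., num_workers=workers)
```

If `in_use` is already close to `limit` before training starts, something else running under the same user is consuming the budget. In that case, lowering the worker count only hides the problem.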