Open ryanfb opened 8 years ago
Same problem here. Just removing -j$(jopts) is enough to avoid the fork bomb ;-)
EDIT: After Soumith's comment, I think this problem may be related to memory and not a fork bomb at all. That fits my observations.
hmmm, that's so weird that that introduces a fork-bomb.
I wonder if it's a fork-bomb or just an out-of-memory slowdown. the nvcc processes that compile cutorch do take quite a bit of memory.
Hi @soumith, you are probably right and it is an out-of-memory slowdown. Watching memory usage during compilation, I saw it climb until the computer became unresponsive. It may be worth the effort to set the -j$(jopts) parameter based on the available memory... What do you think?
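A rough sketch of what choosing the job count from memory could look like, in Python. Everything here is an assumption for illustration: the helper names are hypothetical, the 2 GiB per nvcc job budget is a guess rather than a measured figure, and the sysconf lookup reports total physical memory (not free memory) and may be unavailable on some platforms.

```python
import os


def pick_make_jobs(memory_bytes, cpu_count, bytes_per_job=2 * 1024**3):
    """Return a value for make -j bounded by both CPU count and memory.

    bytes_per_job is an assumed per-compiler-process budget, not a measurement.
    """
    by_memory = memory_bytes // bytes_per_job
    # Always allow at least one job, never more than the number of cores.
    return max(1, min(cpu_count, by_memory))


def physical_memory_bytes():
    """Total physical memory via sysconf (works on Linux; may fail elsewhere)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    try:
        mem = physical_memory_bytes()
    except (ValueError, OSError, AttributeError):
        mem = 0  # memory size unknown: fall back to a single job
    print(pick_make_jobs(mem, os.cpu_count() or 1))
```

The printed number could then be passed to make as the -j value in place of $(jopts). On a machine where memory is not the limit, this simply falls back to the core count.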
This is definitely a fork-bomb. I run Bash for Windows (Ubuntu 14.04) and have been watching system processes and resources as this runs. As with others, it hangs on building one of the NVCC objects: e.g. lib/THC/CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathCompareTDouble.cu.o
Memory is sitting at a cool 15% (64GB) but CPU rams into 99% and I can hear my fan chugging. Task Manager shows a forking series of processes all associated with the Torch install (Dash, cudafe, nvcc, cmake, etc.) until I hit that CPU ceiling and start to hang. I have modified the rockspec file as suggested, and will see if that worked.
As part of the Torch install process on my machine (OS X 10.11.6), when the cutorch install is started, I see:
If I'm lucky, this causes:
And a make failure. If I'm unlucky, this causes my machine to lock up completely due to a fork bomb.

The workaround I'm using for this is to modify extra/cutorch/rocks/cutorch-scm-1.rockspec and replace $(MAKE) -j$(jopts) install and $(MAKE) install with a plain make install (maybe I could have gotten away with just stripping -j$(jopts), but I'm a little cautious after fork bombing myself so many times in a row). This may be a regression introduced by 37373ebea1cce61c29b5da6a34af0303b4b4976f because I've installed torch/cutorch on this machine many times in the past without this issue.
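For anyone who wants to script the smaller version of this workaround (only stripping -j$(jopts), as reported to be enough elsewhere in this thread), here is a minimal Python sketch. The function name is hypothetical, and it assumes the flag appears in the rockspec literally as -j$(jopts); check the file before and after running it.

```python
import re
import sys


def strip_parallel_flag(rockspec_text):
    """Remove every literal -j$(jopts) flag, plus the spaces or tabs before it."""
    return re.sub(r"[ \t]*-j\$\(jopts\)", "", rockspec_text)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python strip_jopts.py extra/cutorch/rocks/cutorch-scm-1.rockspec
    path = sys.argv[1]
    with open(path) as f:
        original = f.read()
    with open(path, "w") as f:
        f.write(strip_parallel_flag(original))
```

This leaves $(MAKE) in place, so it is a lighter change than switching to a plain make install.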