XiaoXiao-Woo opened this issue 1 year ago
I'm also getting this error consistently when the number of trials in an experiment reaches about 40,000. It has happened on more than 5 different experiments.
NNI version: master
Training service (local|remote|pai|aml|etc): local, with reuseMode=False
Client OS: ubuntu 22.04.2 LTS
Server OS (for remote mode only): n/a
Python version: 3.7.4
PyTorch/TensorFlow version: n/a
Is conda/virtualenv/venv used?: conda
Is running in Docker?: no
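For context, the reuse setting above lives in the experiment config file. A minimal sketch of a V2-style NNI config matching this environment (the `searchSpaceFile`/`trialCommand` values are placeholders, and the exact placement of `reuseMode` should be checked against the NNI config reference for your version):

```yaml
# Sketch of an NNI experiment config (V2 schema) with the local
# training service and reuse mode disabled, as in the report above.
searchSpaceFile: search_space.json   # placeholder
trialCommand: python trial.py        # placeholder
trialConcurrency: 4
maxTrialNumber: 40000
trainingService:
  platform: local
  reuseMode: false
```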
You can set `export NODE_OPTIONS="--max_old_space_size=8192"` for a quick fix.
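Concretely, the workaround is to raise the Node.js heap limit in the shell before launching the experiment, since the NNI manager is a Node.js process. A minimal sketch (the `config.yml` name and `nnictl` invocation at the end are illustrative):

```shell
# Quick fix: raise Node.js's old-generation heap limit to 8 GB so the
# NNI manager can hold metadata for tens of thousands of trials.
export NODE_OPTIONS="--max_old_space_size=8192"

# Verify the variable is set for child processes:
echo "$NODE_OPTIONS"

# Then launch (or resume) the experiment as usual, e.g.:
# nnictl create --config config.yml   # config.yml is a placeholder
```

Note this only raises the ceiling; if the manager's memory use grows without bound with trial count, the error will eventually return at a higher trial number.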
Ah, thanks for the suggestion! With that set, the run spools up when I resume, but then it immediately fails silently.
any progress on this?
checking in again
> You can set `export NODE_OPTIONS="--max_old_space_size=8192"` for a quick fix.
Unfortunately, I'm still getting the heap out-of-memory error after 29k trials. This is on NNI 3.0.
Also, when I use NNI with the "remote" training service to connect to another machine (connecting a machine to itself via "remote" works fine), the same problem occurs: "FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory"