Closed — MilPat555 closed this issue 6 years ago
I have tried increasing the batch_size and playing with the value of use_multi_process_num for KittiLoader, but I'm not sure why each process is still limited to 58 MB. The processes also seem to be placed on the CPU for some reason.
It is not so strange that each process consumes only a little memory. In fact, when I ran one batch on a single TITAN X, it cost only about 109 MB per process. As for your suspicion that the computation may be running on the CPU, I can't say without more information, but I suggest you check the GPU-Util of the GPU you are using to confirm that the model is actually utilizing it.
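For example, you can monitor the GPU-Util column and the per-process memory table while training runs (standard nvidia-smi, refreshed every second):

```shell
# Refresh GPU stats every second; if GPU-Util stays at 0% while
# training, the model is likely running on the CPU.
watch -n 1 nvidia-smi
```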
@MilPat555, did you make any other alterations for a single GPU? I also changed to
__C.GPU_AVAILABLE = '0'
and it grabs the device just fine:
2018-04-17 18:03:26.479067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:84:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
but I get an error later:
Traceback (most recent call last):
File "/home/b.weinstein/voxelnet/train.py", line 200, in <module>
tf.app.run(main)
File "/home/b.weinstein/miniconda3/envs/voxelnet/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/b.weinstein/voxelnet/train.py", line 129, in main
batch = sample_test_data(val_dir, args.single_batch_size * cfg.GPU_USE_COUNT, multi_gpu_sum=cfg.GPU_USE_COUNT)
File "/home/b.weinstein/voxelnet/utils/kitti_loader.py", line 140, in sample_test_data
_, per_vox_feature, per_vox_number, per_vox_coordinate = build_input(voxel[idx * single_batch_size:(idx + 1) * single_batch_size])
File "/home/b.weinstein/voxelnet/utils/kitti_loader.py", line 172, in build_input
feature = np.concatenate(feature_list)
ValueError: need at least one array to concatenate
which suggests to me that it assumes there are at least two GPUs. It seems to work fine when debugging on CPU.
I'm using a fork of this repo, and am still working back through the PRs to see if there has been a change upstream.
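For what it's worth, that ValueError is just what NumPy raises when np.concatenate receives an empty list — i.e. feature_list ended up empty, which points to the loader finding no samples rather than to a GPU-count assumption. A minimal reproduction:

```python
import numpy as np

# When the loader's slice yields no samples, feature_list stays empty
# and np.concatenate has nothing to join, raising this exact error.
feature_list = []
try:
    np.concatenate(feature_list)
except ValueError as e:
    print(e)  # need at least one array to concatenate
```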
Follow-up here: this was caused by misnaming the validation directory.
Hey,
So I am trying to train on a single GPU with 11 GB of memory. In the config.py file I changed __C.GPU_AVAILABLE = '3,1,2,0' to __C.GPU_AVAILABLE = '0'.
Because of the multiprocessing you have implemented, it breaks my job up into 16 processes of only 58 MB each. Though I have plenty more GPU memory available, it does not seem to use it.
Do you know how I can solve this?
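For context, a config value like __C.GPU_AVAILABLE is typically exported to CUDA_VISIBLE_DEVICES before TensorFlow starts, so restricting training to one GPU comes down to a single assignment. A minimal sketch (assuming this mapping — check train.py for where the repo actually applies it):

```python
import os

# Mirrors __C.GPU_AVAILABLE = '0' from config.py. Setting this before
# TensorFlow initializes hides all other GPUs; the remaining device
# is exposed to the process as /gpu:0.
GPU_AVAILABLE = '0'
os.environ['CUDA_VISIBLE_DEVICES'] = GPU_AVAILABLE

print(os.environ['CUDA_VISIBLE_DEVICES'])  # → 0
```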