Benzlxs opened 5 years ago
spconv uses a dense table to determine whether a location is active. For example, with ranges 0 ~ 70.4, -40 ~ 40, -3 ~ 1 and voxel size [0.025, 0.025, 0.05], the table needs 2816 x 3200 x 80 x 4 bytes ≈ 2.9 GB per example (but your memory usage is still too high). You can pre-allocate a grid buffer and use:
```python
# __init__: pre-allocate the dense grid once; it must be filled with -1
self.grid = torch.full([self.max_batch_size, *sparse_shape], -1,
                       dtype=torch.int32).cuda()

# forward: attach the buffer to the sparse tensor
x = SparseConvTensor(...)
x.grid = self.grid
```
All sparse convolutions applied to this sparse tensor will then reuse the pre-allocated dense table.
Note: the memory usage in the config file is wrong. I can use batch_size=8 with car.fhd.config on 11 GB of memory.
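The dense-table footprint can be estimated directly from the point-cloud range and voxel size. Below is a minimal sketch using the ranges and voxel sizes quoted above; `grid_table_bytes` is a hypothetical helper for illustration, not part of spconv:

```python
# Estimate the memory needed for spconv's dense int32 lookup table,
# given a point-cloud range and voxel size.
def grid_table_bytes(pc_range, voxel_size, bytes_per_cell=4):
    """pc_range: [(lo, hi), ...] per axis; voxel_size: one size per axis."""
    dims = [round((hi - lo) / v) for (lo, hi), v in zip(pc_range, voxel_size)]
    cells = 1
    for d in dims:
        cells *= d
    return dims, cells * bytes_per_cell

dims, nbytes = grid_table_bytes(
    pc_range=[(0.0, 70.4), (-40.0, 40.0), (-3.0, 1.0)],
    voxel_size=[0.025, 0.025, 0.05],
)
print(dims, f"{nbytes / 1e9:.2f} GB")  # [2816, 3200, 80] 2.88 GB
```

This is the per-example figure; the pre-allocated buffer above multiplies it by `max_batch_size`.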
Thank you for your reply.
I followed your suggestion and pre-allocated the dense table, but memory consumption is similar.
My self.sparse_shape is [127, 3199, 2815], so the table takes 127 * 3199 * 2815 * 4 bytes ≈ 4.57 GB, which matches the GPU memory consumption I see with the nvidia-smi command line. Checking step by step, I found that middle.py takes up around 1.0 GB of memory and rpn.py consumes 1.586 GB. Their sum is almost 7 GB, which seems to make sense?
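As a sanity check on this accounting, a short sketch adding the grid buffer to the 1.0 GB and 1.586 GB per-module figures reported above (following the same calculation, which counts a single copy of the table and ignores the max_batch_size dimension):

```python
# Verify the memory accounting: dense grid buffer + middle.py + rpn.py.
sparse_shape = [127, 3199, 2815]

grid_bytes = 4  # int32 cells
for d in sparse_shape:
    grid_bytes *= d

grid_gb = grid_bytes / 1e9          # dense-table size in GB
total_gb = grid_gb + 1.0 + 1.586    # plus middle.py and rpn.py
print(f"grid: {grid_gb:.2f} GB, total: {total_gb:.2f} GB")
# grid: 4.57 GB, total: 7.16 GB
```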
Hey Yan,
I am trying to decrease the voxel size from [0.05, 0.05, 0.1] (your original setting) to [0.025, 0.025, 0.0325], and max_number_of_points_per_voxel is also decreased from 5 to 2. Training runs smoothly, but there is a GPU memory bottleneck. When batch_size=1, the GPU memory consumption is 9851 MB, compared to your 7633 MB with batch_size=3. This does not seem to make sense to me, and I cannot figure out the reason. Do you have any idea about this issue in GPU usage?
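For reference, shrinking the voxel size multiplies the number of grid cells, and hence the dense-table size, roughly by the product of the per-axis ratios. A quick sketch using the two voxel sizes quoted above:

```python
# Ratio of grid-cell counts between the original and the finer voxel size.
old = [0.05, 0.05, 0.1]       # original voxel size
new = [0.025, 0.025, 0.0325]  # finer voxel size

factor = 1.0
for o, n in zip(old, new):
    factor *= o / n

print(f"~{factor:.1f}x more cells")  # ~12.3x more cells
```

So the dense table alone grows by roughly 12x; reducing max_number_of_points_per_voxel from 5 to 2 only shrinks the voxel feature buffers, not this table, which may explain why batch_size=1 at the fine resolution already exceeds the coarse-resolution batch_size=3 footprint.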