Closed JuanFMontesinos closed 1 year ago
Hi Juan,
Nice to meet you! It's true that there aren't many of us, or at least we don't usually make ourselves very noticeable.
Those lines indeed implement gradient accumulation. `trajectories_per_gpu_call` is the batch size for the GPU computation, while `trajectories_per_batch` is the batch size for the gradient calculation and the optimization step. This is not needed on most GPUs when working with maps of resolution r=2, but if you increase the resolution you might find it necessary, since memory consumption grows quite quickly with it.

You can increase `trajectories_per_gpu_call` to 5 to raise the GPU usage, though the training time is mainly limited by the reverberation simulation in the dataloader, so you probably won't get the speedup you expect just by increasing the GPU usage in the network's forward pass. In any case, any value of `trajectories_per_gpu_call` should work fine as long as it is a divisor of `trajectories_per_batch`.
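For reference, the scheme described above can be sketched roughly as follows. This is a hedged, minimal illustration of gradient accumulation with micro-batches, not the repository's actual code; the function name `accumulate_gradients` and the toy model are my own for the example.

```python
import torch

def accumulate_gradients(model, loss_fn, inputs, targets,
                         trajectories_per_batch, trajectories_per_gpu_call):
    # Run forward/backward in micro-batches of trajectories_per_gpu_call
    # samples, accumulating gradients so the optimizer step effectively
    # sees a batch of trajectories_per_batch samples.
    assert trajectories_per_batch % trajectories_per_gpu_call == 0
    model.zero_grad()
    for i in range(0, trajectories_per_batch, trajectories_per_gpu_call):
        x = inputs[i:i + trajectories_per_gpu_call]
        y = targets[i:i + trajectories_per_gpu_call]
        loss = loss_fn(model(x), y)
        # Rescale so the accumulated gradient equals the gradient of the
        # mean loss over the full trajectories_per_batch samples.
        (loss * trajectories_per_gpu_call / trajectories_per_batch).backward()
    # The caller then runs optimizer.step() once per accumulated batch.
```

The divisor constraint comes from the slicing: the micro-batches must tile the full batch exactly, otherwise the last GPU call would see a partial slice and the loss rescaling would be wrong.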
Best, David
Thanks! I have some more questions but will open a new issue instead.
Hi, a pleasure to meet someone from my alma mater.
EDIT: refactored the question
First of all, congrats on the paper, it looks really nice. I was checking the code while porting it to PyTorch Lightning. I ran `1sourceTracking_icoCNN.py` and it gave me 28% GPU utilization with ~2 GB of VRAM. In these lines https://github.com/DavidDiazGuerra/icoDOA/blob/04d1a89594c78ae3cf42f07d94c3737bdc1f7c82/acousticTrackingLearners.py#L140-L142 you seem to have coded gradient accumulation, which depends on these lines https://github.com/DavidDiazGuerra/icoDOA/blob/04d1a89594c78ae3cf42f07d94c3737bdc1f7c82/1sourceTracking_icoCNN.py#L83-L84
Can you explain the role of these parameters? So far it seems you cannot run a batched forward pass, and you defined `trajectories_per_batch` to emulate batches by doing gradient accumulation. What, then, is the role of `trajectories_per_gpu_call`? And can I somehow increase the GPU usage?

Best, Juan