DavidDiazGuerra / icoDOA

Code repository for the paper Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs
GNU Affero General Public License v3.0

Gradient accumulation #1

Closed by JuanFMontesinos 1 year ago

JuanFMontesinos commented 1 year ago

Hi, it's a pleasure to meet someone from my alma mater.

EDIT: to refactor the question

First of all, congrats on the paper, it looks really nice. I was checking the code while porting it to PyTorch Lightning. I ran 1sourceTracking_icoCNN.py and got 28% GPU utilization with ~2 GB of VRAM. In these lines: https://github.com/DavidDiazGuerra/icoDOA/blob/04d1a89594c78ae3cf42f07d94c3737bdc1f7c82/acousticTrackingLearners.py#L140-L142
you seem to have implemented gradient accumulation, which depends on these lines: https://github.com/DavidDiazGuerra/icoDOA/blob/04d1a89594c78ae3cf42f07d94c3737bdc1f7c82/1sourceTracking_icoCNN.py#L83-L84

Can you explain the role of

trajectory_idx = 0
trajectories_per_batch = 5
trajectories_per_gpu_call = 1

So far, it seems that you cannot run a batched forward pass and that you defined trajectories_per_batch to emulate batches through gradient accumulation. What, then, is the role of trajectories_per_gpu_call? Is there a way to increase the GPU usage?

Best, Juan

DavidDiazGuerra commented 1 year ago

Hi Juan,

Nice to meet you! It's true that there aren't many of us, or at least we don't usually draw much attention to ourselves.

Those lines do implement gradient accumulation. trajectories_per_gpu_call is the batch size for GPU computation, while trajectories_per_batch is the batch size for gradient calculation and optimization. Gradient accumulation is not needed on most GPUs when working with maps of resolution r=2, but if you increase the resolution you might find it necessary, since memory consumption grows quite quickly with the map resolution.
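To make the relationship between the two batch sizes concrete, here is a minimal, framework-free sketch of gradient accumulation. It is not the code from acousticTrackingLearners.py: the toy model y = w * x, the data, and the helpers grad_mse and accumulated_grad are hypothetical, and only the two parameter names come from this thread. It uses a batch of 6 items (rather than the 5 in the script) so that more than one chunk size can be compared.

```python
# Framework-free sketch of gradient accumulation on a toy model y = w * x.
# The model, data, and helper names are hypothetical illustrations.

def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, trajectories_per_gpu_call):
    """Build the full-batch gradient from chunks of trajectories_per_gpu_call items."""
    trajectories_per_batch = len(xs)
    grad = 0.0
    for start in range(0, trajectories_per_batch, trajectories_per_gpu_call):
        chunk_xs = xs[start:start + trajectories_per_gpu_call]
        chunk_ys = ys[start:start + trajectories_per_gpu_call]
        # Weight each chunk mean so the sum equals the mean over the whole batch.
        grad += grad_mse(w, chunk_xs, chunk_ys) * len(chunk_xs) / trajectories_per_batch
    return grad


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
w = 0.5

full = grad_mse(w, xs, ys)                                          # one big call
by_one = accumulated_grad(w, xs, ys, trajectories_per_gpu_call=1)   # six small calls
by_three = accumulated_grad(w, xs, ys, trajectories_per_gpu_call=3) # two medium calls

# A single optimizer step would then use the accumulated gradient.
learning_rate = 0.01
w_next = w - learning_rate * by_three
```

Up to floating-point rounding, all three gradients agree, so the chunk size only changes how much data is processed per call (and therefore peak memory), not the update itself.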

You can increase trajectories_per_gpu_call to 5 to increase the GPU usage. However, the training time is mainly limited by the reverberation simulation in the dataloader, so you probably won't get the speedup you expect from increasing the GPU usage in the forward pass of the network. In any case, any value of trajectories_per_gpu_call should work fine as long as it is a divisor of trajectories_per_batch.
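As a quick illustration of the divisor constraint, the sketch below lists the values of trajectories_per_gpu_call that split a batch evenly. The helper valid_gpu_call_sizes is hypothetical and not part of the repository.

```python
def valid_gpu_call_sizes(trajectories_per_batch):
    """Return every chunk size that divides the batch evenly."""
    return [n for n in range(1, trajectories_per_batch + 1)
            if trajectories_per_batch % n == 0]


trajectories_per_batch = 5
sizes = valid_gpu_call_sizes(trajectories_per_batch)

for size in sizes:
    calls = trajectories_per_batch // size
    print(f"trajectories_per_gpu_call={size}: {calls} GPU call(s) per optimizer step")
```

With trajectories_per_batch = 5, the only options are 1 and 5, since 5 is prime. A batch size with more divisors would allow intermediate chunk sizes.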

Best, David

JuanFMontesinos commented 1 year ago

Thanks! I have some more questions but will open a new issue instead.