I'd like to ask whether there is some way to improve GPU utilization during training. With the default config (8 GPUs * batch_per_gpu=2) on 8x A6000, each logged iteration takes about 6 s, of which data_time is about 0.3 s. That is roughly twice the time reported for your A100 setup. I also see GPU utilization hovering around 50% (whereas most mmdet3d training runs stay above 80%), so the bottleneck may be on the CPU side. Does the NeRF-related processing take extra time somewhere, or is the data-loading IO too slow? Could you give some guidance? Thanks.
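In case it helps, this is the kind of dataloader override I have been experimenting with on my side to check whether the CPU loaders are the bottleneck. The keys follow the usual mmdet/mmdet3d 2.x `data` config convention, and the values are only my guesses, not settings confirmed by your repo:

```python
# Dataloader tweaks I tried (mmdet/mmdet3d 2.x style config keys;
# the specific values are my own experiments, not from the official config).
data = dict(
    samples_per_gpu=2,        # keep the default per-GPU batch size
    workers_per_gpu=6,        # more CPU worker processes per GPU for data loading
    persistent_workers=True,  # keep workers alive between epochs to avoid respawn overhead
)
```

With these changes data_time drops a bit, but GPU utilization still does not go much above ~50%, which is why I suspect something in the per-iteration (possibly NeRF-related) processing rather than pure IO.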