facebookresearch / habitat-sim

A flexible, high-performance 3D simulator for Embodied AI research.
https://aihabitat.org/
MIT License

Parallelising image rendering of multiple sensors #2217

Open rishubhsingh opened 11 months ago

rishubhsingh commented 11 months ago

I was looking into parallelising the image rendering for multiple sensors in simulator.py; however, the objects involved can't be pickled and there isn't support for async rendering either. Is there a way that rendering for multiple sensors of an agent can be parallelised?

aclegg3 commented 11 months ago

@rishubhsingh I'm curious what you're hoping to gain here?

Visual sensor rendering uses GPU operations to rasterize the scene. Without implementing something more complex on the shader side or leveraging multiple GPUs for a single simulator process, I don't know how you would intend to parallelize these rendering passes.

Curious what you have in mind.

rishubhsingh commented 11 months ago

@aclegg3 I am trying to work with agents that have multiple sensors (up to 20 RGB sensors) at different positions and orientations. The current implementation of get_sensor_observations() loops over all agents and over all sensors within each agent.
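
For concreteness, here is roughly what the setup looks like (a sketch, assuming a recent habitat-sim Python API; the scene path, resolution, and orientations are placeholders):

```python
import math

import habitat_sim

sim_cfg = habitat_sim.SimulatorConfiguration()
sim_cfg.scene_id = "path/to/scene.glb"  # placeholder scene

# Up to 20 RGB sensors on one agent, spread around the yaw axis.
sensor_specs = []
for i in range(20):
    spec = habitat_sim.CameraSensorSpec()
    spec.uuid = f"rgb_{i}"
    spec.sensor_type = habitat_sim.SensorType.COLOR
    spec.resolution = [480, 640]
    spec.orientation = [0.0, 2 * math.pi * i / 20, 0.0]
    sensor_specs.append(spec)

agent_cfg = habitat_sim.agent.AgentConfiguration(sensor_specifications=sensor_specs)
sim = habitat_sim.Simulator(habitat_sim.Configuration(sim_cfg, [agent_cfg]))

# Internally this loops over every agent and every sensor, one render pass each.
obs = sim.get_sensor_observations()
rgb_images = [obs[f"rgb_{i}"] for i in range(20)]
```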

I want to leverage multiple GPUs to parallelize this rendering by splitting the loop across parallel processes; however, the necessary class objects cannot be pickled (even by dill), so multiprocessing doesn't work. Additionally, the notes for the batch renderer say it only supports one sensor right now, which doesn't solve this problem.
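
The closest workaround I can see is building a completely separate Simulator inside each worker process so that nothing has to be pickled, but that duplicates the scene assets per process and the agent state has to be synced by hand. A rough sketch of what I mean (scene path, sensor counts, and GPU split are placeholders):

```python
import multiprocessing as mp

import numpy as np

import habitat_sim


def render_worker(gpu_id, yaws, state_q, obs_q):
    # Each worker owns its own Simulator on its own GPU, so no simulator
    # objects cross process boundaries; only plain numpy data does.
    sim_cfg = habitat_sim.SimulatorConfiguration()
    sim_cfg.scene_id = "path/to/scene.glb"  # placeholder
    sim_cfg.gpu_device_id = gpu_id

    specs = []
    for i, yaw in enumerate(yaws):
        spec = habitat_sim.CameraSensorSpec()
        spec.uuid = f"rgb_{gpu_id}_{i}"
        spec.sensor_type = habitat_sim.SensorType.COLOR
        spec.orientation = [0.0, yaw, 0.0]
        specs.append(spec)

    agent_cfg = habitat_sim.agent.AgentConfiguration(sensor_specifications=specs)
    sim = habitat_sim.Simulator(habitat_sim.Configuration(sim_cfg, [agent_cfg]))

    while True:
        position = state_q.get()  # plain array, which does pickle
        if position is None:
            break
        state = habitat_sim.AgentState()
        state.position = np.array(position, dtype=np.float32)
        sim.get_agent(0).set_state(state)
        obs_q.put(sim.get_sensor_observations())  # dict of numpy arrays
    sim.close()


if __name__ == "__main__":
    mp.set_start_method("spawn")  # GL/CUDA contexts do not survive fork
    state_qs, obs_qs, workers = [], [], []
    for gpu_id, yaws in enumerate([[0.0, 0.6, 1.2], [1.8, 2.4, 3.0]]):  # 2 GPUs, 3 sensors each
        sq, oq = mp.Queue(), mp.Queue()
        p = mp.Process(target=render_worker, args=(gpu_id, yaws, sq, oq))
        p.start()
        state_qs.append(sq)
        obs_qs.append(oq)
        workers.append(p)

    # Broadcast the same agent position to every worker, then gather observations.
    for sq in state_qs:
        sq.put([1.0, 0.0, 1.0])
    observations = {}
    for oq in obs_qs:
        observations.update(oq.get())

    for sq in state_qs:
        sq.put(None)
    for p in workers:
        p.join()
```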

I also tried using start_async_render() and get_sensor_observations_async_finish() instead of get_sensor_observations(), which to my understanding use multithreading at the C++ level; however, a simple verification setup empirically showed it isn't faster (and is, counterintuitively, slower). Does this setup actually use multithreading, or is the intended use case something else?
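
The verification setup was roughly the following timing comparison (a sketch; sim is a Simulator configured with the sensors above):

```python
import time


def time_serial(sim, steps=100):
    # Baseline: render all sensors synchronously each step.
    t0 = time.perf_counter()
    for _ in range(steps):
        obs = sim.get_sensor_observations()
    return time.perf_counter() - t0


def time_async(sim, steps=100):
    # What I tried: the async pair, called back to back. Note that no other
    # work happens between the two calls, so nothing overlaps the render.
    t0 = time.perf_counter()
    for _ in range(steps):
        sim.start_async_render()
        obs = sim.get_sensor_observations_async_finish()
    return time.perf_counter() - t0
```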

The primary need is to parallelise rendering for multiple sensors within a single environment (assuming the needed resources are available), and I haven't been able to do it -- is there a suggested solution or an existing function/setup that can be used?

aclegg3 commented 10 months ago

Hey @rishubhsingh,

I understand your objective a bit better now. I agree that rendering 20+ sensors per observation is likely to be a significant bottleneck.

The best way to do this would be to leverage something like multi-draw, where the scene contents remain loaded in the GPU shader context and only the camera matrix is modified, allowing quick multi-view rendering of a single scene state. We are looking into features like this with the batched renderer, but have not made it that far yet.

Spreading the load across multiple GPUs sounds like an attractive option, but remember that asset memory is loaded per GPU. This means all textures, meshes, etc. need to be uploaded to each GPU for each scene. The resulting image observations would then need to be piped to a learner process that is doing inference. I think we may find that the device-to-device copy overhead is also a significant bottleneck.

Regarding async rendering: this feature is intended to overlap physics and rendering in order to reduce the overhead of running dynamic simulation, e.g. rendering the previous state while running physics to produce the next state. In Habitat 2.0 we showed that this is a significant improvement for dynamic tasks. However, if rendering your sensor batch is already the bottleneck, this won't help you much.
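
For reference, a rough sketch of the intended pattern (assuming your build/config supports background rendering, and that physics can be stepped while the previous frame is still rendering):

```python
def step_with_overlapped_render(sim, dt=1.0 / 60.0):
    # Kick off rendering of the current state on the background thread,
    # advance physics while that render is in flight, then collect the frame.
    sim.start_async_render()
    sim.step_physics(dt)
    return sim.get_sensor_observations_async_finish()
```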