Feat/remove agent vmapping

What?

Remove manual vmapping over agents and instead rely on Flax's autobatching, since parameters are shared.

Why?

Our policies return distributions that can be sampled from. With both Distrax and TensorFlow Probability, we ran into issues where the creation of distributions for continuous action space policies cannot be vmapped. To keep things consistent across systems, we opted to remove vmapping over agents.

instadeepai / Mava

Feat/remove agent vmapping #1038
