Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

Agent (Player) unwrapped from the `_FabricModule` does not consider the chosen precision during env interaction #236

Closed belerico closed 6 months ago

belerico commented 6 months ago

Every time we call `agent.module` during environment interaction to unwrap the agent from the distributed strategy, we also unwrap it from the precision plugin. This means that if we are training an agent with float16 or bfloat16, the environment interaction still happens in float32.

I suggest wrapping every player agent with a `_FabricModule`, i.e. `_FabricModule(agent, precision=fabric.precision)`, so that the agent is unwrapped from the strategy but keeps the precision plugin.

cc @michele-milesi