PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning
https://parl.readthedocs.io/
Apache License 2.0

[feature request] save original model in Distributed Data Parallel mode #1022

Open TomorrowIsAnOtherDay opened 1 year ago

TomorrowIsAnOtherDay commented 1 year ago

We are now developing transformer-based algorithms for offline agents. Such algorithms often require a large amount of expert data as demonstrations for training. To cope with large-scale data, a common solution is to employ Distributed Data Parallel (DDP) mode to accelerate training. As discussed here, after we wrap the model with the DDP wrapper, the model parameters saved by calling save_state_dict() cannot be loaded directly by the original model. The official torch developers suggest fetching the original module and saving its parameters; examples can be found here.
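
A minimal sketch of the behaviour described above (not PARL code; the `nn.Linear` model and checkpoint paths are placeholders, and the process group is assumed to be initialized already): a checkpoint written from the DDP wrapper carries "module."-prefixed keys, while one written from the wrapped module loads cleanly into the original model.

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

model = nn.Linear(16, 4)        # stand-in for the real model
ddp_model = DDP(model)          # assumes init_process_group() was called

# Saving the wrapper: keys look like "module.weight", "module.bias",
# so a plain nn.Linear(16, 4) cannot load this checkpoint directly.
torch.save(ddp_model.state_dict(), "ddp_ckpt.pth")

# Saving the wrapped module instead keeps the original key names.
torch.save(ddp_model.module.state_dict(), "ckpt.pth")

# The original, unwrapped model can then restore the checkpoint as usual.
plain_model = nn.Linear(16, 4)
plain_model.load_state_dict(torch.load("ckpt.pth"))
```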

PARL agents must support this feature in both their save and restore functions.
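
One possible shape for this inside an agent (a sketch only, not PARL's actual implementation; the `self.model` attribute and `save_path` argument are assumptions for illustration):

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def save(self, save_path):
    # Unwrap the DDP container so the checkpoint keeps the original key names.
    model = self.model.module if isinstance(self.model, DDP) else self.model
    torch.save(model.state_dict(), save_path)

def restore(self, save_path):
    # Load into the underlying module whether or not DDP is in use.
    model = self.model.module if isinstance(self.model, DDP) else self.model
    model.load_state_dict(torch.load(save_path))
```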

TomorrowIsAnOtherDay commented 1 year ago

We will also support this feature in paddle-based agents.