Tomeu7 / AO-MARL

Implementation of Adaptive Optics control with Multi-Agent Model-Free Reinforcement Learning
GNU General Public License v3.0
11 stars 0 forks source link

4woker ,the last one train failure #1

Open nananana1988 opened 1 hour ago

nananana1988 commented 1 hour ago

Four wokers participated in the training, three rewards continued to increase, the fourth one decreased, and the loss of the fourth worker did not decrease but instead increased, rising to the power of 10 ^ 33

nananana1988 commented 1 hour ago

I want to know why did this situation occur and how should I solve it