sachiel321 closed this issue 2 years ago
The above is Equation (10) in the paper, but I can't find it in the current implementation.
In the file
And
I cannot find any preprocessing of the advantages like Equation (10) in your paper.
I would appreciate knowing how the iterative updates in Algorithm 1 are represented in the code.
I figured it out... You prepare a trainer for each agent in .
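For anyone else reading, here is a minimal sketch of how "one trainer per agent" can represent an iterative update loop. Everything in it is hypothetical and not taken from the repository: `AgentTrainer`, `sequential_update`, the one-parameter Bernoulli policy, and the running `factor`. It also assumes that Equation (10) scales the advantage by the probability ratios of the agents that were already updated, which this thread does not confirm.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class AgentTrainer:
    """Hypothetical per-agent trainer with a one-parameter Bernoulli policy."""

    def __init__(self, logit=0.0):
        self.logit = logit

    def log_prob(self, action):
        p = sigmoid(self.logit)
        return math.log(p if action == 1 else 1.0 - p)

    def update(self, actions, weighted_adv, lr=0.1):
        # One gradient-ascent step on mean(weighted_adv * log pi(action)).
        # For a Bernoulli policy, d log pi(a) / d logit = a - p.
        p = sigmoid(self.logit)
        grad = sum(w * (a - p) for a, w in zip(actions, weighted_adv)) / len(actions)
        self.logit += lr * grad


def sequential_update(trainers, actions, advantages, order):
    """Update agents one at a time, carrying a per-sample factor forward.

    ASSUMPTION: the factor is the product of new/old probability ratios of
    the agents updated so far, and it multiplies the shared advantage.
    """
    factor = [1.0] * len(advantages)
    for i in order:
        old_lp = [trainers[i].log_prob(a) for a in actions[i]]
        weighted = [f * adv for f, adv in zip(factor, advantages)]
        trainers[i].update(actions[i], weighted)
        new_lp = [trainers[i].log_prob(a) for a in actions[i]]
        # Fold this agent's probability ratio into the factor for the next agent.
        factor = [f * math.exp(n - o) for f, n, o in zip(factor, new_lp, old_lp)]
    return factor
```

In this sketch the advantage is never preprocessed in one place; the weighting appears only as the `factor` that is passed from one agent's trainer to the next inside the loop.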
@sachiel321 Hi, I cannot find the implementation either. Can you point out where I can find it?