This is a framework for research on multi-agent reinforcement learning and the implementation of the experiments in the paper "Shapley Q-value: A Local Reward Approach to Solve Global Reward Games".
Dear author,
Can I use the code 'value = self.value(state_, action_pol)' to calculate the credit assignment, or Shapley Q-value, of each agent? And does the sample size need to be set to 1?
Thank you.