loserZhang closed this issue 3 years ago
@loserZhang The example should be almost the same as the CFR one in https://github.com/datamllab/rlcard/blob/master/examples/leduc_holdem_cfr.py, except that we need to pass in a tf.Session during initialization.
The structure of DeepCFR is very similar to CFR, but with neural networks as function approximators. We have tried hard with the current DeepCFR implementation but still could not get it to converge, so we did not include it as an example.
If you are interested, you may try different hyperparameters or network architectures for DeepCFR, and let me know if you manage to make it converge :)
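For readers unfamiliar with what the two algorithms share: both turn accumulated regrets into a strategy via regret matching. Tabular CFR keeps those regrets in a table per information set, while Deep CFR trains a network to predict them. The snippet below is only a minimal sketch of that shared step; the function name `regret_matching` is hypothetical and is not part of rlcard's API.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets for one information set into a strategy.

    Hypothetical helper for illustration only (not rlcard code).
    Actions with positive regret are played in proportion to that regret;
    if no action has positive regret, fall back to a uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Action 2 has three times the positive regret of action 0,
# and action 1 has negative regret, so it gets probability 0.
strategy = regret_matching([1.0, -1.0, 3.0])
print(strategy)  # [0.25, 0.0, 0.75]
```

In a Deep CFR setting the list passed in would come from a network's output rather than from a stored table, which is one place where approximation error can hurt convergence.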
@daochenzha Thank you very much. I have another question: when I combine NFSP with multi-processing, the variable self._reservoir_buffer is always 0.
@loserZhang Thank you for letting us know. That does not look normal. Currently, we only provide an example of parallelization with DQN, so you may run into bugs when combining multi-processing with NFSP.
Do you think it is a good idea to implement a general wrapper for parallelization? We may implement this feature in the future.
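For context, NFSP stores the data for its average policy in a buffer filled by reservoir sampling, so every transition seen so far has the same chance of being retained. Below is a minimal sketch of such a buffer; the class and attribute names (`ReservoirBuffer`, `add_calls`) are hypothetical and chosen for illustration, not taken from rlcard.

```python
import random


class ReservoirBuffer:
    """Fixed-size buffer that keeps a uniform sample of everything added.

    Hypothetical sketch for illustration only (not rlcard's implementation).
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.data = []
        self.add_calls = 0  # total number of items offered so far
        self._rng = random.Random(seed)

    def add(self, item):
        if len(self.data) < self.capacity:
            # Buffer not full yet: always keep the item.
            self.data.append(item)
        else:
            # Pick a slot among all items seen so far (including this one);
            # the new item is kept only if that slot lies inside the buffer.
            idx = self._rng.randint(0, self.add_calls)
            if idx < self.capacity:
                self.data[idx] = item
        self.add_calls += 1

    def sample(self, n):
        # Draw n distinct stored items uniformly at random.
        return self._rng.sample(self.data, n)

    def __len__(self):
        return len(self.data)


buf = ReservoirBuffer(capacity=10, seed=0)
for step in range(100):
    buf.add(step)
print(len(buf), buf.add_calls)  # 10 100
```

When debugging a buffer that looks empty under multi-processing, checking `len(buf)` and `add_calls` in the same process that calls `add` is a reasonable first step, since an object mutated in a worker process is a separate copy from the one held by the parent.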
Thank you. It was my mistake when combining them, and I have solved it. Do you have any ideas on combining MCTS with NFSP? Reference: https://arxiv.org/pdf/1903.09569.pdf
Thanks for letting us know. Monte Carlo Tree Search with fictitious self-play for imperfect-information games seems to be a promising direction, but dealing with large state/action spaces is still challenging. Abstraction techniques might help, such as those in the following recent papers: [1] https://www.ijcai.org/Proceedings/15/Papers/084.pdf [2] https://ieeexplore.ieee.org/abstract/document/8848034 [3] http://www.csse.uwa.edu.au/cig08/Proceedings/papers/8057.pdf [4] https://core.ac.uk/download/pdf/82710979.pdf
You have implemented the deep_cfr algorithm in your code, but there is no example for it.