datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.

http://www.rlcard.org

MIT License

2.91k stars, 630 forks

Process finished with exit code 137 (interrupted by signal 9: SIGKILL) #126

Closed: gogov5 closed this issue 3 years ago

gogov5 commented 4 years ago

When I run the DouDizhu NFSP agent, I get the following error after a few minutes. I am using TensorFlow 1.15 on an Nvidia GPU. I tried different batch sizes (256 and 64), but the error is the same:

Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
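For readers unfamiliar with the number in this message: by shell convention, an exit code above 128 usually means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9. The sketch below decodes such a code with Python's standard `signal` module; `describe_exit_code` is a hypothetical helper name used only for illustration, and it assumes a POSIX system.

```python
import signal


def describe_exit_code(code):
    """Return the signal name for a shell-style exit code, or None.

    Assumes the common convention exit_code = 128 + signal_number.
    """
    if code <= 128:
        return None  # ordinary exit status, not a signal
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(describe_exit_code(137))
```

A SIGKILL cannot be caught by the process itself, which is why no Python traceback appears before the message.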

gogov5 commented 4 years ago

My Nvidia GPU has 8 GB of RAM.

daochenzha commented 4 years ago

@gogov5 The process may have run out of memory and been killed by the system. The agent has to maintain a buffer of past experiences. You could try a smaller buffer and see whether the process still gets killed.
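To get a feel for how quickly an experience buffer grows, here is a rough back-of-the-envelope sketch. It assumes each stored transition holds two float32 state arrays (state and next state) and ignores Python object overhead, so real usage would be higher. The `state_size=450` value and the function name are placeholders for illustration, not values taken from rlcard.

```python
def estimate_buffer_bytes(capacity, state_size, bytes_per_value=4, states_per_transition=2):
    """Lower-bound estimate of the raw array memory held by a full buffer."""
    return capacity * states_per_transition * state_size * bytes_per_value


# Compare a few hypothetical capacities; state_size is a placeholder.
for capacity in (int(1e4), int(1e5), int(1e6)):
    mib = estimate_buffer_bytes(capacity, state_size=450) / 2**20
    print(f"{capacity:>9,d} transitions: at least {mib:,.0f} MiB")
```

Note that a setup with several agents, each keeping its own buffers, multiplies this figure accordingly.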

gogov5 commented 4 years ago

Thank you for your reply. I use the same memory size (1000) for both the DQN agent and the NFSP agent, but the error only appears when running the NFSP agent. I will look into the replay pool and report anything useful in this issue.

The initial memory size:

memory_init_size = 1000

daochenzha commented 4 years ago

@gogov5 You may want to change reservoir_buffer_capacity and q_replay_memory_size instead. These are the buffers for the average policy and for DQN, respectively.

q_replay_memory_init_size specifies how many data points we need to collect before we start training.

See https://github.com/datamllab/rlcard/blob/master/rlcard/agents/nfsp_agent.py#L38
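To illustrate why these two capacities bound memory use, here is a minimal sketch of the two buffer types: a fixed-size FIFO replay memory and a reservoir buffer that keeps a uniform sample of everything seen so far. This is not rlcard's implementation. The `ReservoirBuffer` class below is a hypothetical stand-in that uses standard reservoir sampling, and the capacities are toy values.

```python
import random
from collections import deque


class ReservoirBuffer:
    """Keeps at most `capacity` items, sampled uniformly from all items added."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.data = []
        self.seen = 0  # total number of items ever added
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(item)
        else:
            # Replace a stored item with probability capacity / seen.
            idx = self.rng.randrange(self.seen)
            if idx < self.capacity:
                self.data[idx] = item

    def __len__(self):
        return len(self.data)


replay = deque(maxlen=5)               # FIFO memory: oldest items are dropped
reservoir = ReservoirBuffer(5, seed=0)  # sample over the whole history
for step in range(100):
    replay.append(step)
    reservoir.add(step)

print(len(replay), len(reservoir))
```

Neither buffer grows past its capacity, so lowering the capacity directly lowers the peak memory the buffer can hold.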

gogov5 commented 4 years ago

Yes, thanks a lot. I set q_replay_memory_size=int(1e4), replacing 1e5 with 1e4, and everything now runs well.