datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License
2.86k stars 618 forks

Memory skyrocket #168

Closed (oohsjp closed 3 years ago)

oohsjp commented 4 years ago

When I ran doudizhu_nfsp training, memory usage climbed to 98%. The computer's virtual memory was then set to its maximum value of 130,000, but that slowly ran out as well and the program reported an error:

    File "C:\Anaconda3\lib\site-packages\numpy\lib\twodim_base.py", line 202, in eye
      m = zeros((N, M), dtype=dtype, order=order)
    numpy.core._exceptions.MemoryError: Unable to allocate array with shape (309, 309) and data type float64

How should I configure it to run correctly?

daochenzha commented 4 years ago

@chenzhenfeng-stack Hi, may I know how much memory you have? NFSP uses several neural networks, so it may require more memory to run. One possible fix is to use a smaller network so that it occupies less memory.

oohsjp commented 4 years ago

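@chenzhenfeng-stack Hi, may I know how much memory you have? NFSP uses several neural networks, so it may require more memory to run. One possible fix is to use a smaller network so that it occupies less memory.

I have 16 GB of memory. Roughly how much memory is needed to run this without spilling into virtual memory? And by a smaller network, do you mean adjusting the parameters?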

daochenzha commented 4 years ago

You may change the number of layers or the size of each layer to make the network smaller. For a large network, you may need a GPU for training.
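To give a rough feel for how layer count and layer width affect network size, here is a minimal sketch that counts the parameters of a fully connected network. It is not RLCard code: the helper name `mlp_param_count`, the state dimension, and both layer configurations are hypothetical values chosen for illustration. Only the 309 actions are taken from the array shape in the error message above.

```python
def mlp_param_count(input_dim, hidden_sizes, output_dim):
    """Count weights and biases of a fully connected network."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    # Each layer has n_in * n_out weights plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Assumed dimensions, for illustration only.
STATE_DIM = 450      # hypothetical flattened observation size
NUM_ACTIONS = 309    # from the (309, 309) shape in the error message

large_layers = [512, 1024, 2048, 1024, 512]  # hypothetical large config
small_layers = [128, 128]                    # hypothetical small config

large = mlp_param_count(STATE_DIM, large_layers, NUM_ACTIONS)
small = mlp_param_count(STATE_DIM, small_layers, NUM_ACTIONS)

# At 4 bytes per float32 parameter, weights alone take roughly this much.
print(f"large: {large:,} params, ~{large * 4 / 2**20:.1f} MiB")
print(f"small: {small:,} params, ~{small * 4 / 2**20:.1f} MiB")
```

This counts only one network's weights. Optimizer state, activations, and replay or reservoir buffers add to the total, and NFSP keeps more than one network per agent.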

zedwei commented 3 years ago

It seems the following line is causing a huge memory issue in the reservoir buffer (more than 10x what it should take), at least in my case:

    one_hot = np.eye(len(probs))[np.argmax(probs)]

I'm not sure whether it's related to some GC flaw, but when I changed it to something like the lines below, the memory explosion was resolved:

    one_hot = np.zeros(len(probs))
    one_hot[np.argmax(probs)] = 1

It might be worth looking into and verifying, @daochenzha. Hope it helps.
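One likely explanation, which the thread does not confirm, is that indexing a single row out of `np.eye(n)` with an integer returns a view. The view holds a reference to the full n-by-n matrix, so storing it in a buffer keeps the whole matrix alive. The sketch below compares the two patterns. `probs` is a random stand-in for a policy output and `retained_bytes` is a hypothetical helper.

```python
import numpy as np

n_actions = 309  # matches the (309, 309) shape in the error message
probs = np.random.default_rng(0).random(n_actions)  # stand-in policy output

# Original pattern: build an n x n identity matrix, then select one row.
via_eye = np.eye(len(probs))[np.argmax(probs)]

# Replacement pattern: allocate only the n-element vector.
via_zeros = np.zeros(len(probs))
via_zeros[np.argmax(probs)] = 1

# Both patterns produce the same one-hot vector.
assert np.array_equal(via_eye, via_zeros)


def retained_bytes(arr):
    """Bytes kept alive by arr: its base's buffer if it is a view, else its own."""
    return arr.base.nbytes if arr.base is not None else arr.nbytes


print("np.eye row  :", retained_bytes(via_eye), "bytes retained")
print("np.zeros vec:", retained_bytes(via_zeros), "bytes retained")
```

If the view is what the buffer stores, calling `.copy()` on the selected row would also release the matrix, though the `np.zeros` version avoids allocating it in the first place.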

daochenzha commented 3 years ago

@zedwei Hi! Thanks a lot for pointing this out! Yes, your new implementation is better than our previous one. np.eye generates a 2-D array. This may not be an issue for games with a small action space, but it could lead to memory issues when the action space is large. I have noted this issue.

Actually, we are working on a new version of the package (which should be released in early June). We will fix this issue in the new version.

daochenzha commented 3 years ago

Issue fixed in the new version.