vincentberaud / Minecraft-Reinforcement-Learning

Deep Recurrent Q-Learning vs Deep Q Learning on a simple Partially Observable Markov Decision Process with Minecraft

experience_buffer() - sample: "ValueError: Sample larger than population or is negative" #6

Closed osbornep closed 4 years ago

osbornep commented 5 years ago

Following on from my previous issue, when I attempt to run the training episodes, once the agent goes past the "pre_train_steps", I am getting the following error:

[screenshot: traceback ending in "ValueError: Sample larger than population or is negative"]

This has been from simply replicating all steps given with the exact code copied from the notebook. The only small change I have made is to change the following parameters to test in reasonable time:

Is this a known issue, or should I investigate further?

Thanks

ClementRomac commented 5 years ago

Hi,

I'm glad you've resolved your previous issue.

For this one: from your error, I understand that you've set the NetType to "Convolutional", so you are using the "experience_buffer" to store steps. Each time you train your network (every "update_freq" timesteps), you draw a sample of size "batch_size" from this experience buffer (if you haven't changed it, it is 32). However, training only starts once "pre_train_steps" steps have been taken.

In your case, this means that the first time you try to sample 32 steps from the experience_buffer, it only holds the 10 steps stored so far (pre_train_steps having just been reached), so the requested sample is larger than the population.
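The error itself comes from Python's `random.sample`, which this kind of replay buffer typically wraps. A minimal sketch reproducing the situation (the `ExperienceBuffer` class here is illustrative, not the repo's exact code):

```python
import random

class ExperienceBuffer:
    """Illustrative list-backed replay buffer (not the repo's exact implementation)."""
    def __init__(self, buffer_size=50000):
        self.buffer = []
        self.buffer_size = buffer_size

    def add(self, experience):
        # Drop the oldest entry once the buffer is full.
        if len(self.buffer) + 1 > self.buffer_size:
            self.buffer = self.buffer[1:]
        self.buffer.append(experience)

    def sample(self, size):
        # random.sample raises "ValueError: Sample larger than
        # population or is negative" when size > len(self.buffer).
        return random.sample(self.buffer, size)

buffer = ExperienceBuffer()
for step in range(10):              # only pre_train_steps = 10 steps stored
    buffer.add(("state", "action", "reward"))

try:
    buffer.sample(32)               # batch_size = 32 > 10 stored steps
except ValueError as e:
    print(e)                        # -> Sample larger than population or is negative
```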

I believe you can solve this problem by making sure pre_train_steps >= batch_size (either by increasing pre_train_steps or by decreasing batch_size).
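A quick sketch of that guard, with illustrative variable names (the training-loop structure here is only assumed, not copied from the notebook):

```python
import random

batch_size = 32
pre_train_steps = 10        # the value that triggered the error

# Guard: make sure at least one full batch is stored before training starts.
pre_train_steps = max(pre_train_steps, batch_size)

buffer = []                 # stand-in for experience_buffer's internal list
train_batch = None
for total_steps in range(100):
    buffer.append(total_steps)  # one "experience" per timestep
    if total_steps >= pre_train_steps and len(buffer) >= batch_size:
        # Safe: the buffer now holds at least batch_size entries.
        train_batch = random.sample(buffer, batch_size)

print(len(train_batch))     # 32
```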

I hope I'm clear enough in these explanations. If not, just tell me.

Clément