takoika / PrioritizedExperienceReplay

Yet another prioritized experience replay buffer implementation.
MIT License

Occasionally selects None #3

Open pender opened 6 years ago

pender commented 6 years ago

Hi, thank you for this clean and elegant implementation!

I am encountering an issue where, very rarely, the prioritized replay buffer returns None, even though there are a bunch of entries with weight > 0. I've noticed this happens when a random r is chosen that is very close to 1.0 (0.9999321770440263 in my latest try), and it always selects the None at index 131071 (which is 2^17-1). I am using a tree of memory size 2^25 (33.5 million), and usually I have about 70,000 positive entries in the buffer at first.

Is there a clean solution to this issue, or is this just an unavoidable problem with using such a large tree? As a workaround I'm just having it keep choosing entries until it picks one that is not None, but maybe there's a better way. Thanks a lot!
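One possible explanation, offered as an assumption and not as a diagnosis of this repository's code: in a sum tree, the sampled value is compared against the left child's sum at every level, and rounding in those sums can push a value near the top of the range into a right-hand subtree whose total priority is zero. From there the descent can only end on an empty leaf. The sketch below is a minimal array-backed sum tree in Python that guards against this by never descending into a zero-priority subtree. The class `SumTree` and its methods `set`, `find`, and `sample` are hypothetical names chosen for illustration; they are not the API of this project.

```python
import random


class SumTree:
    """Minimal array-backed sum tree (illustrative, not the repository's code).

    capacity must be a power of two. Node i has children 2*i and 2*i + 1,
    the root is node 1, and leaves occupy capacity .. 2*capacity - 1.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.sums = [0.0] * (2 * capacity)
        self.data = [None] * capacity

    def set(self, index, item, priority):
        """Store item at leaf `index` with a non-negative priority."""
        self.data[index] = item
        node = index + self.capacity
        self.sums[node] = priority
        node //= 2
        while node >= 1:
            # Recompute each ancestor from its two children.
            self.sums[node] = self.sums[2 * node] + self.sums[2 * node + 1]
            node //= 2

    def find(self, r):
        """Return the leaf index selected by r in [0, 1]."""
        if self.sums[1] <= 0.0:
            raise ValueError("tree holds no entries with positive priority")
        value = r * self.sums[1]
        node = 1
        while node < self.capacity:
            left, right = 2 * node, 2 * node + 1
            # Guard: go left whenever the right subtree is empty, even if
            # rounding made `value` reach or exceed the left sum.
            if value < self.sums[left] or self.sums[right] <= 0.0:
                node = left
            else:
                value -= self.sums[left]
                node = right
        return node - self.capacity

    def sample(self):
        """Draw one (index, item) pair in proportion to priority."""
        index = self.find(random.random())
        return index, self.data[index]


# Three populated leaves in a tree of eight; the rest stay empty.
tree = SumTree(8)
for i, priority in enumerate([1.0, 2.0, 3.0]):
    tree.set(i, "item%d" % i, priority)

last = tree.find(1.0)  # 2, the last populated leaf, not the empty leaf 7
```

Because the current node always has a positive sum, and the descent only enters a child whose sum is positive, the leaf that is reached always has positive priority. This removes the need for a retry loop, although retrying until the result is not None remains a valid workaround.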