There are three operations that touch the replay buffer:
Actor: enqueue adds new transitions
Learner: sample_proportional_from_buffer() samples transitions from the replay buffer
Learner: replay_buffer.assign(idxs, new_priorities) updates priorities
My question: how does the replay buffer ensure thread safety?
The entries at idxs may already have been replaced by the time the learner calls assign, since the actor keeps performing enqueue operations in between.
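
To make the concern concrete, here is a minimal single-threaded sketch of the interleaving I have in mind. TinyReplayBuffer and its methods are hypothetical stand-ins written for this question, not the project's actual replay buffer, and the actor/learner interleaving is simulated by call order rather than real threads.

```python
import random


class TinyReplayBuffer:
    """Hypothetical ring buffer for illustration; not the real implementation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = [None] * capacity
        self.priorities = [0.0] * capacity
        self.next_slot = 0

    def enqueue(self, transition, priority=1.0):
        # Actor side: once the buffer is full, the oldest slot is overwritten.
        slot = self.next_slot % self.capacity
        self.items[slot] = transition
        self.priorities[slot] = priority
        self.next_slot += 1

    def sample_proportional(self, k, rng):
        # Learner side: draw slot indices with probability proportional to priority.
        return rng.choices(range(self.capacity), weights=self.priorities, k=k)

    def assign(self, idxs, new_priorities):
        # Learner side: priorities are written back by slot index only.
        for i, p in zip(idxs, new_priorities):
            self.priorities[i] = p


buf = TinyReplayBuffer(capacity=2)
buf.enqueue("old_0")
buf.enqueue("old_1")

# Learner samples and remembers which transition it actually trained on.
idxs = buf.sample_proportional(k=1, rng=random.Random(0))
sampled = [buf.items[i] for i in idxs]

# Actor overwrites every slot before the learner writes priorities back.
buf.enqueue("new_0")
buf.enqueue("new_1")

# Learner updates priorities using the now-stale indices.
buf.assign(idxs, [9.0])
stale_slot = idxs[0]
print("trained on:", sampled[0])
print("priority 9.0 landed on:", buf.items[stale_slot])
```

In this sketch the new priority ends up attached to a transition the learner never saw. A lock around each method would prevent torn reads and writes, but it would not by itself prevent this stale-index case, which is the part I am asking about.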