AI4Finance-Foundation / ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

Implementation bug in Prioritized Experience Replay #329

Open · ModernGangster opened this issue 1 year ago

ModernGangster commented 1 year ago

File "/home/moderngangster/Codes/APC-Flight/ElegantRL/examples/../elegantrl/agents/AgentSAC.py", line 43, in update_net obj_critic, state = self.get_obj_critic(buffer, self.batch_size) File "/home/moderngangster/Codes/APC-Flight/ElegantRL/examples/../elegantrl/agents/AgentSAC.py", line 81, in get_obj_critic_per states, actions, rewards, undones, next_ss, is_weights, is_indices = buffer.sample_for_per(batch_size) File "/home/moderngangster/Codes/APC-Flight/ElegantRL/examples/../elegantrl/train/replay_buffer.py", line 134, in sample_for_per _is_indices, _is_weights = sum_tree.important_sampling(batch_size, beg, end, self.per_beta) File "/home/moderngangster/Codes/APC-Flight/ElegantRL/examples/../elegantrl/train/replay_buffer.py", line 267, in important_sampling assert 0 <= i

ChenqiuXD commented 10 months ago

I fixed this bug by modifying the `update_ids()` function in the `SumTree` class; as I recall, the iteration depth in that function was wrong. You can verify that `update_ids()` works correctly by checking that the root node, `tree[0]`, equals the sum of all leaf priorities. However, the importance-sampling weight calculation also seems to contain bugs.
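If it helps with debugging, here is a hedged sketch of that consistency check. The layout (leaves stored at `tree[capacity - 1:]`) is an assumption for illustration, not necessarily ElegantRL's actual layout:

```python
# Consistency check sketch (hypothetical layout): after every priority
# update, the root of a sum tree must equal the sum of its leaves.
import numpy as np

def check_sum_tree(tree: np.ndarray, capacity: int, atol: float = 1e-4) -> None:
    leaf_sum = tree[capacity - 1:].sum()           # total leaf priority
    assert np.isclose(tree[0], leaf_sum, atol=atol), (
        f"root {tree[0]} != leaf sum {leaf_sum}; the update propagation "
        "(e.g. an update_ids-style loop) likely stops at the wrong depth"
    )
```

As for the weights: the standard PER importance weight from Schaul et al. (2016) is w_i = (N * P(i))^-beta, normalized by the batch maximum, so any deviation from that formula would be worth checking too.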