Equim-chan / Mortal

🚀🀄️ A fast and strong AI for riichi mahjong, powered by Rust and deep reinforcement learning.
https://mortal.ekyu.moe
GNU Affero General Public License v3.0
929 stars, 118 forks

Numerical error in the result of the calc_delta_points method in reward_calculator.py #63

Closed Koyonomi closed 9 months ago

Koyonomi commented 9 months ago

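I tried using the calc_delta_points method to compute each player's point change per kyoku. In the last kyoku of every game, the result contains an unreasonable negative value, and I don't know what it represents.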

Equim-chan commented 9 months ago

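Mortal does not currently use the calc_delta_points method. Given that, you should at least attach the code you wrote that calls it, along with the corresponding input. Otherwise, how am I supposed to reproduce this?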

Koyonomi commented 9 months ago

I used the method that already exists in reward_calculator.py for the calculation.

reward_calculator.py

    def calc_delta_points(self, player_id, grp_feature, final_scores):
        seq = np.concatenate((grp_feature[:, 3 + player_id] * 1e5, np.array([final_scores[player_id]])))
        delta_points = seq[1:] - seq[:-1]
        return delta_points

dataloader.py

    grp_feature = grp.take_feature()
    final_scores = grp.take_final_scores()
    player_id = game.take_player_id()
    kyoku_points = self.reward_calc.calc_delta_points(player_id, grp_feature, final_scores)
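For anyone trying to reproduce this, here is a minimal standalone sketch of what the quoted method computes. The toy data is hypothetical, and the layout of grp_feature (three leading columns followed by four score columns scaled by 1e-5) is only an assumption inferred from the `3 + player_id` index and the `* 1e5` factor in the snippet, not something confirmed in this thread.

```python
import numpy as np

def calc_delta_points(player_id, grp_feature, final_scores):
    # Standalone copy of the quoted method, with `self` dropped.
    seq = np.concatenate((
        grp_feature[:, 3 + player_id] * 1e5,   # score at the start of each kyoku
        np.array([final_scores[player_id]]),   # score at the end of the game
    ))
    # Difference between consecutive entries = point change in each kyoku.
    return seq[1:] - seq[:-1]

# Hypothetical 3-kyoku game: one row per kyoku, one column per player.
start_scores = np.array([
    [25000, 25000, 25000, 25000],
    [33000, 21000, 23000, 23000],
    [31000, 21000, 25000, 23000],
], dtype=np.float64)

# Assumed layout: 3 leading columns (unused here) + 4 scaled score columns.
grp_feature = np.hstack((np.zeros((3, 3)), start_scores / 1e5))
final_scores = [29000, 21000, 27000, 23000]

deltas = calc_delta_points(0, grp_feature, final_scores)
print(np.round(deltas))
```

The last element is the only one that mixes final_scores with the rescaled feature column, so it is only meaningful if both are expressed in the same unit.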

Koyonomi commented 9 months ago

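Thank you for the effort you put into this. I have applied the change. I have also confirmed that the new numerical error comes from the limits of floating-point precision and does not affect actual training.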
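A small sketch of how this kind of precision error can appear, assuming the scaled scores are stored in single precision (float32). That storage type is an assumption for illustration and is not stated in this thread; the scores below are made up.

```python
import numpy as np

# Hypothetical per-kyoku scores for one player.
scores = np.array([25000, 33000, 31000, 29000], dtype=np.float64)

# Assumption: the feature stores scores / 1e5 as float32.
scaled32 = (scores / 1e5).astype(np.float32)

# Scaling back up amplifies the float32 rounding error.
restored = scaled32.astype(np.float64) * 1e5
deltas = restored[1:] - restored[:-1]

exact = scores[1:] - scores[:-1]
err = np.abs(deltas - exact)
print(deltas)      # close to, but not exactly, the integer deltas
print(err.max())   # small non-zero deviation

# Rounding recovers the integer point changes.
cleaned = np.round(deltas).astype(np.int64)
print(cleaned)
```

Deviations of this size are far below one point, which is consistent with the observation that they do not matter for training. If exact integers are needed for display, rounding as in the last step is one option.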