Closed YukeWang96 closed 2 years ago
Hi,
What does the negative RScore in the output below mean? Should I apply abs(RScore) to get the actual reward? And how do I know when training is close to converging?
[Time: 943] [Episode: 1046 Score: -20.0000] [RScore: -20.3550 RPPS: 1255] [PPS: 1228 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 943] [Episode: 1047 Score: -20.0000] [RScore: -20.3550 RPPS: 1256] [PPS: 1229 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 943] [Episode: 1048 Score: -19.0000] [RScore: -20.3530 RPPS: 1259] [PPS: 1230 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 944] [Episode: 1049 Score: -21.0000] [RScore: -20.3530 RPPS: 1259] [PPS: 1231 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 944] [Episode: 1050 Score: -19.0000] [RScore: -20.3520 RPPS: 1258] [PPS: 1231 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 944] [Episode: 1051 Score: -20.0000] [RScore: -20.3530 RPPS: 1259] [PPS: 1232 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 945] [Episode: 1052 Score: -20.0000] [RScore: -20.3530 RPPS: 1259] [PPS: 1233 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 947] [Episode: 1053 Score: -20.0000] [RScore: -20.3530 RPPS: 1257] [PPS: 1231 TPS: 208] [NT: 2 NP: 2 NA: 33]
[Time: 947] [Episode: 1054 Score: -20.0000] [RScore: -20.3530 RPPS: 1257] [PPS: 1232 TPS: 208] [NT: 2 NP: 2 NA: 33]
Thanks!
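For anyone who wants to watch the trend instead of reading the raw log, here is a minimal sketch that pulls Score and RScore out of entries in the format posted above and reports how much RScore moved over the most recent entries. The names `parse_log` and `rscore_change` and the window size are hypothetical helpers, not part of the project, and the regular expression assumes the field layout stays exactly as shown in the post. The sign is kept as logged (no abs() is applied). Note that a flat RScore only shows that the logged value has stopped moving, not that the policy is good.

```python
import re

# Matches the leading fields of one log entry, assuming the layout shown in the post:
# [Time: T] [Episode: E Score: S] [RScore: R ...
ENTRY = re.compile(
    r"\[Time:\s*(\d+)\]\s*"
    r"\[Episode:\s*(\d+)\s+Score:\s*(-?\d+(?:\.\d+)?)\]\s*"
    r"\[RScore:\s*(-?\d+(?:\.\d+)?)"
)


def parse_log(text):
    """Return a list of (episode, score, rscore) tuples found in text."""
    return [
        (int(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in ENTRY.finditer(text)
    ]


def rscore_change(entries, window=5):
    """RScore of the newest entry minus RScore of the oldest entry in the last `window` entries."""
    if not entries:
        raise ValueError("no log entries to compare")
    recent = entries[-window:]
    return recent[-1][2] - recent[0][2]


# Two entries copied from the log in the post.
sample = (
    "[Time: 943] [Episode: 1046 Score: -20.0000] [RScore: -20.3550 RPPS: 1255] "
    "[PPS: 1228 TPS: 208] [NT: 2 NP: 2 NA: 33] "
    "[Time: 943] [Episode: 1048 Score: -19.0000] [RScore: -20.3530 RPPS: 1259] "
    "[PPS: 1230 TPS: 208] [NT: 2 NP: 2 NA: 33]"
)

entries = parse_log(sample)
for episode, score, rscore in entries:
    print(episode, score, rscore)
print("RScore change over window:", rscore_change(entries))
```

Running the same functions over the full log file at intervals gives a rough picture of whether RScore is still rising or has flattened out.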