Open fengxijia opened 1 year ago
The reward can be negative. Although a positive reward, obtained with some hyperparameter/preference-weight tuning, would make more sense, that tuning is outside the scope of this project. The project is out of date, and we can no longer maintain it.
Thank you for your attention.
Dear Authors, Hi, may I ask why the reward is always negative in the figure generated by this code? The reward recorded in the code seems to be f1 and f2, which is the income minus the cost, so shouldn't it be positive? Thank you very much.
Hi, did you solve this problem?
Hi, yes, it's a unit problem. The ISO charging price is the real price, e.g., 100, but the charging price (action) generated by the code has been scaled down, to about 1/20 I think (not sure), e.g., 5. That's why the reward is always negative. Besides, the charging price and rate seem to be mixed up in class Net: as it stands, the output price is actually the rate and vice versa, so a few modifications are needed there. @wsyCUHK Thanks for your previous reply, would you mind checking whether I am right?
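To make the unit mismatch described above concrete, here is a minimal sketch. It is not the repository's code: `PRICE_SCALE`, `reward`, `rescale_action`, and `split_net_output` are hypothetical names, the factor of 20 is only the uncertain guess from this thread, and the reward formula is a simplified "income minus cost" stand-in for f1/f2.

```python
# Hypothetical scale factor: the thread guesses about 1/20 but is not sure.
PRICE_SCALE = 20.0


def reward(charging_price, iso_price, energy):
    """Simplified income minus cost; both prices must be in the same unit."""
    return (charging_price - iso_price) * energy


def rescale_action(scaled_price, scale=PRICE_SCALE):
    """Map a scaled-down price action back to the real price unit."""
    return scaled_price * scale


def split_net_output(output, swap=False):
    """Return (price, rate) from a two-element network output.

    Set swap=True if the two outputs turn out to be in the opposite order,
    as suspected for class Net in this thread (an assumption, not verified).
    """
    first, second = output
    return (second, first) if swap else (first, second)


iso_price = 100.0     # real ISO price, as in the example above
scaled_action = 6.0   # price action still in scaled-down units
energy = 2.0          # illustrative amount of energy sold

# Mixing units: a scaled price is compared against a real price.
mismatched = reward(scaled_action, iso_price, energy)

# Consistent units: rescale the action before computing the reward.
consistent = reward(rescale_action(scaled_action), iso_price, energy)

print(mismatched, consistent)
```

With these illustrative numbers the mismatched reward is negative and the rescaled one is positive. Whether to rescale the action up or scale the ISO price down is a design choice; what matters is that both prices share one unit before the reward is computed.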
Thanks for your reply. After your explanation I understand the problem, but it is difficult for me to modify the code. Could you show me how I should change it?
Yes, that is exactly what I mean. Thank you for helping to make the details clearer.