AlexanderGuillermoSeguraBallesteros / RRL

Recurrent Reinforcement Learning (RRL). This is a repository for implementations of RRL, mainly following Moody's work; other authors will be credited along the way.
9 stars, 4 forks

I think there is something wrong in the reward function? #4

Open nickhuangxinyu opened 5 years ago

nickhuangxinyu commented 5 years ago

Hi, Alexander.

I think your RewardFunction is not correct for maximizing the sum of profits. If you want to use $R_t = \mu\{F_{t-1} r_t - \delta |F_t - F_{t-1}|\}$, then $r_t$ should be $z_t - z_{t-1}$; please see formula 5 in the paper "Learning to Trade via Direct Reinforcement".
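To make the point concrete, here is a minimal sketch of that reward in Python. It is not the repository's code: the function name `additive_rewards` and its parameters are hypothetical, and it assumes `z` is the price series, `F` holds the positions $F_t$ aligned with `z`, `mu` is the fixed trade size, and `delta` is the transaction cost rate.

```python
import numpy as np


def additive_rewards(z, F, mu=1.0, delta=0.0):
    """Per-step rewards R_t = mu * (F_{t-1} * r_t - delta * |F_t - F_{t-1}|).

    Hypothetical helper for illustration only.
    z : prices z_0..z_T
    F : positions F_0..F_T (same length as z), e.g. values in [-1, 1]
    """
    z = np.asarray(z, dtype=float)
    F = np.asarray(F, dtype=float)
    if z.shape != F.shape or z.ndim != 1:
        raise ValueError("z and F must be 1-D arrays of the same length")

    r = np.diff(z)                    # r_t = z_t - z_{t-1}, price differences
    turnover = np.abs(np.diff(F))     # |F_t - F_{t-1}|, position changes
    return mu * (F[:-1] * r - delta * turnover)


# Small usage example with made-up numbers.
prices = [100.0, 101.0, 103.0, 102.0]
positions = [0.0, 1.0, 1.0, -1.0]
R = additive_rewards(prices, positions, mu=1.0, delta=0.5)
total_profit = R.sum()  # the quantity that "max sum profit" would maximize
```

The key detail is that `r` is built from price differences rather than from the raw prices, and that the reward at step t uses the position held from the previous step, `F[:-1]`.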

I hope we can discuss this. Thanks, Nick

AlexanderGuillermoSeguraBallesteros commented 5 years ago

Hello,

Thank you for your comments. Of course, I would love to have a discussion with you. It's been a while since I last looked at the code, and I have been looking forward to picking the project up again. Let me review the paper, and then I can comment on your observation.

Sincerely,

Alexander Guillermo Segura Ballesteros

