AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥
https://ai4finance.org
MIT License
9.36k stars, 2.27k forks

Reward shaping in RL algorithms using benchmark returns #1220

Open arunbharadwaj2009 opened 2 months ago

arunbharadwaj2009 commented 2 months ago

I want to build an RL algorithm that understands the concept of beating a benchmark (say, the S&P 500) at the tic level. If a tic consistently beats the benchmark, the algorithm should pick it more often than a tic that keeps losing to the benchmark.

How should I make this happen?

Can I set up a feature that checks on a monthly basis whether a tic beat the benchmark and passes this as a signal to the RL algorithm? It could be binary or numeric (the delta between the tic's and the benchmark's monthly return). But even then, this would be just a feature and would not really alter the reward signal. How do I alter the reward signal to achieve this?
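
To make the question concrete, here is a minimal sketch of both pieces, assuming daily close prices in a pandas DataFrame with one column per tic and a DatetimeIndex. The function names `monthly_excess_features` and `excess_return_reward` are hypothetical and are not part of FinRL. The reward shown is only one possible shaping (portfolio return minus benchmark return over the same step), not something FinRL is confirmed to provide.

```python
import pandas as pd


def monthly_excess_features(prices: pd.DataFrame, benchmark: pd.Series):
    """Return (delta, beat) per month and per tic.

    prices:    daily closes, DatetimeIndex, one column per tic (assumed layout)
    benchmark: daily closes of the benchmark on the same index
    delta:     tic monthly return minus benchmark monthly return
    beat:      1 where delta > 0, else 0
    """
    # Last close of each calendar month.
    month = prices.index.to_period("M")
    tic_close = prices.groupby(month).last()
    bench_close = benchmark.groupby(month).last()

    # Simple month-over-month returns; the first month has no prior close.
    tic_ret = (tic_close / tic_close.shift(1) - 1).iloc[1:]
    bench_ret = (bench_close / bench_close.shift(1) - 1).iloc[1:]

    delta = tic_ret.sub(bench_ret, axis=0)
    beat = (delta > 0).astype(int)
    return delta, beat


def excess_return_reward(value_prev, value_now, bench_prev, bench_now, scale=1.0):
    """Hypothetical shaped reward: portfolio return minus benchmark return."""
    portfolio_ret = value_now / value_prev - 1
    bench_ret = bench_now / bench_prev - 1
    return scale * (portfolio_ret - bench_ret)
```

A value for month m is only known once month m has closed, so it would need to be lagged by one month before being fed to the agent as an observation. Otherwise the feature leaks future information.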

zhumingpassional commented 1 month ago

After training an agent, introduce a feature that acts as a mask: 1 if the tic is beating the benchmark and 0 otherwise.
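
One possible reading of this suggestion, sketched below, is to apply the mask to the trained agent's output at inference time. The helper `apply_benchmark_mask` is hypothetical (not a FinRL function), and it assumes the agent emits one raw score per tic that is turned into portfolio weights with a softmax.

```python
import numpy as np


def apply_benchmark_mask(raw_actions, beat_mask):
    """Turn raw per-tic scores into weights, keeping only masked-in tics.

    raw_actions: one score per tic from the trained agent (assumed format)
    beat_mask:   1 if the tic is beating the benchmark, 0 otherwise
    """
    raw = np.asarray(raw_actions, dtype=float)
    mask = np.asarray(beat_mask, dtype=float)

    # Numerically stable softmax numerator, then zero out masked tics.
    scores = np.exp(raw - raw.max()) * mask

    total = scores.sum()
    if total == 0:
        # No tic beats the benchmark: allocate nothing (stay in cash).
        return np.zeros_like(raw)
    return scores / total
```

For example, with equal scores for three tics and a mask of `[1, 0, 1]`, the weight is split evenly between the first and third tic. Note that masking after training filters actions but does not change what the agent has learned.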