Open SuGuilin opened 1 year ago
RL algorithms manage the portfolio by choosing the best combination of stocks for each day. The state (observation) consists of prices, indicators, and similar features. The action is to buy a specific set of stocks, and the reward is the resulting profit. Everything else follows the standard configuration of DDPG and the other algorithms.
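To make the state/action/reward mapping above concrete, here is a minimal sketch in plain NumPy. It is not the repository's code: `ToyPortfolioEnv` and `softmax` are hypothetical names, the state is reduced to the current prices only, and the action is assumed to be a vector of raw scores that is normalized into portfolio weights. The continuous action vector is what makes an algorithm like DDPG a natural fit, since DDPG is designed for continuous action spaces.

```python
import numpy as np


def softmax(x):
    """Map raw scores to non-negative weights that sum to 1."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


class ToyPortfolioEnv:
    """Hypothetical minimal environment, for illustration only."""

    def __init__(self, prices, initial_value=1.0):
        self.prices = np.asarray(prices, dtype=float)  # shape: (days, stocks)
        self.initial_value = initial_value
        self.reset()

    def reset(self):
        self.day = 0
        self.value = self.initial_value
        return self.prices[self.day]  # state: today's prices

    def step(self, action):
        # Assumption: the agent outputs raw scores; softmax turns them
        # into the fraction of the portfolio held in each stock.
        weights = softmax(np.asarray(action, dtype=float))
        # Per-stock price ratio from today to the next day.
        ratio = self.prices[self.day + 1] / self.prices[self.day]
        new_value = self.value * float(weights @ ratio)
        reward = new_value - self.value  # reward: profit over the day
        self.value = new_value
        self.day += 1
        done = self.day == len(self.prices) - 1
        return self.prices[self.day], reward, done


# Two stocks over three days; equal raw scores give equal weights.
env = ToyPortfolioEnv([[10.0, 20.0], [11.0, 20.0], [11.0, 22.0]])
state = env.reset()
state, reward, done = env.step([0.0, 0.0])
```

In a sketch like this, the RL agent only ever sees `reset` and `step`, so the same environment could in principle be paired with DDPG or any other continuous-action algorithm without changing the portfolio logic.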
Can you explain the design principles behind the code, especially how the DDPG algorithm relates to portfolio management?