Objective
The optimal portfolio strategy is defined by an objective function together with a set of constraints.
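A minimal sketch of one possible formulation, assuming binary buy decisions $x_i \in \{0, 1\}$ per buy signal, a backtracked reward $r_i$, a price $c_i$, and the available balance $B$ (all symbols are illustrative assumptions):

$$
\max_{x_1, \dots, x_n} \; \sum_{i=1}^{n} r_i x_i
\qquad \text{subject to} \qquad
\sum_{i=1}^{n} c_i x_i \le B, \quad x_i \in \{0, 1\}
$$

Under these assumptions the buy selection becomes a knapsack-style problem, which is why a brute-force search over all combinations grows as $2^n$.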
Approaches
Supervised
Note that hold and sell signals do not have to be optimized, as they are always executed anyway.
Idea No. 1
Generate the best combination of buys and train a model on these signals as labels. The feature extractor from the stock-wise training (#131) can be reused.
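A minimal sketch of the brute-force way to generate these labels; `buy_signals`, `balance`, and `simulate` are hypothetical stand-ins for the real backtesting pieces:

```python
from itertools import product

def best_buy_combination(buy_signals, balance, simulate):
    """Exhaustively search all 2**n buy/skip combinations for one day.

    `simulate` is assumed to return the final portfolio value when the
    chosen subset of buy signals is executed with the given balance.
    """
    best_value, best_mask = float("-inf"), None
    for mask in product([0, 1], repeat=len(buy_signals)):
        chosen = [signal for signal, bit in zip(buy_signals, mask) if bit]
        value = simulate(chosen, balance)
        if value > best_value:
            best_value, best_mask = value, mask
    return best_mask  # 0/1 labels for supervised training
```

Already with 30 buy signals on a single day this means roughly 1e9 simulations, which is the scaling problem noted below.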
Problems
The dataset cannot easily be created by brute force, since the number of possible buy combinations is 2**n_buy_signals.
Idea No. 2
Create the dataset with a heuristic based on the backtracked rewards, e.g. always buy the first 3 signals of the day, or only buy signals whose backtracked reward exceeds a certain threshold.
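A minimal sketch of both heuristic variants; `buy_signals` is assumed to be an ordered list of dicts with a "backtracked_reward" entry (names are placeholders):

```python
def heuristic_labels(buy_signals, k=3, reward_threshold=None):
    """Create buy/skip training labels from a heuristic instead of the true optimum."""
    if reward_threshold is not None:
        # Variant 2: buy every signal whose backtracked reward clears the threshold.
        return [s["backtracked_reward"] >= reward_threshold for s in buy_signals]
    # Variant 1: always buy the first k signals of the day.
    return [i < k for i in range(len(buy_signals))]
```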
Problems
Does not take the available balance into consideration.
Not guaranteed to be the optimum.
Reinforcement Learning
Idea
Build an environment that processes multiple stocks in parallel (as one large state vector), i.e. no optimal dataset has to be created beforehand.
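A minimal structural sketch of such an environment, assuming gymnasium and a pre-processing step that turns each stock's raw state into a fixed-size feature vector; the slot and feature counts are arbitrary assumptions:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

N_SLOTS = 30      # fixed number of stock slots per day (assumption)
N_FEATURES = 8    # pre-processed features per stock (assumption)

class MultiStockEnv(gym.Env):
    """Env that exposes all of a day's candidate stocks as one flat vector."""

    def __init__(self, daily_features):
        # `daily_features` is a placeholder: one (n_stocks, N_FEATURES) array per day.
        self.daily_features = daily_features
        # Observation: N_SLOTS stocks x N_FEATURES features plus the current balance.
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(N_SLOTS * N_FEATURES + 1,), dtype=np.float32,
        )
        # Action: one binary buy/skip decision per slot.
        self.action_space = spaces.MultiBinary(N_SLOTS)

    def _observation(self, day_features, balance):
        # Days with fewer than N_SLOTS candidates are zero-padded to the fixed size.
        padded = np.zeros((N_SLOTS, N_FEATURES), dtype=np.float32)
        padded[: len(day_features)] = day_features[:N_SLOTS]
        return np.concatenate([padded.ravel(), [balance]]).astype(np.float32)
```

The zero-padding and the fixed N_SLOTS are exactly where the problems below come from.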
Problems
The number of stocks per day has to be fixed, so a variable number of stocks is not possible.
The raw state cannot be fed directly; it has to be pre-processed before being passed to the RL agent.
Advantages
Use scipy.optimize
Idea
Use scipy.optimize to solve the objective above directly for the optimal values.
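A minimal sketch, assuming the knapsack-style formulation sketched under Objective and SciPy's `milp` solver (SciPy >= 1.9); `rewards`, `prices`, and `balance` are hypothetical inputs:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

def optimal_buys(rewards, prices, balance):
    """Pick the buy combination with maximum total reward that fits the balance."""
    rewards = np.asarray(rewards, dtype=float)
    prices = np.asarray(prices, dtype=float)
    res = milp(
        c=-rewards,                                        # milp minimizes, so negate
        constraints=LinearConstraint(prices[np.newaxis, :], -np.inf, balance),
        integrality=np.ones_like(rewards),                 # every x_i is integer
        bounds=Bounds(0, 1),                               # 0 <= x_i <= 1 -> binary
    )
    if res.x is None:                                      # infeasible / solver failure
        return np.zeros(rewards.size, dtype=bool)
    return res.x.round().astype(bool)                      # True = execute the buy
```

This treats the buy selection as a mixed-integer linear program, so it only covers a simple budget constraint; anything non-linear would need a different scipy.optimize routine.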
Problems