Open chriss1245 opened 10 months ago
In order to perform imitation learning, we need that our domain experts (Markowitz models) label the data with state-action pairs
There was implemented the buffer on 0826b0fc2c0b39afa708d1cf38143a3fb83be38e.
In order to perform imitation learning, we need that our domain experts (Markowitz models) label the data with state-action pairs