Closed Jiajia-Cui closed 2 years ago
"Output: required LP margin over 20 trials" is this 20 time series? Or some other statistic based on this.
"Optimal strategy with Multiple LP (uninformed trading bots)" should they have exactly the same objective or different parameters e.g. aversion to holding inventory?
"might need reinforcement learning at this point, to be discussed" We need to do all this analysis assuming we've got no RL working.
The RL stuff is another, parallel stream (and it may well not go anywhere for the next few months while we work on the null-chain speed etc).
@Jiajia-Cui I still don't see how, based on all the outputs, we'll decide what we think is "reasonable". What do you propose we target? Reasonable average return? VaR of return? (if so I am not sure 20 trials are enough). Average across all the "environments" or across one of them?
"Output: required LP margin over 20 trials" is this 20 time series? Or some other statistic based on this.
They are 20 time series
"Optimal strategy with Multiple LP (uninformed trading bots)" should they have exactly the same objective or different parameters e.g. aversion to holding inventory?
If there are multiple LPs with different risk aversion parameters, the optimal strategy will differ from the case where all LPs share the same risk aversion parameter. I would run the multi-LP case with the same risk aversion first, and then change the risk aversion parameter on some of the LPs.
Stupid question but if the LPs have the same risk aversion and are running under the same decision algorithm, how do they differ from one another?
Thoughts distilled from my longer list on slack:
What's 'a trial'?
- How long is it?
- How much trading is there?
- How is the price process generated? (volatility etc)
- Perhaps the volatility/spread/arrival-frequency parameters should also be drawn randomly at the start, rather than randomising only the process generation itself?
- Should we compact the environment spec into being defined by proportion of informed traders (0, 0.25, 0.5, 0.75, 1 etc)?
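To make the questions above concrete, here is a minimal sketch of what a trial spec could look like if the environment parameters are drawn at random at the start and the environment is indexed by the proportion of informed traders. Everything in it is an assumption for illustration: the names `TrialSpec`, `sample_trial_spec` and `generate_price_path`, the parameter ranges, the Gaussian random-walk price process, and the default of 180 steps (the figure mentioned elsewhere in this thread). None of it is taken from vega-market-sim.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialSpec:
    """Hypothetical description of one trial (all fields are assumptions)."""
    n_steps: int              # length of the trial in time steps
    volatility: float         # per-step standard deviation of the price
    arrival_rate: float       # average number of bot orders per step
    informed_fraction: float  # proportion of informed traders, in [0, 1]


def sample_trial_spec(rng, informed_fraction, n_steps=180):
    """Draw volatility and arrival rate at random; ranges are placeholders."""
    return TrialSpec(
        n_steps=n_steps,
        volatility=rng.uniform(0.1, 1.0),
        arrival_rate=rng.uniform(1.0, 10.0),
        informed_fraction=informed_fraction,
    )


def generate_price_path(spec, rng, initial_price=100.0):
    """Gaussian random walk standing in for the real price process."""
    prices = [initial_price]
    for _ in range(spec.n_steps):
        prices.append(prices[-1] + rng.gauss(0.0, spec.volatility))
    return prices


rng = random.Random(42)
# One spec per environment, indexed by the proportion of informed traders.
specs = [sample_trial_spec(rng, f) for f in (0.0, 0.25, 0.5, 0.75, 1.0)]
paths = [generate_price_path(spec, rng) for spec in specs]
```

Writing the spec down like this would also force an answer to "how long is a trial" and "how much trading is there", since both become explicit fields.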
Stupid question but if the LPs have the same risk aversion and are running under the same decision algorithm, how do they differ from one another?
For example, suppose we have 2 LPs (LP1 and LP2) with the same risk aversion. If the first trade happens to be between LP1 and a trading bot on the market, then LP1's position changes, and hence the optimal strategy for LP1 changes, and so on. Therefore, LP1 and LP2 will show different "behaviour".
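A toy sketch of this divergence, assuming a simple linear inventory skew (quotes shifted by `-risk_aversion * inventory`). This is not the optimal strategy used in the simulator; the names `quotes` and `simulate` and all numbers are hypothetical. Both LPs share the same rule and the same risk aversion, and differ only through which of them the bots happen to hit.

```python
import random


def quotes(mid, inventory, risk_aversion, half_spread=1.0):
    """Return (bid, ask), skewed against the LP's current inventory."""
    # A long LP lowers both quotes to encourage selling, a short LP raises them.
    skew = -risk_aversion * inventory
    return mid + skew - half_spread, mid + skew + half_spread


def simulate(n_trades=5, risk_aversion=0.1, seed=0):
    """Route each bot trade to a random LP and record both LPs' quotes."""
    rng = random.Random(seed)
    inventory = {"LP1": 0, "LP2": 0}
    mid = 100.0
    history = []
    for _ in range(n_trades):
        lp = rng.choice(["LP1", "LP2"])  # which LP the bot trades with
        side = rng.choice([1, -1])       # +1: LP buys, -1: LP sells
        inventory[lp] += side
        history.append(
            {name: quotes(mid, inv, risk_aversion)
             for name, inv in inventory.items()}
        )
    return inventory, history


inventory, history = simulate()
for step, snapshot in enumerate(history, start=1):
    print(step, snapshot)
```

After the very first trade exactly one LP holds a non-zero position, so the two sets of quotes already differ even though the parameters are identical.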
@Jiajia-Cui I still don't see how, based on all the outputs, we'll decide what we think is "reasonable". What do you propose we target? Reasonable average return? VaR of return? (if so I am not sure 20 trials are enough). Average across all the "environments" or across one of them?
If we are looking at the average return, then 20 trials are enough; if we are looking at VaR, then we might need at least 1000 to get a reasonable number. The current speed is about 1.5 seconds per trial on my laptop. Tom has merged a faster version (with reduced wallet content), so it might be faster now.
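A rough way to sanity-check the 20 vs 1000 question is to repeat the experiment many times on synthetic returns and compare how much the mean and the VaR estimate move around at each sample size. The sketch below assumes normally distributed trial returns with made-up parameters and a plain empirical-quantile VaR; `simulate_returns`, `empirical_var` and `estimator_spread` are hypothetical helpers, not simulator code. The runtime column just multiplies the number of trials by the 1.5 seconds per trial quoted above.

```python
import random
import statistics


def simulate_returns(n_trials, rng):
    """Stand-in for running n_trials simulations (assumed normal returns)."""
    return [rng.gauss(0.01, 0.05) for _ in range(n_trials)]


def empirical_var(returns, level=0.95):
    """Empirical VaR: the loss at the (1 - level) quantile of the returns."""
    ordered = sorted(returns)
    index = int((1.0 - level) * len(ordered))
    return -ordered[index]


def estimator_spread(estimator, n_trials, n_repeats, rng):
    """Standard deviation of an estimator across repeated experiments."""
    estimates = [
        estimator(simulate_returns(n_trials, rng)) for _ in range(n_repeats)
    ]
    return statistics.stdev(estimates)


def runtime_minutes(n_trials, seconds_per_trial=1.5):
    """Wall-clock estimate at the per-trial speed quoted in the thread."""
    return n_trials * seconds_per_trial / 60


rng = random.Random(1)
for n in (20, 1000):
    mean_sd = estimator_spread(statistics.mean, n, 200, rng)
    var_sd = estimator_spread(empirical_var, n, 200, rng)
    print(f"n={n}: mean spread={mean_sd:.4f}, VaR spread={var_sd:.4f}, "
          f"runtime={runtime_minutes(n):.1f} min")
```

With 20 trials the 95% VaR is effectively the second-worst observation, which is why it is so much noisier than the mean. At 1.5 seconds per trial, 1000 trials come to about 25 minutes per environment.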
Naively, 180 time steps feels quite short for observing anything that isn't immediately obvious, but perhaps most changes produce differences noticeable on that timescale? (Or the results are not too sensitive to volatility, so, as you say, we can just assume a timestep is an epoch?)
This is now in https://github.com/vegaprotocol/vega-market-sim/
Task Overview
As a researcher, I would like to find out the impact of contentious parameters (market.liquidity.probabilityOfTrading.tau.scaling), so that I can set reasonable default values for them.
Spec for this parameter is here
Input and output Metrics
Environment for LPs:
_Env_LP001: Optimal strategy with Single LP (uninformed trading bots)
_Env_LP002: Optimal strategy with Multiple LP (uninformed trading bots)
_Env_LP003: Optimal strategy with Single LP (proportion of informed traders)
_Env_LP004: Optimal strategy with Multiple LP (proportion of informed traders)