andrewsonin opened this issue 2 years ago
I have concerns about this too. In my experiments, without using m- for labeling, the final accuracy drops a lot.
I also found this issue; the high accuracy is due to the label.
Could you point me to the place in the Jupyter notebook where you see the issue with the label? I do not see the issue in the code.
The Jupyter notebook uses FI-2010, where the labels have already been created according to equation 8 in https://arxiv.org/pdf/1705.03233.pdf
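For context, the m-/m+ smoothed labeling discussed in this thread can be sketched as follows. This is a minimal sketch, not the repo's code: the function name and array-based interface are mine, `k` is the smoothing horizon, and `alpha` is the labeling threshold from the papers.

```python
import numpy as np

def smoothed_labels(mid, k, alpha):
    """Label each tick by comparing the mean of the next k mid-prices (m+)
    with the mean of the current and previous k-1 mid-prices (m-)."""
    labels = np.zeros(len(mid), dtype=int)
    for t in range(k - 1, len(mid) - k):
        m_minus = mid[t - k + 1 : t + 1].mean()  # includes p_t
        m_plus = mid[t + 1 : t + k + 1].mean()   # strictly future prices
        l_t = (m_plus - m_minus) / m_minus
        if l_t > alpha:
            labels[t] = 1    # up
        elif l_t < -alpha:
            labels[t] = -1   # down
    return labels
```

On a steadily rising price series, every tick in the valid range gets the "up" label, which is the smoothing behavior the papers aim for.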
Just wanna make sure to not make a mistake. Thanks.
See discussion in bottom half of left column on page 4.
I think there is no issue with using either equation 3 or equation 4 as the target. The accuracy drops to 50% with the raw-price target because that target is noisier; the smoothed one is less noisy, hence classification works better. You can calculate m- in a real trading system by keeping the recent values in a cache. That should work.
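The cache idea above can be sketched like this: m- only needs the last k mid-prices, so it can be maintained online with a fixed-size window. The class name and interface are my own, not from the repo.

```python
from collections import deque

class MMinusCache:
    """Keep the last k mid-prices so m- can be computed at every tick."""

    def __init__(self, k):
        self.k = k
        self.window = deque(maxlen=k)  # drops the oldest price automatically
        self.total = 0.0

    def update(self, price):
        if len(self.window) == self.k:
            self.total -= self.window[0]  # value about to be evicted
        self.window.append(price)
        self.total += price
        # m- is only defined once the window is full
        return self.total / self.k if len(self.window) == self.k else None
```

Running the sum incrementally keeps each update O(1), which matters at tick frequency.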
Would you explain in more detail why this target engineering makes it useless for a trading system, and where you see the information leakage? (Sorry, I don't quite understand the formulas.) Thanks.
@deeepwin I don't see the leakage argument either, p_t enters m- and not m+ according to (2). m- is the average of p_t and k-1 earlier prices. Moreover, I'm not sure why, when the difference between p_t and the average of p_t itself and those earlier prices approaches infinity, future prices would be even higher with certainty.
However, the smoothed target that uses m- makes the signal hard to trade on, since you can't buy during the last k-1 ticks when you only get the buy signal at t, unless I'm missing something. Unfortunately, as the authors state, the experiments with p_t alone (which is the mid-price, not even the more adverse ask) appear to take us back to a coin flip.
@stefan-jansen thanks for the feedback.
> the smooth target that uses m- makes it hard to trade on since you can't buy the last k-1 ticks when you get the buy signal at t, unless I'm missing something.
Yes, I see it like that too; I guess that is what the other poster meant. But in my opinion that does not matter. Since m- represents a state of the system, if that state triggers a buy signal through the model, the model expects an m+ state in the future, regardless of the current p(t) value. That value will never exactly match anyway (slippage). If you validate the model on truly unseen (out-of-sample) data, one would expect, based on the paper's results, higher returns than a coin flip. I guess that would need to be carefully tested.
Hello everyone, I know a year and a half has passed, but if possible I would like to reopen this issue. Honestly, I don't see any information leak. I understand the formulas, but I don't understand what you have demonstrated with them. m+ and m- are used to obtain a smoother labeling method and to capture up and down trends; nothing about them implies an intention to trade at past prices.
While @andrewsonin may have overstated the leakage (h/t @stefan-jansen), his conclusion that the 2020 version of DeepLOB was worthless may be true from a LOB trader's perspective. Of course, DeepLOB was not designed to be a trader's tool. I suspect it was designed to show that layered networks are rendering hand-crafted features obsolete even in the most demanding environments, and to point the way for further research.
When I test DeepLOB using LOB data streamed from IBrokers, I find:
1. It can be a source of signal for algorithms with a 2- to 8-second horizon.
2. The results are in line with @LeonardoBerti00 et al.'s findings in Table 2 here: while performance drops on new data, its F1 is ~0.58 (as is its balanced accuracy, though that will depend on your choices of alpha and k), and DeepLOB is the best overall performer in the 2017-2023 cohort.
Problem
I was able to successfully transfer the model you proposed to the USD/RUB currency pair, which is traded on the Moscow Exchange, and to classify the future behavior of the quotes within the next 10 minutes. However, I was consistently unable to create a profitable trading strategy based on your model using the smoothed target you suggested:
This seemed strange to me, since the following two factors support the assumption that at least some kind of strategy could still be built:
Then I realized that the target you proposed contains a rather obvious information leak, which can be described by the following formula:
Moreover:
And this is what makes your target useless from the trading perspective, since at time t you cannot make deals at price m-minus(t).
To prove this, I trained an `sklearn` `LogisticRegression` using only ONE feature: the difference between the current price and the m-minus term in the proposed target. The ROC-AUC of such a simple model reached 81%. Then I refitted your model on the "noisy" alternative target, which is meaningful from the trading perspective (unlike the previous one):
The ROC-AUC dropped to only slightly above 50%, which was not enough to beat even the bid-ask spread.
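The first half of this experiment (the single-feature logistic regression) can be reproduced in spirit on synthetic data. This is a sketch under my own assumptions, not the original code: a random-walk mid-price, k = 10, and a binarised up/down label are all my choices. Even on a pure random walk, the single feature p(t) - m-(t) separates the smoothed labels far better than chance, because for a martingale the expected value of m+(t) - m-(t) given information at time t is exactly p(t) - m-(t).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
p = 100 + np.cumsum(rng.normal(0.0, 0.1, 20_000))  # random-walk mid-price (my assumption)
k = 10  # smoothing horizon (my choice, not from the original experiment)

t_idx = np.arange(k - 1, len(p) - k)
m_minus = np.array([p[t - k + 1 : t + 1].mean() for t in t_idx])  # mean of p_t and k-1 past prices
m_plus = np.array([p[t + 1 : t + k + 1].mean() for t in t_idx])   # mean of the next k prices

y = (m_plus > m_minus).astype(int)       # binarised smoothed label
x = (p[t_idx] - m_minus).reshape(-1, 1)  # the single feature: p_t - m-(t)

clf = LogisticRegression().fit(x, y)
auc = roc_auc_score(y, clf.predict_proba(x)[:, 1])
print(round(auc, 2))  # substantially above 0.5, even though prices carry no signal
```

The point of the synthetic setup is that the prices here are unpredictable by construction, so any AUC above 0.5 on this label comes purely from the target design, not from any genuine forecasting ability.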
Conclusion
My experiment shows that the success achieved with the above target is not due to any "smoothing" or reduction of noise, as you say, but only due to bad target design: you simply made the target conditionally dependent on a trivially extracted feature. Moreover, it is absolutely useless from the trading perspective, since at time t we cannot trade at price m-minus(t), but only at something near p(t).