stockparfait / experiments

Statistical experiments with financial data
Apache License 2.0

Implement trading experiment #116

Open sergey-a-berezin opened 1 year ago

sergey-a-berezin commented 1 year ago

Consider a trading strategy which attempts to make money on volatility but without prediction:

The goal is to see if there is any strategy that would do better than buy-and-hold.

Study the distribution of log-profits for high/open and close/open and see if it may help pick good parameters.

sergey-a-berezin commented 1 year ago

This, in particular, requires adding OHL prices to the DB in addition to the closing prices. For consistency, add unadjusted OHL prices. For Sharadar, reconstruct those from the Close/CloseSplitAdjusted ratio. Use these ratios to extract various adjusted OHL prices later.

Or, to optimize for the common case, save the fully adjusted OHL prices and recover the other unadjusted / split-only adjusted using the corresponding closing price ratios.

sergey-a-berezin commented 1 year ago

Add an option to plot the conditional distribution of log(close/open) when log(high/open) < T for some threshold T (target price). Since high constrains close, this conditional distribution will be biased towards lower values, and its mean will more accurately reflect the result of not hitting the target price.
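
The conditional sample set described above could be extracted roughly as follows. This is a minimal sketch, not the experiment's actual code: the function name `conditional_close_open`, the `(open, high, close)` tuple layout and the example bars are all hypothetical, and T is expressed as a log-profit.

```python
import math


def conditional_close_open(bars, threshold):
    """Collect log(close/open) for bars where log(high/open) < threshold.

    bars: iterable of (open, high, close) prices (assumed layout).
    threshold: the target T, expressed as a log-profit.
    """
    samples = []
    for open_, high, close in bars:
        if math.log(high / open_) < threshold:
            samples.append(math.log(close / open_))
    return samples


# Hypothetical bars, for illustration only.
bars = [
    (100.0, 103.0, 102.0),  # high/open is about 3%: target hit, excluded
    (100.0, 101.0, 99.0),   # target missed, included
    (50.0, 50.5, 50.2),     # target missed, included
]
T = math.log(1.02)  # a 2% target as a log-profit
missed = conditional_close_open(bars, T)
mean_missed = sum(missed) / len(missed)
```

Every retained sample is below T by construction, since close cannot exceed high; `mean_missed` is the quantity that estimates the outcome of not hitting the target.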

sergey-a-berezin commented 1 year ago

To simplify, we can consider buying at the previous close and selling either at a threshold T above the following open or at the next close. From this perspective, the log-profit distributions for high/close (X_h), open/close (X_o) and close/close (X_c) should be nearly the same Student's t-distribution. The actual value of high = max(X_o, X_h, X_c), and the selling price is p = (high > T) ? T : X_c. Since X_c <= high, it follows that p <= ((high > T) ? T : high) = min(T, high).

So, we are effectively obtaining a new distribution of high with a possibly slightly higher mean than the original distributions of X_i's, but then we are cutting it from above and reducing the mean again. This is only useful if the reduced mean is still higher than the original mean of X_i.
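
The argument above can be checked numerically. The sketch below assumes the three log-profits are independent draws from a scaled Student's t with a=3, which is a simplification (real open, high and close are correlated). `SCALE`, `T`, the seed and the helper names are hypothetical choices for illustration.

```python
import math
import random


def student_t(rng, df):
    """Draw one Student's t sample as normal / sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def sell_log_profit(x_o, x_h, x_c, target):
    """Return (high, p): the day's high and the realized selling log-profit."""
    high = max(x_o, x_h, x_c)
    p = target if high > target else x_c
    return high, p


rng = random.Random(1)
SCALE = 0.01  # assumed daily scale of log-profits
T = 0.02      # assumed target log-profit
results = []
for _ in range(10000):
    x_o, x_h, x_c = (SCALE * student_t(rng, 3.0) for _ in range(3))
    high, p = sell_log_profit(x_o, x_h, x_c, T)
    results.append((x_c, high, p))

# Compare the mean close/close log-profit with the mean selling log-profit.
mean_close = sum(r[0] for r in results) / len(results)
mean_sell = sum(r[2] for r in results) / len(results)
```

Comparing `mean_sell` with `mean_close` over many samples shows whether the capped distribution still has a higher mean than the original X_c.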

sergey-a-berezin commented 1 year ago

And of course, preliminary experiments indicate that the potential gain when high > T is neatly offset by the expected loss when high < T...

Perhaps, to settle the issue, we should:

sergey-a-berezin commented 1 year ago

More accurately, OHLC prices should be modeled by high-frequency intraday log-profits, say, minutely, with an appropriate distribution: Student's t with the "minutely" parameter a_m (denoted T(a=a_m)) such that, when compounded 24*60 times, it would produce a daily distribution similar to T(a=3).

Next, we can generate close from the previous close using T(a=3), and then generate the previous 7.5 hours using the minutely distribution T(a=a_m) walking backwards all the way to the open price by subtracting the log-profit samples from close. The generated intraday sequence is then summarized into the daily OHLC bar.
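
A rough sketch of this backward generator is shown below. It is not the project's implementation: the function name `synth_ohlc`, the scale parameters and the default `minute_df=2.5` (a stand-in for a_m) are assumptions, and the session length follows the 7.5 hours mentioned above.

```python
import math
import random


def student_t(rng, df):
    """Draw one Student's t sample as normal / sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def synth_ohlc(rng, prev_close, minutes=450, daily_df=3.0, minute_df=2.5,
               daily_scale=0.01, minute_scale=0.0005):
    """Generate one synthetic daily bar as (open, high, low, close).

    The close is drawn from the previous close using the daily distribution;
    the intraday path is then built backwards from the close to the open.
    All scale values are assumed, not fitted to data.
    """
    close = prev_close * math.exp(daily_scale * student_t(rng, daily_df))
    log_price = math.log(close)
    prices = [close]  # prices[0] is the close; later entries go back in time
    for _ in range(minutes):
        # Subtract a minutely log-profit sample to step one minute back.
        log_price -= minute_scale * student_t(rng, minute_df)
        prices.append(math.exp(log_price))
    return prices[-1], max(prices), min(prices), close


rng = random.Random(7)
open_, high, low, close = synth_ohlc(rng, 100.0)
```

Taking the high and low over the whole generated path guarantees low <= open, close <= high for every bar.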

sergey-a-berezin commented 1 year ago

To have the ability to process minutely data, we need to extend db.Date to store the time of day in addition to the date. I'm thinking of adding Hour, Minute and Second fields of type uint8, leaving another byte for Milliseconds (why not?) and thereby doubling the size of the struct to 8 bytes. This shouldn't add too much to the memory and storage requirements, certainly not as much as the OHL prices added already.
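
To illustrate the 8-byte budget, here is a small Python sketch using the standard `struct` module. The layout is an assumption (a uint16 year followed by six uint8 fields), not the actual db.Date definition. Since a single byte holds only 0..255 and cannot store 0..999 milliseconds directly, the last field is treated here as an opaque sub-second byte.

```python
import struct

# Assumed layout: little-endian uint16 year, then month, day, hour, minute,
# second and one spare sub-second byte, each uint8. Hypothetical, for sizing only.
DATE_FORMAT = "<HBBBBBB"


def pack_date(year, month, day, hour=0, minute=0, second=0, subsecond=0):
    """Pack a date with time of day into 8 bytes."""
    return struct.pack(DATE_FORMAT, year, month, day, hour, minute, second, subsecond)


def unpack_date(data):
    """Unpack 8 bytes back into the 7-field tuple."""
    return struct.unpack(DATE_FORMAT, data)


packed = pack_date(2022, 3, 15, 9, 30, 0)
size = struct.calcsize(DATE_FORMAT)
```

With this layout a date-only value simply leaves the four time bytes at zero.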

sergey-a-berezin commented 1 year ago

Next, let's add a default header to the headless CSV tables for parfait-import. This allows reading headless CSV tickers and prices which I happen to have lying around for minutely data.

sergey-a-berezin commented 1 year ago

A test import of minutely QQQ data worked well. Now I need to import the ~100 most interesting stocks to get ~30M minutely samples and run them through the "distribution" experiment to derive a typical alpha of the Student's t-distribution. A preliminary run on QQQ gives a=~2.5.

sergey-a-berezin commented 1 year ago

Oh, and I also need to filter out overnight samples from the minutely data. Let's introduce an Intraday flag in the Source config which would indicate that log-profits spanning two days should be skipped.
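
The filtering rule could look like the sketch below, written in Python for illustration. The function name `intraday_log_profits` and the `(datetime, price)` sample layout are hypothetical; the actual Source config and data types may differ.

```python
import math
from datetime import datetime


def intraday_log_profits(samples):
    """Compute log-profits between consecutive samples within the same day.

    samples: list of (datetime, price) pairs sorted by time (assumed layout).
    Pairs whose timestamps fall on different dates are skipped.
    """
    profits = []
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t0.date() != t1.date():
            continue  # overnight gap: skip this log-profit
        profits.append(math.log(p1 / p0))
    return profits


# Hypothetical minutely samples around a day boundary.
samples = [
    (datetime(2022, 3, 14, 15, 58), 100.0),
    (datetime(2022, 3, 14, 15, 59), 101.0),
    (datetime(2022, 3, 15, 9, 30), 105.0),
    (datetime(2022, 3, 15, 9, 31), 104.0),
]
profits = intraday_log_profits(samples)
```

Here the jump from 101.0 to 105.0 spans two days and is dropped, leaving two intraday log-profits.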

sergey-a-berezin commented 1 year ago

Preliminary results with minutely data: 83 high-volume tickers, 2 years' worth of data, ~20M price points total, of which ~15M are in-session points (9:30am - 4pm). The bullets below are for in-session data only.

The last two points repeat similar observations with daily data. This confirms my suspicion that the pattern likely holds at all timeframes.

sergey-a-berezin commented 1 year ago

This, in fact, should be enough to generate synthetic OHL prices using a minutely generator: just use alpha=2.5, starting from close and walking backwards towards open. For simplicity, we can assume mean=0 for minutely data, which seems to reflect reality.

sergey-a-berezin commented 1 year ago

I'm extracting the Source extension into a separate issue #130.

sergey-a-berezin commented 1 year ago

Some preliminary experiments with open[t+1]/close[t] vs. close[t+1]/close[t] log-profit distributions for NASDAQ Composite index:

Normalized distributions (by close/close) over all the liquid stocks:

sergey-a-berezin commented 1 year ago

Next, implement a strategy simulator with the following strategies:

sergey-a-berezin commented 1 year ago

It may be a good idea to think of a relatively generic strategy config which independently sets conditions for buy and sell.

sergey-a-berezin commented 1 year ago

I'm going to implement a separate simulator experiment to test actual strategies - #132. I'm not yet sure what to do with this trading experiment as such; perhaps it should be folded into the distribution experiment to study the various intraday distributions and their modeling.

sergey-a-berezin commented 1 year ago

Testing the simulator with the "buy-sell intraday" strategy on synthetic data with zero mean (no inherent growth, only volatility) yields basically zero profit no matter how I set up the day trading strategy: buy & hold (for reference; it obviously won't do any good), target sell, stop-loss (fixed or trailing), selling at close or keeping overnight, etc. Any combination invariably leads to the same result: no profit.

I'm coming back to the same conclusion: any actual profit comes from the average, inherent growth of the stock value. The only question is, can we somehow protect the value from crashes and/or recover faster? This is really the fundamental question of a "safe haven" strategy.