Open sergey-a-berezin opened 1 year ago
This, in particular, requires adding OHL prices to the DB in addition to the closing prices. For consistency, add unadjusted OHL prices. For Sharadar, reconstruct those from the `Close/CloseSplitAdjusted` ratio. Use these ratios to extract various adjusted OHL prices later.
Or, to optimize for the common case, save the fully adjusted OHL prices and recover the unadjusted and split-only adjusted prices using the corresponding closing price ratios.
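A minimal sketch of the ratio-based recovery described above, assuming that all prices in a bar share one adjustment factor so that the ratio of the two closing prices recovers it. The function name `unadjust_ohl` and the argument names are hypothetical, not part of the existing DB code.

```python
def unadjust_ohl(open_adj, high_adj, low_adj, close_adj, close_unadj):
    """Scale fully adjusted OHL prices back to unadjusted ones.

    Assumes every price in the bar shares one adjustment factor, so the
    ratio of the unadjusted to the adjusted close recovers it.
    """
    ratio = close_unadj / close_adj
    return open_adj * ratio, high_adj * ratio, low_adj * ratio


# Example bar where the adjusted prices are half of the raw ones.
print(unadjust_ohl(50.0, 52.0, 49.0, 51.0, 102.0))
```

The same function would recover split-only adjusted prices if `close_unadj` is replaced by the split-adjusted close.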
Add an option to plot the conditional distribution of `log(close/open)` when `log(high/open) < T` for some threshold `T` (target price). Since `high` constrains `close`, this conditional distribution will be biased towards lower values, and its mean will more accurately reflect the result of not hitting the target price.
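As an illustration, the conditional sample could be collected roughly like this. The `(open, high, low, close)` tuple layout and the function name are assumptions for the sketch, not the actual experiment API.

```python
import math


def conditional_close_log_profits(bars, threshold):
    """Return log(close/open) for bars whose log(high/open) < threshold.

    `bars` is an iterable of (open, high, low, close) tuples; this layout
    is an assumption made for the sketch.
    """
    return [
        math.log(c / o)
        for o, h, _low, c in bars
        if math.log(h / o) < threshold
    ]


bars = [
    (100.0, 103.0, 99.0, 102.0),  # high is 3% above open: target hit, excluded
    (100.0, 100.5, 98.0, 99.0),   # target missed, included
    (100.0, 101.0, 99.5, 100.8),  # target missed, included
]
sample = conditional_close_log_profits(bars, math.log(1.02))
print(len(sample), sum(sample) / len(sample))
```

Every value in `sample` is necessarily below the threshold, which is the bias towards lower values mentioned above.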
To simplify, we can consider buying at the previous `close` and selling at a threshold `T` above the following `open` or at the next `close`. From this perspective, the distributions for `high/close` (`X_h`), `open/close` (`X_o`) and `close/close` (`X_c`) log-profits should be nearly the same Student's t-distribution. The actual value of `high = max(X_o, X_h, X_c)`, and the selling price `p = (high>T) ? T : X_c`. Since `X_c <= high`, it follows that `p <= (high>T) ? T : high = min(T, high)`.
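The selling rule and its upper bound can be checked on a few hand-picked samples. This is only a sketch: the sample values are made up, and all quantities are treated as log-profits relative to the previous close, as in the text.

```python
def sell_log_profit(x_o, x_h, x_c, target):
    """Log-profit of 'sell at the target if reached, else at close'.

    All arguments are log-profits relative to the previous close;
    `high` is taken as max(X_o, X_h, X_c), as in the text.
    """
    high = max(x_o, x_h, x_c)
    return target if high > target else x_c


# Made-up samples of (X_o, X_h, X_c).
samples = [(0.001, 0.030, 0.010), (-0.002, 0.004, -0.010), (0.000, 0.012, 0.012)]
target = 0.02
for x_o, x_h, x_c in samples:
    p = sell_log_profit(x_o, x_h, x_c, target)
    high = max(x_o, x_h, x_c)
    # The selling price never exceeds min(T, high).
    assert p <= min(target, high)
```

Equality holds when the target is hit, or when the close happens to be the day's high, as in the last sample.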
So, we are effectively obtaining a new distribution of `high` with a possibly slightly higher mean than the original distributions of the `X_i`'s, but then we are cutting it from above and reducing the mean again. This is only useful if the reduced mean is still higher than the original mean of `X_i`.
And of course, preliminary experiments indicate that the potential gain when `high > T` is neatly offset by the expected loss when `high < T`...
Perhaps, to settle the issue, we should:

- [...] `min` and `max` for `low` and `high` as appropriate;
- [...] `open` (or at the previous `close`), then either sell at the threshold `T` or at `close`, and plot the mean of this strategy as a function of `T`;

More accurately, OHLC prices should be modeled by high-frequency intraday log-profits, say, minutely, with an appropriate distribution: Student's t with the "minutely" parameter `a_m` (denoted `T(a=a_m)`) such that, when compounded 24*60 times, it would produce a daily distribution similar to `T(a=3)`.
Next, we can generate `close` from the previous `close` using `T(a=3)`, and then generate the previous 7.5 hours using the minutely distribution `T(a=a_m)` walking backwards all the way to the `open` price by subtracting the log-profit samples from `close`. The generated intraday sequence is then summarized into the daily OHLC bar.
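The backward walk could be sketched as follows. The scale parameters (`daily_scale`, `minute_scale`) and the example values are hypothetical placeholders, not fitted numbers, and the Student's t sampler is a standard normal-over-chi-squared construction rather than the project's own generator.

```python
import math
import random


def student_t(rng, alpha):
    """Sample Student's t with `alpha` degrees of freedom (may be fractional)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(alpha / 2.0, 2.0)  # chi-squared with alpha d.o.f.
    return z / math.sqrt(chi2 / alpha)


def synthetic_bar(rng, prev_close, daily_alpha, daily_scale,
                  minute_alpha, minute_scale, minutes):
    """Generate one daily (open, high, low, close) bar, walking back from close."""
    # Today's close comes from the daily distribution.
    log_price = math.log(prev_close) + daily_scale * student_t(rng, daily_alpha)
    path = [log_price]
    for _ in range(minutes):
        # Subtract one minutely log-profit to get the price a minute earlier.
        log_price -= minute_scale * student_t(rng, minute_alpha)
        path.append(log_price)
    prices = [math.exp(p) for p in path]
    # The last generated price is the earliest one, i.e. the open.
    return prices[-1], max(prices), min(prices), prices[0]


rng = random.Random(42)
# 450 minutes = 7.5 hours; the scales below are illustrative only.
bar_open, high, low, close = synthetic_bar(rng, 100.0, 3.0, 0.01, 2.5, 0.0005, 450)
print(bar_open, high, low, close)
```

Note that in this sketch the open is a by-product of the walk, so the overnight gap between the previous close and the open is implied rather than modeled directly.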
To have the ability to process minutely data, we need to extend `db.Date` to store the time of day in addition to the date. I'm thinking of adding `Hour`, `Minute` and `Second` fields of type `uint8`, leaving another byte for `Milliseconds` (why not?) and thereby doubling the size of the struct to 8 bytes. This shouldn't add too much to the memory and storage requirements, certainly not as much as the OHL prices added already.
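To make the size arithmetic concrete, here is a sketch using Python's `struct` module as a stand-in for the actual struct. The existing 4-byte layout (16-bit year, 8-bit month, 8-bit day) is an assumption; the real `db.Date` fields may differ.

```python
import struct

# Assumed current layout: year (uint16) + month (uint8) + day (uint8).
DATE_ONLY = struct.Struct("<HBB")
# Proposed extension: hour, minute, second as uint8 plus one spare byte.
DATE_TIME = struct.Struct("<HBBBBBB")

packed = DATE_TIME.pack(2023, 6, 1, 9, 30, 0, 0)
print(DATE_ONLY.size, DATE_TIME.size)
```

One caveat: a single byte only covers 0-255, so the spare byte could not hold a raw millisecond count (0-999) without rescaling.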
Next, let's add a default header to the headless CSV tables for `parfait-import`. This allows reading headless CSV tickers and prices which I happen to have lying around for minutely data.
A test import of minutely QQQ data worked well. Now I need to import ~100 most interesting stocks to get ~30M minutely samples and run them through the "distribution" experiment to derive a typical `alpha` of the Student's t-distribution. Preliminary run on QQQ gives `a=~2.5`.
Oh, and I also need to filter out overnight samples from the minutely data. Let's introduce an `Intraday` flag in the `Source` config which would indicate that log-profits spanning two days should be skipped.
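The intended effect of such a flag might look like this. The function and its `intraday` parameter are illustrative only and do not reflect the actual `Source` config schema.

```python
import math
from datetime import datetime


def log_profits(samples, intraday=False):
    """Log-profits between consecutive (timestamp, price) samples.

    With intraday=True, pairs whose timestamps fall on different calendar
    days are skipped, which drops the overnight jumps.
    """
    result = []
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if intraday and t0.date() != t1.date():
            continue
        result.append(math.log(p1 / p0))
    return result


samples = [
    (datetime(2023, 6, 1, 15, 58), 100.0),
    (datetime(2023, 6, 1, 15, 59), 100.1),
    (datetime(2023, 6, 2, 9, 30), 101.0),  # overnight gap before this sample
    (datetime(2023, 6, 2, 9, 31), 100.9),
]
print(len(log_profits(samples)), len(log_profits(samples, intraday=True)))
```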
Preliminary results with minutely data: 83 high-volume tickers, 2 years' worth of data, ~20M price points total, ~15M in-session points (9:30am - 4pm). The bullets below are for in-session data only.

- [...] `alpha = 2.5` (consistent with my previous experiments from 1-2 years ago).
- [...] `-1.9e-7`, which corresponds to -2% APY. For reference, NASDAQ Composite dropped 10% in these 2 years, so about -5% APY, which suggests that a large part of growth (even if negative) happens between sessions.

The last two points repeat similar observations with daily data. This confirms my suspicion that the pattern likely holds at all timeframes.
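For reference, the APY figure can be reproduced with a short calculation, reading `-1.9e-7` as a mean per-minute log-profit (an assumption, since the label of that bullet did not survive) and assuming 252 trading sessions per year of 390 minutes each (9:30am - 4pm).

```python
import math

MINUTES_PER_SESSION = 390  # 9:30am to 4pm
SESSIONS_PER_YEAR = 252    # assumed typical number of US trading days


def annualized(mean_minutely_log_profit):
    """Convert a mean per-minute log-profit into an annual yield."""
    yearly_log = mean_minutely_log_profit * MINUTES_PER_SESSION * SESSIONS_PER_YEAR
    return math.exp(yearly_log) - 1.0


print(f"{annualized(-1.9e-7):.2%}")
```

Under these assumptions the result is a little under -2% per year, in line with the figure quoted above.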
This, in fact, should be enough to generate synthetic OHL prices using a minutely generator: just use `alpha=2.5` starting from `close` and walking backwards towards `open`. For simplicity we can assume `mean=0` for minutely data, which seems to reflect reality.
I'm extracting the `Source` extension into a separate issue #130.
Some preliminary experiments with `open[t+1]/close[t]` vs. `close[t+1]/close[t]` log-profit distributions for NASDAQ Composite index:
Normalized distributions (by `close/close`) over all the liquid stocks:
Next, implement a strategy simulator with the following strategies:

- [...] `open`, sell either at a threshold `open*T` or at `close`; plot the expected gain for varying `T`, compare with buy-and-hold;
- [...] `close[t]`, sell the next day at `close[t]*T` or keep and sell at `close[t+1]*T`, and so on.
- [...] `open` (or previous `close`) and sell at a stop-loss.

It may be a good idea to think of a relatively generic strategy config which independently sets conditions for buy and sell.
- [...] `open`" or "buy at `close`" (since we assume no historical dependency).
- [...] `open` or `close`", "limit at price", "stop loss at price or percentage". All the conditions will be scanned at each intraday bar; the first one that applies will be executed.

I'm going to implement a separate `simulator` experiment to test actual strategies - #132. I'm not yet sure what to do with this trading experiment as such, perhaps it should be folded into the `distribution` experiment to study the various intraday distributions and their modeling.
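A rough sketch of what "scan all sell conditions at each intraday bar and execute the first that applies" could look like. All names here (`SellCondition`, `limit`, `stop_loss`, `simulate`) are hypothetical and not the actual config of #132; price gaps through a limit or stop level are ignored for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Bar = Tuple[float, float, float, float]  # (open, high, low, close)


@dataclass
class SellCondition:
    name: str
    # Returns the execution price if the condition fires on this bar, else None.
    check: Callable[[Bar, float], Optional[float]]


def limit(target_ratio):
    """Sell at buy_price * target_ratio once a bar's high reaches it."""
    def check(bar, buy_price):
        price = buy_price * target_ratio
        return price if bar[1] >= price else None
    return SellCondition("limit", check)


def stop_loss(stop_ratio):
    """Sell at buy_price * stop_ratio once a bar's low falls to it."""
    def check(bar, buy_price):
        price = buy_price * stop_ratio
        return price if bar[2] <= price else None
    return SellCondition("stop-loss", check)


def simulate(bars, buy_price, conditions):
    """Scan conditions in order at each bar; fall back to the final close."""
    for bar in bars:
        for cond in conditions:
            price = cond.check(bar, buy_price)
            if price is not None:
                return cond.name, price
    return "close", bars[-1][3]


bars = [
    (100.0, 100.5, 99.8, 100.2),
    (100.2, 102.5, 100.1, 102.0),  # high crosses the 2% limit here
    (102.0, 102.2, 97.0, 98.0),
]
print(simulate(bars, 100.0, [limit(1.02), stop_loss(0.98)]))
```

Keeping each condition as an independent object is one way to let the buy and sell sides be configured separately, as suggested above.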
Testing the simulator on the "buy-sell intraday" strategy on synthetic data with 0 mean (no inherent growth, only volatility) yields basically zero profit no matter how I set up the day trading strategy - buy & hold (for reference; it obviously won't do any good), target sell, stop-loss - fixed or trailing, selling at close or keeping overnight, etc. Any combination invariably leads to the same result: no profit.
I'm coming back to the same conclusion: any actual profit comes from the average, inherent growth of the stock value. The only question is, can we somehow protect the value from crashes and/or recover faster? This is really the fundamental question of a "safe haven" strategy.
Consider a trading strategy which attempts to make money on volatility but without prediction:
The goal is to see if there is any strategy that would do better than buy-and-hold.
Study the distribution of log-profits for `high/open` and `close/open` and see if it may help pick good parameters.