Open AlexCatarino opened 1 year ago
spy_qqq_cov.csv columns: "SPY close" "QQQ close" "annual covariance"
script:
import pandas as pd
history_spy = pd.read_csv("https://github.com/QuantConnect/Lean/raw/master/Data/equity/usa/daily/spy.zip",
                          index_col=0, names=["open", "high", "low", "close", "volume"])
history_qqq = pd.read_csv("https://github.com/QuantConnect/Lean/raw/master/Data/equity/usa/daily/qqq.zip",
                          index_col=0, names=["open", "high", "low", "close", "volume"])
close = history_spy.close
# Assumes no risk free rate
history2 = pd.concat([close, history_qqq.close], axis=1).dropna()
history2.columns = ["spy", "qqq"]
cov = history2.rolling(252).cov().unstack(1)["spy"]["qqq"].dropna().to_frame()
cov.columns = ["cov"]
cov["spy close"] = close
cov["qqq close"] = history2.iloc[:, 1]
cov = cov.iloc[:, [1, 2, 0]].dropna()
cov.to_csv("spy_qqq_cov.csv", header=False)
Hi @LouisSzeto, I've tried running your Python script, but I get quite different output from either of the files you posted: spy_qqq_cov.csv
It's not just the last column that's different but the ones sourced from the history zips too.
I have looked at other PRs for guidance but am a bit stuck. Would love to finish implementing this indicator. Any pointers to get my output the same as yours would be greatly appreciated and enable me to get my tests passing and submit my PR.
Hi @stevespringer! Sorry for the confusion. I actually ran the script in LEAN with qb.History(...) instead of pd.read_csv, so the prices are not the same: LEAN uses adjusted prices and a different number of decimal places, and covariance is sensitive to scale. I also made a mistake, since we should use the change in price instead of the price itself (then scale doesn't matter here). Here are the corrected script and data.
import pandas as pd
history_spy = pd.read_csv("https://github.com/QuantConnect/Lean/raw/master/Data/equity/usa/daily/spy.zip",
                          index_col=0, names=["open", "high", "low", "close", "volume"])
history_qqq = pd.read_csv("https://github.com/QuantConnect/Lean/raw/master/Data/equity/usa/daily/qqq.zip",
                          index_col=0, names=["open", "high", "low", "close", "volume"])
close = history_spy.close
# Assumes no risk free rate
history2 = pd.concat([close, history_qqq.close], axis=1).dropna()
history2.columns = ["spy", "qqq"]
cov = history2.pct_change().rolling(252).cov().unstack(1)["spy"]["qqq"].dropna().to_frame()
cov.columns = ["cov"]
cov["spy close"] = close
cov["qqq close"] = history2.iloc[:, 1]
cov = cov.iloc[:, [1, 2, 0]].dropna()
cov.to_csv("spy_qqq_cov.csv", header=False)
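To illustrate why the switch to price changes matters, here is a minimal sketch on synthetic data (not LEAN data). It assumes a single constant scale factor as a stand-in for price adjustment, which is a simplification; real adjustments vary over time. The helper name rolling_cov is made up for this example.

```python
import numpy as np
import pandas as pd

# Synthetic prices, generated only for illustration (not LEAN data).
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(300, 2))
prices = pd.DataFrame(100 * np.cumprod(1 + returns, axis=0), columns=["spy", "qqq"])

# Assumption: a constant factor stands in for an adjusted price series.
scaled = prices * 0.5


def rolling_cov(frame, window=252):
    """Rolling sample covariance between the 'spy' and 'qqq' columns."""
    return frame["spy"].rolling(window).cov(frame["qqq"]).dropna()


# Covariance of raw prices changes with the price scale (by the factor squared).
cov_price = rolling_cov(prices)
cov_price_scaled = rolling_cov(scaled)

# Covariance of percentage changes is unaffected by a constant scale factor.
cov_ret = rolling_cov(prices.pct_change())
cov_ret_scaled = rolling_cov(scaled.pct_change())

print(np.allclose(cov_price_scaled, 0.25 * cov_price))
print(np.allclose(cov_ret_scaled, cov_ret))
```

Under these assumptions, the price-based covariance shrinks by the square of the scale factor, while the return-based covariance stays the same, which is why two differently scaled price sources can only be expected to agree on the latter.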
Thanks very much @LouisSzeto !
Expected Behavior
Lean supports Covariance as a Lean Indicator which takes a Symbol object, a benchmark Symbol, and a look-back period.
Actual Behavior
Not supported. See #83
Potential Solution
Implement this indicator. Lean currently implements it internally in the Beta indicator: Beta.cs#L160.
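As a rough illustration of what such an indicator computes, here is a minimal Python sketch of a rolling covariance over daily returns. The class name RollingCovariance and its update method are hypothetical; this is not Lean's indicator API or the code in Beta.cs. It assumes sample covariance (dividing by period - 1, the same default pandas uses) and simple percentage returns.

```python
from collections import deque


class RollingCovariance:
    """Hypothetical sketch: sample covariance of two return series over a fixed window."""

    def __init__(self, period):
        if period < 2:
            raise ValueError("period must be at least 2")
        self.period = period
        self._x = deque(maxlen=period)  # target returns
        self._y = deque(maxlen=period)  # benchmark returns
        self._prev = None               # previous pair of closes

    @property
    def is_ready(self):
        return len(self._x) == self.period

    def update(self, target_close, benchmark_close):
        """Feed one pair of closes; return the covariance, or None while warming up."""
        if self._prev is not None:
            prev_t, prev_b = self._prev
            self._x.append(target_close / prev_t - 1.0)
            self._y.append(benchmark_close / prev_b - 1.0)
        self._prev = (target_close, benchmark_close)
        if not self.is_ready:
            return None
        mean_x = sum(self._x) / self.period
        mean_y = sum(self._y) / self.period
        total = sum((a - mean_x) * (b - mean_y) for a, b in zip(self._x, self._y))
        return total / (self.period - 1)


# Usage with made-up closes; the first `period` updates only warm up the window.
indicator = RollingCovariance(period=3)
target = [100.0, 102.0, 101.0, 105.0, 107.0]
benchmark = [50.0, 51.0, 50.5, 53.0, 54.0]
values = [indicator.update(t, b) for t, b in zip(target, benchmark)]
print(values)
```

Because the window holds returns rather than prices, it needs period + 1 closes before the first value is available. A test against spy_qqq_cov.csv would compare each value with the third column produced by the corrected script.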