khrapovs / vix

Compute VIX and related volatility indices

Failed to run the notebook with Python 3.12 and pandas 2.2.2 #28

Closed bg1szd closed 7 months ago

bg1szd commented 7 months ago

The error happened in this line: `options4['dK'] = options4.groupby(level = ['Date','Days'])['Strike'].apply(compute_adjoining_strikes_diff)`

Exception has occurred: `ValueError: cannot handle a non-unique multi-index!`

`options4.shape` is `(246, 3)`, and `options4.index` is a MultiIndex like this:

`MultiIndex([('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ('2009-01-01', 9), ... ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37), ('2009-01-01', 37)], names=['Date', 'Days'], length=246)`

So any idea on this issue? Thanks!

khrapovs commented 7 months ago

Thanks for reporting the issue. I will have time to look at it on the weekend. Sorry if that is too long a wait, but this is the best I can do at the moment...

bg1szd commented 7 months ago

Sure. Thanks. It seems the dataframe returned by `options4.groupby(level = ['Date','Days'])['Strike'].apply(compute_adjoining_strikes_diff)` looks like the output below.

The `Date` and `Days` index levels each appear twice... I am investigating whether there is a workaround to remove the duplicated levels.

                                   dK
Date       Days Date       Days
2009-01-01 9    2009-01-01 9     25.0
                           9     25.0
                           9     22.5
                           9     12.5
                           9      5.0
...                               ...
           37   2009-01-01 37     5.0
                           37     5.0
                           37     5.0
                           37     5.0
                           37     5.0

[246 rows x 1 columns]
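
The repeated `Date`/`Days` levels are consistent with how `groupby(...).apply` handles `group_keys`: with `group_keys=True` (the default since pandas 2.0, according to the pandas documentation) the group keys are prepended to the index even when the applied function returns a Series indexed like its input. Below is a minimal sketch on made-up data; `toy` and `halve` are hypothetical stand-ins, not names from the notebook.

```python
import pandas as pd

# Hypothetical toy frame with the same kind of non-unique (Date, Days) index.
index = pd.MultiIndex.from_tuples(
    [("2009-01-01", 9)] * 3 + [("2009-01-01", 37)] * 3,
    names=["Date", "Days"],
)
toy = pd.DataFrame(
    {"Strike": [100.0, 125.0, 150.0, 200.0, 205.0, 210.0]}, index=index
)


def halve(group: pd.Series) -> pd.Series:
    """Stand-in for compute_adjoining_strikes_diff: returns a same-indexed Series."""
    return group / 2


levels = ["Date", "Days"]

# group_keys=True prepends the group keys to the result's index.
grouped_with_keys = toy.groupby(level=levels, group_keys=True)["Strike"].apply(halve)

# group_keys=False keeps the result on the original two-level index.
grouped_no_keys = toy.groupby(level=levels, group_keys=False)["Strike"].apply(halve)

print(list(grouped_with_keys.index.names))
print(list(grouped_no_keys.index.names))

# Dropping the two prepended levels recovers the original index.
stripped = grouped_with_keys.droplevel([0, 1])
```

If this matches what the notebook sees, either passing `group_keys=False` or dropping the first two index levels should give a result that lines up with `options4` again.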

bg1szd commented 7 months ago

I did a workaround here to get the expected dataframe.

```python
import numpy as np
import pandas as pd


def calc_diff_between_adjoining_strikes(options4):

    def compute_adjoining_strikes_diff(group):
        new = group.copy()
        # Central difference for interior strikes, one-sided difference at both ends.
        new.iloc[1:-1] = np.array((group.iloc[2:] - group.iloc[:-2]) / 2)
        new.iloc[0] = group.iloc[1] - group.iloc[0]
        new.iloc[-1] = group.iloc[-1] - group.iloc[-2]
        return new

    # Original line, which fails for me on pandas 2.2.2:
    # options4['dK'] = options4.groupby(level = ['Date','Days'])['Strike'].apply(compute_adjoining_strikes_diff)
    options4_copy = options4.copy()
    options4_dk = pd.DataFrame()
    options4_dk['dK'] = options4.groupby(level=['Date', 'Days'])['Strike'].apply(compute_adjoining_strikes_diff)
    # The result carries Date and Days twice; move the duplicated pair out of the index.
    options4_dk.reset_index(level=2, inplace=True)
    options4_dk.reset_index(level=2, inplace=True)
    options4_dk_series = pd.Series(options4_dk['dK'])
    options4_copy['dK'] = options4_dk_series

    return options4_copy
```
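
A shorter variant of the same idea, offered as a sketch and not as what the notebook ended up using: passing `group_keys=False` to `groupby` should keep the result on the original two-level index, so it can be assigned directly. The toy frame `options` is made up for illustration, and the differences go through `.to_numpy()` to sidestep index alignment, which is my own assumption and not taken from the notebook.

```python
import pandas as pd


def compute_adjoining_strikes_diff(group: pd.Series) -> pd.Series:
    """Central difference for interior strikes, one-sided difference at the ends."""
    values = group.to_numpy()
    new = group.copy()
    new.iloc[1:-1] = (values[2:] - values[:-2]) / 2
    new.iloc[0] = values[1] - values[0]
    new.iloc[-1] = values[-1] - values[-2]
    return new


def calc_diff_between_adjoining_strikes(options: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `options` with a 'dK' column added."""
    out = options.copy()
    # group_keys=False: do not prepend Date/Days to the result's index.
    out["dK"] = out.groupby(level=["Date", "Days"], group_keys=False)["Strike"].apply(
        compute_adjoining_strikes_diff
    )
    return out


# Hypothetical data: two expiries on one date, strikes sorted within each group.
index = pd.MultiIndex.from_tuples(
    [("2009-01-01", 9)] * 4 + [("2009-01-01", 37)] * 3,
    names=["Date", "Days"],
)
options = pd.DataFrame(
    {"Strike": [100.0, 125.0, 150.0, 155.0, 200.0, 205.0, 210.0]}, index=index
)

result = calc_diff_between_adjoining_strikes(options)
print(result)
```

This assumes every `(Date, Days)` group has at least two strikes and that the rows are already sorted by group, as they appear to be in `options4`.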
khrapovs commented 7 months ago

@bg1szd I have updated the notebook. It runs now. Please feel free to reopen the issue if the problem persists. Thanks for pushing me to maintain it!