quantopian / alphalens

Performance analysis of predictive (alpha) stock factors
http://quantopian.github.io/alphalens
Apache License 2.0

A way to solve ValueError: Inferred frequency None from passed values does not conform to passed frequency C #371

Open wukan1986 opened 4 years ago

wukan1986 commented 4 years ago

I found a way to work around the ValueError that appears after I resample the dates:

factor = factor.resample('M').last()
# ValueError: Inferred frequency None from passed values does not conform to passed frequency C
def _validate_frequency(cls, index, freq, **kwargs):
    return None

pd.core.arrays.datetimelike.DatetimeLikeArrayMixin._validate_frequency = _validate_frequency
# <MonthEnd>
factor.index.levels[0].freq
factor_data = alphalens.utils \
    .get_clean_factor_and_forward_returns(factor, prices,
                                          groupby=groupby,
                                          binning_by_group=binning_by_group,
                                          quantiles=quantiles, bins=bins,
                                          periods=periods,
                                          filter_zscore=filter_zscore,
                                          groupby_labels=groupby_labels)
# set freq to C
factor_data.index.levels[0].freq = 'C'
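
Note that the assignment above replaces the validation hook for the rest of the Python session. A minimal sketch of a scoped alternative, using only the standard library, is shown here. `patched_attr`, `Validator` and `_skip_validation` are hypothetical names made up for illustration; `Validator` is a stand-in and not a pandas class.

```python
import contextlib


@contextlib.contextmanager
def patched_attr(owner, name, replacement):
    """Temporarily replace owner.<name>; restore the previous state on exit."""
    missing = object()
    # Read from the owner's own namespace so descriptors such as
    # classmethod objects are saved and restored unchanged.
    original = vars(owner).get(name, missing)
    setattr(owner, name, replacement)
    try:
        yield
    finally:
        if original is missing:
            delattr(owner, name)  # attribute was inherited: drop our override
        else:
            setattr(owner, name, original)


# Stand-in class for illustration only (hypothetical, not part of pandas).
class Validator:
    @classmethod
    def _validate_frequency(cls, index, freq, **kwargs):
        raise ValueError("frequency mismatch")


def _skip_validation(cls, index, freq, **kwargs):
    return None


# Inside the with-block the check is skipped.
with patched_attr(Validator, "_validate_frequency", classmethod(_skip_validation)):
    inside = Validator._validate_frequency([1, 2], "C")

# After the with-block the original check is active again.
try:
    Validator._validate_frequency([1, 2], "C")
    restored = False
except ValueError:
    restored = True
```

With pandas, the owner would be the class patched in the post above, assuming the installed pandas version still exposes `_validate_frequency` there. If the original is a classmethod, wrapping the replacement in `classmethod(...)` keeps the call signature the same whether it is called on the class or on an instance.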

tfrojd commented 4 years ago

Thanks for sharing!

create_returns_tear_sheet() now seems very slow for me, though, and gives me the following warning:

/opt/conda/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py:837: PerformanceWarning: Non-vectorized DateOffset being applied to Series or DatetimeIndex
  PerformanceWarning,

alexandrnikitin commented 4 years ago

@wukan1986 Is that related to pandas 1.0.0 or something else?

wukan1986 commented 4 years ago

@alexandrnikitin pandas>=0.25

vista852 commented 4 years ago

It works, thanks a lot!

tongxinw commented 3 years ago

Thanks for sharing! I have another question about resampling. resample().last() keeps only one row for each month. What if I have multiple rows for each month-end and would like to keep all of them?

wukan1986 commented 3 years ago

df.resample('M').apply(lambda x: x.tail(2))

tongxinw commented 3 years ago

Thanks for your reply! As I understand it, tail(2) would keep the bottom two rows for a given date, right? What if my data contains only month-end prices and factors and I would like to keep all of them? Say I have 1000 assets for 2000-01-31 and thousands of assets for the other months as well. I don't think resample('M') would work in this case. Please feel free to correct me.

wukan1986 commented 3 years ago

series.unstack().resample()
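
A minimal sketch of the unstack / resample / stack round trip suggested above, assuming a factor Series indexed by (date, asset). The toy data and the level names `date` and `asset` are made up for illustration, and an offset object is passed to resample() in place of the 'M' string alias used earlier in the thread.

```python
import pandas as pd

# Toy long-format factor indexed by (date, asset); all values are made up.
dates = pd.to_datetime(["2000-01-28", "2000-01-31", "2000-02-28", "2000-02-29"])
index = pd.MultiIndex.from_product([dates, ["A", "B"]], names=["date", "asset"])
factor = pd.Series([float(i) for i in range(len(index))], index=index, name="factor")

# Wide layout: one row per date, one column per asset.
wide = factor.unstack("asset")

# Resample on the date index only; every asset column is carried along,
# so all assets survive for each month.
monthly_wide = wide.resample(pd.offsets.MonthEnd()).last()

# Back to a Series with a (date, asset) MultiIndex.
monthly_factor = monthly_wide.stack()
```

Because the resampling acts on the date index of the wide frame, each month keeps one row per asset instead of one row in total. The resulting dates are calendar month-ends, which may not coincide with the trading days present in the price data.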

Karlish-OMG commented 2 years ago

@wukan1986 @luca-s Thanks for sharing. But this method seems to only change the freq from None to 'M'. I applied it to my minute-frequency data like this:

factor_df0=factor_df.unstack().resample('T').last()
factor_df=factor_df0.stack()

price_df=price_df.resample('T').last()
price_df.index.freq

but the warning still arises: Inferred frequency T from passed values does not conform to passed frequency C

delsonlizelin commented 2 years ago

Thank you. This works for me.

AronTian2018 commented 1 year ago

You can use a date list to filter out the dates that leave the index with freq=None, for example: date_list = df.date.groupby('date').min()

Then filter the factor data (before passing it to alphalens) with: factors.query('date in @date_list')

After that, get_clean_factor_and_forward_returns() works.
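
A minimal sketch of this kind of date-list filter, under the assumption that the goal is to keep every asset on the last available date of each month. The toy data and names are made up, and the month-end rule is one possible choice of date list, not necessarily the one intended above.

```python
import pandas as pd

# Toy factor indexed by (date, asset); all values are made up.
dates = pd.to_datetime(["2000-01-28", "2000-01-31", "2000-02-28", "2000-02-29"])
index = pd.MultiIndex.from_product([dates, ["A", "B"]], names=["date", "asset"])
factor = pd.Series([float(i) for i in range(len(index))], index=index, name="factor")

# Date list: the last available date within each calendar month.
all_dates = factor.index.get_level_values("date")
unique_dates = all_dates.unique()
date_list = (
    pd.Series(unique_dates, index=unique_dates)
    .groupby(unique_dates.to_period("M"))
    .max()
)

# Keep every asset row whose date appears in the date list.
filtered_factor = factor[all_dates.isin(date_list.values)]
```

Since the kept dates come from the data itself, they stay aligned with dates that actually exist in the price data, and every asset on those dates is preserved.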

AronTian2018 commented 1 year ago

factor_df0=factor_df.unstack().resample('T').last()
factor_df=factor_df0.stack()

price_df=price_df.resample('T').last()
price_df.index.freq

This works.

gamaiun commented 1 year ago

Works like a charm! Thank you!