Open wukan1986 opened 4 years ago
Thanks for sharing!
create_returns_tear_sheet() now seems very slow for me, though, and gives me the following warning:
/opt/conda/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py:837: PerformanceWarning: Non-vectorized DateOffset being applied to Series or DatetimeIndex
  PerformanceWarning,
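For anyone who only wants to quiet that message while investigating, here is a minimal sketch. It does not make anything faster; it only hides the PerformanceWarning around one operation. The helper name add_offset_quietly and the toy dates are made up for illustration, and CustomBusinessDay is used only as an example of an offset that is added to a DatetimeIndex.

```python
import warnings

import pandas as pd


def add_offset_quietly(index, offset):
    """Add an offset to a DatetimeIndex with PerformanceWarning silenced.

    Hypothetical helper: it hides the warning, it does not remove the cost.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
        return index + offset


# Wednesday, Thursday, Friday
idx = pd.date_range("2020-01-01", periods=3, freq="D")
shifted = add_offset_quietly(idx, pd.offsets.CustomBusinessDay(1))
```

The filter is scoped to the with block, so warnings raised elsewhere stay visible.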
@wukan1986 Is that related to pandas 1.0.0 or something else?
@alexandrnikitin pandas>=0.25
It works, thanks a lot!
Thanks for sharing! I have another question while using resampling. resample.last() only keeps 1 row for each month, what if I have multiple rows for each month-end and I would like to keep all of them?
df.resample().apply(lambda x:x.tail(2))
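A minimal sketch of the keep-the-last-n-rows-per-month idea on a toy frame. Note that it swaps in a groupby on the month period instead of resample().apply(), because GroupBy.tail() hands back the original rows with their original index. The frame and the column name price are invented for illustration.

```python
import pandas as pd

# Toy daily data: three rows in January, two in February.
idx = pd.to_datetime(
    ["2000-01-03", "2000-01-28", "2000-01-31", "2000-02-01", "2000-02-29"]
)
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Group rows by calendar month and keep the last two rows of each month.
last_two = df.groupby(df.index.to_period("M")).tail(2)
```

Changing tail(2) to tail(n) keeps the last n rows of each month; it still caps the number of rows per month, so it does not by itself keep every asset on a month-end date.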
Thanks for your reply! In my understanding, tail(2) would keep the bottom two rows of a certain date, right? What if my data contains only month-end prices and factors and I would like to keep all of them? Say I have 1000 assets for 2000-01-31 and thousands of assets for other months as well. I don't think resample('M') would work in this way, though. Please feel free to correct me.
series.unstack().resample()
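A sketch of how that unstack-then-resample suggestion might look when every date carries many assets. Assumptions: the factor is a Series with a (date, asset) MultiIndex, the level names and toy values are invented here, and pd.offsets.MonthEnd() stands in for the 'M' alias used elsewhere in this thread.

```python
import pandas as pd

# Toy month-end factor values for three assets on three dates.
dates = pd.to_datetime(["2000-01-31", "2000-02-29", "2000-03-31"])
assets = ["A", "B", "C"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor = pd.Series(range(9), index=index, dtype="float64", name="factor")

# One row per date, one column per asset.
wide = factor.unstack("asset")

# Resampling the wide frame keeps every asset column for each month.
monthly = wide.resample(pd.offsets.MonthEnd()).last()

# Back to the long (date, asset) layout; drop cells with no data.
factor_monthly = monthly.stack().dropna()
```

Because each asset sits in its own column, last() picks one value per asset per month rather than one row per month, so all assets on a month-end date survive.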
I found a way to solve the ValueError. I resample the dates:
factor = factor.resample('M').last()
# ValueError: Inferred frequency None from passed values does not conform to passed frequency C
def _validate_frequency(cls, index, freq, **kwargs):
    return None
pd.core.arrays.datetimelike.DatetimeLikeArrayMixin._validate_frequency = _validate_frequency
# <MonthEnd>
factor.index.levels[0].freq
factor_data = alphalens.utils \
    .get_clean_factor_and_forward_returns(factor, prices,
                                          groupby=groupby,
                                          binning_by_group=binning_by_group,
                                          quantiles=quantiles,
                                          bins=bins,
                                          periods=periods,
                                          filter_zscore=filter_zscore,
                                          groupby_labels=groupby_labels)
# set freq to C
factor_data.index.levels[0].freq = 'C'
@wukan1986
@luca-s
Thanks for sharing. But this method seems to only change the freq from None to 'M'. I applied it to my minute-frequency data, like:
factor_df0=factor_df.unstack().resample('T').last()
factor_df=factor_df0.stack()
price_df=price_df.resample('T').last()
price_df.index.freq
but the warning still arises: Inferred frequency T from passed values does not conform to passed frequency C
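To see what the resample step actually does to the index frequency, here is a small sketch on made-up minute bars with one missing minute. It uses the 'min' alias, which newer pandas releases prefer over 'T'; the column name and timestamps are invented for illustration, and nothing here involves alphalens itself.

```python
import pandas as pd

# Toy minute bars with a gap at 09:32.
idx = pd.to_datetime(
    ["2020-01-02 09:30", "2020-01-02 09:31", "2020-01-02 09:33"]
)
price_df = pd.DataFrame({"A": [10.0, 10.1, 10.3]}, index=idx)

# An index built from raw timestamps carries no frequency.
freq_before = price_df.index.freq

# Resampling produces a regular minute grid and sets the frequency.
resampled = price_df.resample("min").last()
freq_after = resampled.index.freq
```

The resampled index is tagged with a one-minute frequency and the missing bar shows up as a NaN row, which is consistent with the message above changing from "Inferred frequency None" to "Inferred frequency T".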
Thank you. This works for me.
You can use a date list to filter out the dates that have freq=None, such as: date_list = df.date.groupby('date').min()
Then filter the factor data (before it is passed to alphalens) with: factors.query('date in @date_list')
After that, get_clean_factor_and_forward_returns() works.
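A rough sketch of that date-list filter, under one reading of the suggestion: keep only the factor rows whose date also appears in the price index. That reading, the toy data, and the level names date and asset are assumptions. An isin() mask is used here; it should select the same rows as the query() call above when the index level is named date.

```python
import pandas as pd

# Toy prices for two dates.
price_dates = pd.to_datetime(["2020-01-02", "2020-01-03"])
prices = pd.DataFrame({"A": [1.0, 1.1], "B": [2.0, 2.1]}, index=price_dates)

# Toy factor values for three dates; 2020-01-04 has no matching price row.
factor_dates = pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-04"])
factor_index = pd.MultiIndex.from_product(
    [factor_dates, ["A", "B"]], names=["date", "asset"]
)
factors = pd.Series(
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6], index=factor_index, name="factor"
)

# Assumed meaning of the date list: the dates present in the price data.
date_list = prices.index

# Keep only factor rows whose date is in the list.
mask = factors.index.get_level_values("date").isin(date_list)
filtered = factors[mask]
```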
factor_df0=factor_df.unstack().resample('T').last()
factor_df=factor_df0.stack()
price_df=price_df.resample('T').last()
price_df.index.freq
this works
Works like a charm! Thank you!