SimFin / simfin-tutorials

Tutorials for SimFin - Simple financial data for Python
https://simfin.com/

Masking on specific date #8

Closed freekeys closed 3 years ago

freekeys commented 3 years ago

Hello,

Sorry if this is a basic question! I have a MultiIndex df similar to the one in the Screener tutorial. I want to add a date mask for the rows where Date == current_date, where current_date is, for example, datetime.datetime(2020, 7, 28).

I'd also like to do the same with an additional Ticker filter, for example Date == current_date and Ticker == 'AAPL'.

When I try setting this up I am struggling with different errors. I've tried a few things. Could you point me in the right direction please? Many thanks!

Here's what I've tried:


date_limit = datetime.now() - timedelta(days=90)
mask_date_limit = (df_all_signals.reset_index(DATE)[DATE] == date_limit)
mask = (df_all_signals[CURRENT_RATIO] > mask_current_ratio)
mask &= (df_all_signals[ROE] > mask_roe)
mask &= mask_date_limit

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-32-32feb6bad999> in <module>()
     13 mask &= (df_all_signals[ROA] > mask_roa)
     14 mask &= (df_all_signals[ROE] > mask_roe)
---> 15 mask &= mask_date_limit

9 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in _join_level(self, other, level, how, return_indexers, keep_order)
   3692         if not right.is_unique:
   3693             raise NotImplementedError(
-> 3694                 "Index._join_level on non-unique index is not implemented"
   3695             )
   3696 

NotImplementedError: Index._join_level on non-unique index is not implemented

thf24 commented 3 years ago

Hi,

I am not a total Pandas wizard myself, but I think you have to adjust two things: reset the index on all masks and use boolean indexing on the dataframe that you want to filter. So something like:

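from datetime import datetime, timedelta

# Truncate to midnight (assuming the Date level holds midnight timestamps); datetime.now() carries a time of day and would never match.
date_limit = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0) - timedelta(days=90)
# Reset the index once so every mask shares the same unique integer index.
df_flat = df_all_signals.reset_index()
mask = (df_flat[DATE] == date_limit)
mask &= (df_flat[CURRENT_RATIO] > mask_current_ratio)
mask &= (df_flat[ROE] > mask_roe)

# The mask is positional, so pass plain values when indexing the original frame.
df_filtered = df_all_signals[mask.to_numpy()]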

Let me know if this worked.
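
An alternative that avoids resetting the index is to build the masks from the index levels themselves. The following is only a minimal sketch on a toy DataFrame, not the tutorial's data: the level names "Ticker" and "Date", the column name "Current Ratio", and all values are assumptions made for illustration. It also covers the Date plus Ticker case from the question.

```python
import datetime

import pandas as pd

# Toy stand-in for df_all_signals: a (Ticker, Date) MultiIndex with made-up values.
dates = pd.to_datetime(["2020-07-27", "2020-07-28"])
index = pd.MultiIndex.from_product(
    [["AAPL", "MSFT"], dates], names=["Ticker", "Date"]
)
df = pd.DataFrame({"Current Ratio": [1.5, 1.6, 2.5, 2.6]}, index=index)

current_date = datetime.datetime(2020, 7, 28)

# Comparing an index level with a scalar gives a boolean array with one entry
# per row, in row order, so no index alignment (and no join) is needed.
mask_date = df.index.get_level_values("Date") == current_date
mask_ticker = df.index.get_level_values("Ticker") == "AAPL"

# All tickers on the chosen date.
df_on_date = df[mask_date]

# Only AAPL on the chosen date.
df_aapl_on_date = df[mask_date & mask_ticker]

# Index-level masks combine with ordinary column masks as well.
df_combined = df[(df["Current Ratio"] > 2.0) & mask_date]
```

Because the level masks are plain boolean arrays rather than Series with their own index, combining them with a column mask should not trigger the join that raised the error above.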

thf24 commented 3 years ago

Closing this. Please open a new issue if you still have problems.