quantopian / alphalens

Performance analysis of predictive (alpha) stock factors
http://quantopian.github.io/alphalens
Apache License 2.0

invalid index to scalar variable @ mode(days_diffs).mode[0] #404

Open BCAA50000 opened 8 months ago

BCAA50000 commented 8 months ago

Problem Summary: I tried code from several different authors and always hit the same error. Error trace: alphalens.utils -> get_clean_factor_and_forward_returns -> compute_forward_returns, at mode(days_diffs).mode[0]: IndexError: invalid index to scalar variable

```python
# -*- coding: utf-8 -*-
import random
import warnings

import alphalens
import pandas as pd

warnings.filterwarnings('ignore')

if __name__ == '__main__':
    # Simulated sequence of trading dates
    trade_date_ls = pd.date_range('2010-01-01', '2020-03-31').tolist()
    # Simulated sequence of stock codes, zero-padded to six digits
    stock_id_ls = [f"{i:06d}.SZ" for i in range(2000)]

    # Input factor matrix: one random value per (date, stock) pair
    factor_ls = []
    for trade_date in trade_date_ls:
        for stock_id in stock_id_ls:
            factor_ls.append([trade_date, stock_id, random.random() / 100])
    factor = pd.DataFrame(factor_ls, columns=['trade_date', 'stock_id', 'factor1'])
    factor = factor.set_index(['trade_date', 'stock_id'])

    # Input price matrix: one row per date, one column per stock
    prices_ls = []
    for trade_date in trade_date_ls:
        tmp = [random.random() / 100 for _ in range(len(stock_id_ls))]
        tmp.append(trade_date)
        prices_ls.append(tmp)
    prices = pd.DataFrame(prices_ls, columns=stock_id_ls + ['trade_date'])
    prices = prices.set_index(['trade_date'])

    # periods: rebalancing periods
    # bins: number of groups
    input_df = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, periods=(1, 5, ), bins=10, quantiles=None)

    alphalens.tears.create_information_tear_sheet(input_df)
    alphalens.tears.create_returns_tear_sheet(input_df)
```

Traceback:

    Traceback (most recent call last):
      File "D:\Quant\Projects\ALPHALENS_TEST\CSDN范例.py", line 36, in <module>
        input_df = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, periods=(1, 5, ), bins=10, quantiles=None)
      File "D:\Quant\Projects\ALPHALENS_TEST\venv\lib\site-packages\alphalens\utils.py", line 827, in get_clean_factor_and_forward_returns
        forward_returns = compute_forward_returns(
      File "D:\Quant\Projects\ALPHALENS_TEST\venv\lib\site-packages\alphalens\utils.py", line 319, in compute_forward_returns
        delta_days = period_len.components.days - mode(days_diffs).mode[0]
    IndexError: invalid index to scalar variable.



**Please provide any additional information below:**
The same code runs in my friend's environment.

## Versions
Python 3.10
alphalens 0.4.0
pandas 1.5.3
numpy 1.23.1

JiwenZ commented 8 months ago

Set `keepdims=True`: `delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0]`
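A likely explanation for why this helps, though nobody in the thread confirms it: from SciPy 1.11 onward, `scipy.stats.mode` defaults to `keepdims=False`, so for 1-D input `.mode` is a scalar and indexing it with `[0]` raises the `IndexError`. Passing `keepdims=True` (accepted since SciPy 1.9) brings back the length-1 array. The sketch below shows a tolerant alternative. `first_mode` is a hypothetical helper, not part of alphalens or SciPy, and the `SimpleNamespace` objects are stand-ins for SciPy's result object so the snippet runs without SciPy installed.

```python
from types import SimpleNamespace


def first_mode(result):
    """Return the modal value as a scalar from a mode-style result object."""
    value = result.mode
    try:
        # Array-like result (older SciPy, or keepdims=True): take element 0.
        return value[0]
    except (IndexError, TypeError):
        # Scalar result (newer SciPy default): NumPy scalars raise IndexError
        # when indexed, plain Python numbers raise TypeError.
        return value


# Stand-ins for the two result shapes; these are not real SciPy objects.
array_style = SimpleNamespace(mode=[1], count=[3])
scalar_style = SimpleNamespace(mode=1, count=3)

print(first_mode(array_style))   # 1
print(first_mode(scalar_style))  # 1
```

With SciPy installed, the same helper could wrap the call in `compute_forward_returns` as `first_mode(mode(days_diffs))`. On SciPy 1.9 or later, passing `keepdims=True` as suggested above is the more direct change.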

BCAA50000 commented 7 months ago

> Set `keepdims=True`: `delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0]`

Thanks JiwenZ! I haven't tried your solution yet. I solved it by reinstalling different versions of pandas, NumPy, and Python, though I don't know which of them made the difference. Here is an environment I tested that runs fine: alphalens 0.4.0, numpy 1.24.4, pandas 1.3.4, Python 3.8.18.

ljztrust commented 7 months ago

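Hello, I've run into exactly the same problem. I prepared both the factor and price tables, but get_clean_factor_and_forward_returns always raises IndexError: invalid index to scalar variable. I tried many different tables and they all produce this error. Can it only be solved by downgrading versions?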

BCAA50000 commented 7 months ago

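It seems the author has a different fix, but I solved it by changing versions. That said, matching an environment doesn't always work either. When I first hit this problem I aligned my environment with someone else's, but strangely, even with every version matched, it still wouldn't run. Later I aligned with another person's environment and then it ran. The environment that finally worked for me: alphalens 0.4.0, numpy 1.24.4, pandas 1.3.4, Python 3.8.18.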

shandonguzi commented 7 months ago

pandas 1.4.4 also works

liangcaihua commented 6 months ago

Changing it to `mode([days_diffs]).mode` also works. Honestly, I'm speechless.

AnthonyTremblayy commented 6 months ago

I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:

AssertionError: Length of new_levels (3) must be <= self.nlevels (2)

Any idea how to solve?

liangcaihua commented 6 months ago

> I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:
>
> AssertionError: Length of new_levels (3) must be <= self.nlevels (2)
>
> Any idea how to solve?

Your error is different from mine. Try something else.

schweik6 commented 5 months ago

> I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:
>
> AssertionError: Length of new_levels (3) must be <= self.nlevels (2)
>
> Any idea how to solve?

I made the same change and hit that error too. It seems the pandas version needs to be below 2.1 (or the source code needs fixing as well). With pandas 2.0.2 I got a different error: "TypeError: incompatible index of inserted column with frame index".

In the end I used pandas 1.3.4, which works fine.

AnthonyTremblayy commented 5 months ago

Make sure you’re on alphalens-reloaded and not alphalens (not supported anymore). The former supports the latest version of pandas, but the latter doesn’t.
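Because both projects are, as far as I can tell, imported as `alphalens`, the import alone does not show which one is installed. A minimal sketch for checking, assuming the fork is published under the distribution name `alphalens-reloaded`; the helper names `installed_version` and `environment_report` are made up for this example and are not part of either package:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("alphalens", "alphalens-reloaded",
                                   "pandas", "numpy", "scipy")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Print one line per distribution so the output can be pasted into a report.
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a bug report makes it easier to compare environments than matching versions by hand.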

schweik6 commented 5 months ago

> Make sure you’re on alphalens-reloaded and not alphalens (not supported anymore). The former supports the latest version of pandas, but the latter doesn’t.

You're right, I'm on plain alphalens. I've now found the forked project alphalens-reloaded and will try it in the future. Thanks.