quantopian / pyfolio

Portfolio and risk analytics in Python
https://quantopian.github.io/pyfolio
Apache License 2.0

ValueError: Found input variables with inconsistent numbers of samples: [127, 185] #486

Closed fccoelho closed 6 years ago

fccoelho commented 6 years ago

I am trying to run the various create_*_tear_sheet functions, but they always fail with this error.

It seems to work with the return series downloaded by Pyfolio, as shown in the tutorial.

Here is my code:

pf.create_returns_tear_sheet(erd)

And here is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-60-f5f00f04b0f4> in <module>()
      1 out_of_sample = erd.index[-10]
----> 2 pf.create_returns_tear_sheet(erd)#,  live_start_date=out_of_sample)

/usr/local/lib/python3.6/dist-packages/pyfolio/plotting.py in call_w_context(*args, **kwargs)
     50         if set_context:
     51             with plotting_context(), axes_style():
---> 52                 return func(*args, **kwargs)
     53         else:
     54             return func(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/pyfolio/tears.py in create_returns_tear_sheet(returns, positions, transactions, live_start_date, cone_std, benchmark_rets, bootstrap, return_fig)
    551 
    552     plotting.plot_rolling_fama_french(
--> 553         returns, ax=ax_rolling_risk)
    554 
    555     # Drawdowns

/usr/local/lib/python3.6/dist-packages/pyfolio/plotting.py in plot_rolling_fama_french(returns, factor_returns, rolling_window, legend_loc, ax, **kwargs)
    188         returns,
    189         factor_returns=factor_returns,
--> 190         rolling_window=rolling_window)
    191 
    192     rolling_beta.plot(alpha=0.7, ax=ax, **kwargs)

/usr/local/lib/python3.6/dist-packages/pyfolio/timeseries.py in rolling_fama_french(returns, factor_returns, rolling_window)
    591                         factor_returns.index[rolling_window:]):
    592         coeffs = linear_model.LinearRegression().fit(factor_returns[beg:end],
--> 593                                                      returns[beg:end]).coef_
    594         regression_coeffs = np.append(regression_coeffs, [coeffs], axis=0)
    595 

/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/base.py in fit(self, X, y, sample_weight)
    480         n_jobs_ = self.n_jobs
    481         X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'],
--> 482                          y_numeric=True, multi_output=True)
    483 
    484         if sample_weight is not None and np.atleast_1d(sample_weight).ndim > 1:

/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    581         y = y.astype(np.float64)
    582 
--> 583     check_consistent_length(X, y)
    584 
    585     return X, y

/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
    202     if len(uniques) > 1:
    203         raise ValueError("Found input variables with inconsistent numbers of"
--> 204                          " samples: %r" % [int(l) for l in lengths])
    205 
    206 

ValueError: Found input variables with inconsistent numbers of samples: [127, 185]

Here is a description of my series:

count    721.000000
mean       0.248312
std        6.259295
min       -0.245698
25%       -0.019635
50%        0.003867
75%        0.040906
max      168.075894
Name: 5m_return, dtype: float64

mchen172 commented 6 years ago

Right, I get the same error. It seems that only the validation check for the "fama_french" plot fails. I tried the other plots individually, and they work with the same data.

aster-anto commented 6 years ago

I seem to be getting the same error on the "fama_french" plot. Any pointers to resolve this would be much appreciated. Thanks.

twiecki commented 6 years ago

We should fail gracefully if the fama_french plot doesn't work. A try/except might already be good enough.
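A minimal sketch of what "fail gracefully" could look like, not the change that was eventually merged. The helper name call_or_skip and the two stand-in plot functions are hypothetical; the idea is only that a ValueError from one plot is turned into a warning so the rest of the tear sheet can still be drawn.

```python
import warnings


def call_or_skip(plot_func, *args, **kwargs):
    """Call plot_func; on ValueError, warn and return None instead of raising."""
    try:
        return plot_func(*args, **kwargs)
    except ValueError as exc:
        name = getattr(plot_func, "__name__", repr(plot_func))
        warnings.warn(f"Skipping {name}: {exc}")
        return None


# Hypothetical stand-in for a plot that fails the way the fama_french plot does here.
def failing_plot(returns):
    raise ValueError(
        "Found input variables with inconsistent numbers of samples: [127, 185]"
    )


# Hypothetical stand-in for a plot that works.
def working_plot(returns):
    return len(returns)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    skipped = call_or_skip(failing_plot, [0.01, -0.02])  # warns, returns None
    plotted = call_or_skip(working_plot, [0.01, -0.02])  # runs normally
```

Catching only ValueError keeps unrelated failures (for example a TypeError from a bad argument) visible instead of silently hiding them.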

twiecki commented 6 years ago

PRs welcome!

fzhcary commented 6 years ago

I have the same error.

CooleRnax commented 6 years ago

Found input variables with inconsistent numbers of samples: [127, 183]

I'm facing the same error. Has anybody found a solution?

nemozny commented 6 years ago

+1

I even put the returns and benchmark_rets DataFrames on the same index, removed NaNs, and checked that their shapes are the same, but I always end up with this error. Is there any solution?

nemozny commented 6 years ago

Ok, I found a workaround - actually a feature of Python which is unheard of in C-family languages.

You can replace a function in an external module with arbitrary code. Insane, really!

The error in create_full_tear_sheet (and create_returns_tear_sheet) comes from the plotting.plot_rolling_fama_french function in plotting.py.

So you can simply rebind plotting.plot_rolling_fama_french to something else. For instance:

pyfolio.plotting.plot_rolling_fama_french = pyfolio.plotting.plot_returns

Put this line just above the create_full_tear_sheet call.

This way you disable the faulty fama_french plot.
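A small sketch of why this rebinding works, written so it runs without pyfolio installed. The plotting namespace and the three functions below are hypothetical stand-ins, not pyfolio's real code. The only fact taken from the traceback above is that tears.py calls plotting.plot_rolling_fama_french(...) through the module attribute, so the name is looked up at call time.

```python
import types

# Hypothetical stand-in for the pyfolio.plotting module.
plotting = types.SimpleNamespace()


def plot_rolling_fama_french(returns, ax=None):
    raise ValueError("Found input variables with inconsistent numbers of samples")


def plot_returns(returns, ax=None):
    return "returns plot"


plotting.plot_rolling_fama_french = plot_rolling_fama_french
plotting.plot_returns = plot_returns


def create_returns_tear_sheet(returns):
    # The attribute is resolved on the module object each time this runs,
    # so a rebinding done beforehand is what gets called.
    return plotting.plot_rolling_fama_french(returns, ax=None)


# The workaround: rebind the attribute before building the tear sheet.
plotting.plot_rolling_fama_french = plotting.plot_returns

result = create_returns_tear_sheet([0.01, -0.02])
```

Two caveats follow from this. The replacement has to accept the arguments the caller passes (here returns and ax). And the trick would not help if the calling module had imported the function by name with a from-import, because that module would keep its own reference to the original.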

twiecki commented 6 years ago

@nemozny Or you could update to master where this function has been removed (pip install -U git+https://github.com/quantopian/pyfolio).

twiecki commented 6 years ago

Closed by https://github.com/quantopian/pyfolio/pull/536.

CoCoMilkyWay commented 6 months ago

In pyfolio's rolling_fama_french function (line 550) in timeseries.py, change the code to this so that A and B have the same dimensions:

    A = factor_returns[beg:end]
    B = returns[beg:end]
    idx = A.index.intersection(B.index)
    A = A.loc[idx]
    B = B.loc[idx]
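A self-contained sketch of the same idea on made-up data. It assumes the mismatch comes from the two inputs having different timestamps inside the same window (for example, intraday returns against daily factor returns); the column name SMB, the dates, and the values are illustrative only. It uses .loc for the label-based slices, which include both endpoints.

```python
import numpy as np
import pandas as pd

# Hypothetical daily factor returns: 5 rows, one per day.
factor_idx = pd.date_range("2018-01-01", periods=5, freq=pd.Timedelta(days=1))
factor_returns = pd.DataFrame({"SMB": np.linspace(0.0, 0.04, 5)}, index=factor_idx)

# Hypothetical intraday strategy returns: 10 rows, one every 12 hours.
returns_idx = pd.date_range("2018-01-01", periods=10, freq=pd.Timedelta(hours=12))
returns = pd.Series(np.linspace(0.0, 0.09, 10), index=returns_idx)

# Same window boundaries for both, taken from the factor index.
beg, end = factor_idx[0], factor_idx[2]
A = factor_returns.loc[beg:end]
B = returns.loc[beg:end]
len_before = (len(A), len(B))  # the two slices have different lengths

# Keep only the timestamps present in both slices.
idx = A.index.intersection(B.index)
A = A.loc[idx]
B = B.loc[idx]
len_after = (len(A), len(B))  # now equal, so a regression fit accepts them
```

Note that the intersection silently drops every row of returns whose timestamp is not also in the factor index. If the returns really are intraday, aggregating them to the factor frequency first may be a better fit than discarding most of the data.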