quantopian / alphalens

Performance analysis of predictive (alpha) stock factors
http://quantopian.github.io/alphalens
Apache License 2.0

Integration with pyfolio and Quantopian's new Risk Model #225

Closed luca-s closed 6 years ago

luca-s commented 6 years ago

I have been thinking about the nice "Alpha decomposition" #180 feature. The information it provides is actually a small part of what is already available in pyfolio and Quantopian's new Risk Model. On the one hand, we cannot replicate all the information provided by those two tools; on the other hand, it would be great to have all that analysis without having to build an algorithm and run a backtest, which is something that could be integrated into Alphalens.

So why don't we create a function in Alphalens that builds the input required by pyfolio and Quantopian's new Risk Model? Alphalens already simulates the cumulative returns of a portfolio weighted by factor values, so we only need to format that information in a way that is compatible with the other two tools. That would be a purely theoretical analysis, but except for commissions and slippage the results would be realistic. It would also serve as a benchmark for the algorithm: users can compare the algorithm's results, after setting commission and slippage to 0, with these theoretical results and check whether they implemented the algorithm correctly.

I haven't looked at pyfolio in detail, so I don't know the details of the input, but if @twiecki can help me with those details I can work on this feature, and likewise for Quantopian's new Risk Model (I don't know if that is part of pyfolio or a separate project).

twiecki commented 6 years ago

That's really clever!

Pyfolio just requires returns and positions. I think we should be able to get both of those from alphalens pretty easily.


luca-s commented 6 years ago

@twiecki I just need to double-check a few details about the pyfolio input with you:

From the API documentation:

    returns : pd.Series
        Daily returns of the strategy, noncumulative.
         - Time series with decimal returns.
         - Example:
            2015-07-16    -0.012143
            2015-07-17    0.045350
            2015-07-20    0.030957
            2015-07-21    0.004902

Do the returns need to be computed at the end of the trading day? For example, does the entry "2015-07-16 -0.012143" mean the portfolio had a return of -0.012143 at market close on 2015-07-16?

    positions : pd.DataFrame, optional
        Daily net position values.
         - Time series of dollar amount invested in each position and cash.
         - Days where stocks are not held can be represented by 0 or NaN.
         - Non-working capital is labelled 'cash'
         - Example:
            index         'AAPL'         'MSFT'          cash
            2004-01-09    13939.3800     -14012.9930     711.5585
            2004-01-12    14492.6300     -14624.8700     27.1821
            2004-01-13    -13853.2800    13653.6400      -43.6375 

Do the positions need to be computed at the end of the trading day too? Or beginning of trading day?

twiecki commented 6 years ago

We usually set everything to EOD, yes. But I don't think that assumption is really baked in anywhere.
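For illustration, here is a minimal sketch of the two inputs discussed above, built with plain pandas and stamped at the same end-of-day dates. The numbers are made up and only loosely follow the docstring examples; nothing here calls pyfolio itself.

```python
import pandas as pd

# Hypothetical end-of-day dates shared by both inputs.
dates = pd.to_datetime(["2015-07-16", "2015-07-17", "2015-07-20"])

# Noncumulative daily returns, expressed as decimals.
returns = pd.Series([-0.012143, 0.045350, 0.030957], index=dates, name="returns")

# Dollar value held in each asset on the same dates, plus a 'cash' column
# for non-working capital. The values are illustrative only.
positions = pd.DataFrame(
    {
        "AAPL": [13939.38, 14492.63, -13853.28],
        "MSFT": [-14012.99, -14624.87, 13653.64],
        "cash": [711.56, 27.18, -43.64],
    },
    index=dates,
)

# The two inputs line up row by row because they share one index.
same_index = returns.index.equals(positions.index)

# Cumulative returns can be recovered from the noncumulative series.
cumulative = (1 + returns).cumprod() - 1
```

Keeping one shared index makes the EOD convention explicit: each row of positions describes what was held when that day's return was realized.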

luca-s commented 6 years ago

I submitted PR #227, which generates only the pyfolio returns. I will see how difficult it would be to generate the positions and transactions DataFrames too.

luca-s commented 6 years ago

@twiecki given that Alphalens is able to generate positions information in addition to returns, I was wondering what steps are required to generate a risk exposure analysis using pyfolio.

I believe the functions we have to call are pyfolio.tears.create_risk_tear_sheet and pyfolio.tears.create_perf_attrib_tear_sheet. Is this correct? Do we need both? I am not sure what create_risk_tear_sheet would plot, assuming that create_perf_attrib_tear_sheet produces the same output as the Quantopian Research API backtest.create_perf_attrib_tear_sheet().

Then, assuming we run the code on Quantopian, we can use the Research API get_factor_returns and get_factor_loadings to fetch the risk factors required by pyfolio functions. Is this correct?

Thank you.

twiecki commented 6 years ago

@luca-s Yeah I had the same thoughts. It should be pretty easy to extract a positions dataframe from alphalens and input that into pyfolio. create_risk_tear_sheet will be removed, but create_perf_attrib_tear_sheet would support this well.

And yes, getting the factor returns and loadings would be enough.

luca-s commented 6 years ago

Thank you @twiecki, I am looking forward to seeing this working :)

luca-s commented 6 years ago

@twiecki one question regarding the positions dataframe: what should the dataframe contain if the positions are liquidated before market close (that is, for an intraday factor)?

twiecki commented 6 years ago

@luca-s It doesn't have to be EOD. We actually have a function that detects when the holdings were most prominent and uses that time slice for the positions df.

luca-s commented 6 years ago

Ah ok, so create_perf_attrib_tear_sheet works like the other pyfolio functions. This means we need to pass transactions data too if we want pyfolio to be able to distinguish between different types of strategies; otherwise the analysis would be the same for a portfolio that holds positions overnight and a portfolio that trades intraday.

twiecki commented 6 years ago

Transactions are only required for things like trading times, trading volume, etc. I don't think they add much here.

luca-s commented 6 years ago

If we don't provide transactions, how could pyfolio guess whether the strategy is intraday or not? returns and positions provide only dates, not datetimes (at least this is what I read in the docstrings of the pyfolio API).

twiecki commented 6 years ago

If just positions is provided it will assume that it's timestamped at the "right" time of day. It only tries to infer if transactions are provided. Thus, if we pass in the right positions we don't need to worry about it.

mmargenot commented 6 years ago

For positions, I think it would be useful to be able to generate both equal-weighted and factor-weighted values. Both should not be too bad, as perf_attrib can take percentages.
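As a rough sketch of the two weighting schemes, assuming factor values arrive as a Series indexed by (date, asset): the helper name `to_weights` and the sample values are hypothetical, and this is not necessarily how Alphalens computes its own weights. Factor weighting demeans each day's values; equal weighting keeps only the sign relative to the day's median. Both are scaled so that the absolute weights sum to 1 per day.

```python
import numpy as np
import pandas as pd

# Hypothetical factor values indexed by (date, asset).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-11-20", "2017-11-21"]), ["AAPL", "MSFT", "XOM"]],
    names=["date", "asset"],
)
factor = pd.Series([1.0, 2.0, 6.0, -1.0, 0.0, 4.0], index=index, name="factor")


def to_weights(day, equal_weight=False):
    """Turn one day's factor values into long/short weights (gross exposure 1)."""
    if equal_weight:
        # Keep only the side of the book: +1 above the median, -1 below it.
        centered = np.sign(day - day.median())
    else:
        # Weight proportional to the distance from the cross-sectional mean.
        centered = day - day.mean()
    # Note: a day where every value is identical would divide by zero here.
    return centered / centered.abs().sum()


factor_weighted = factor.groupby(level="date").transform(to_weights)
equal_weighted = factor.groupby(level="date").transform(to_weights, equal_weight=True)

# One row per date, one column per asset: percentages rather than dollar amounts.
weights_table = factor_weighted.unstack("asset")
```

Multiplying either table by a capital amount would give dollar positions, if dollar values are preferred over percentages.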

luca-s commented 6 years ago

I was thinking about risk factors returns and loadings outside Quantopian:

luca-s commented 6 years ago

@twiecki to compute transactions data we need price information, but if the prices are split/merge/dividend-adjusted the resulting transactions will not be correct, since they contain both price and share-amount information. I don't believe it would be easy for users to provide unadjusted prices, so I was wondering whether it makes sense to compute transactions at all. My fear is that most users will provide the wrong prices, and I don't know what implications that would have for pyfolio results.
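To make the concern concrete, here is a small worked example with made-up numbers and a hypothetical 2-for-1 split: the dollar value of the position is unaffected, but the share count implied by a split-adjusted price is off by the split ratio.

```python
# Hypothetical position held on a day before a 2-for-1 split.
position_value = 10_000.0   # dollars invested in the stock
traded_price = 100.0        # price actually quoted on that day
split_ratio = 2.0           # later 2-for-1 split

# Back-adjusted price that a split-adjusted history reports for that day.
adjusted_price = traded_price / split_ratio

# Shares actually traded versus shares implied by the adjusted price.
true_shares = position_value / traded_price
implied_shares = position_value / adjusted_price
```

Here `true_shares` is 100 while `implied_shares` is 200, so any transaction record built from adjusted prices would misstate both price and amount, even though their product is still correct.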

twiecki commented 6 years ago

Yeah, I don't think we need transactions; there aren't that many pyfolio analyses centered on transactions in any case. Some proxy for turnover would be interesting, though. Can one specify the trading frequency at all?
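One possible turnover proxy that needs neither prices nor transactions is sketched below, assuming portfolio weights are available as a date-by-asset table. The weights are invented for illustration, and the measure ignores weight drift caused by price moves between rebalances.

```python
import pandas as pd

# Hypothetical target weights, one row per rebalance date.
dates = pd.to_datetime(["2017-11-20", "2017-11-21", "2017-11-22"])
weights = pd.DataFrame(
    {
        "AAPL": [0.5, 0.5, -0.5],
        "MSFT": [-0.5, -0.25, 0.5],
        "XOM": [0.0, -0.25, 0.0],
    },
    index=dates,
)

# One-way turnover: half the sum of absolute weight changes at each rebalance.
turnover = weights.diff().abs().sum(axis=1) / 2

# The first date has no previous weights to compare against, so drop it.
turnover = turnover.iloc[1:]
```

A signal traded less often would show fewer rebalance rows and therefore lower cumulative turnover over the same window.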

luca-s commented 6 years ago

Can one specify the trading frequency at all?

You mean in pyfolio?

twiecki commented 6 years ago

No, when creating the alphalens output. For example, I could have a 5-day signal that I only trade on every Monday, rather than trading it every day, which would incur high transaction costs.

luca-s commented 6 years ago

You pass Alphalens the data you want to analyze and only that. In your example, you have to pass Alphalens the signal data corresponding to Mondays.

EDIT: By the way, this is how the event study works. The signal is passed only for those days when... there is signal :)

twiecki commented 6 years ago

I wonder if we could add a nice API for that, like trade_schedule=pd.offsets.Weekday(1) or something along those lines. Under the hood we would just call .resample().

twiecki commented 6 years ago

But fair enough, a user can easily do that themselves.

luca-s commented 6 years ago

I wonder if we could add a nice API for that, like trade_schedule=pd.offsets.Weekday(1) or something along those lines. Under the hood we would just call .resample().

I am not sure. I believe that would result in a wrapper around what pandas already offers. I am not completely against it, but I would avoid adding too many options for very little benefit.

twiecki commented 6 years ago

Yeah, maybe we just add an example of how to do that to the docs.

twiecki commented 6 years ago

But wait, the returns would look very different. And you would need to forward-fill the positions.

luca-s commented 6 years ago

But wait, the returns would look very different. And you would need to forward-fill the positions.

I am not sure I follow your point. Could you please provide some more details?

twiecki commented 6 years ago

I think I thought about this wrong. You mean to subsample the factor data you pass into alphalens, not the output you would pass into pyfolio, right? (I was thinking the latter but the former makes more sense)

luca-s commented 6 years ago

ah ok I got it, but yes, I meant the former.
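A minimal sketch of that subsampling, assuming the factor is a Series indexed by (date, asset); the sample data is made up. Only the rows whose date falls on a Monday are kept before anything is handed to Alphalens.

```python
import pandas as pd

# Hypothetical daily factor over ten business days; 2017-11-20 is a Monday.
dates = pd.date_range("2017-11-20", periods=10, freq="B")
index = pd.MultiIndex.from_product([dates, ["AAPL", "MSFT"]], names=["date", "asset"])
factor = pd.Series(range(len(index)), index=index, name="factor", dtype=float)

# dayofweek is 0 for Monday, so this mask selects Monday rows only.
is_monday = factor.index.get_level_values("date").dayofweek == 0
monday_factor = factor[is_monday]

# Dates that survive the filter.
kept_dates = monday_factor.index.get_level_values("date").unique()
```

The prices passed alongside the factor would stay daily; only the signal is thinned, so a 5-day holding period then spans Monday to Monday.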

twiecki commented 6 years ago

OK, that should work well then. Can you add it to the NB perhaps?

luca-s commented 6 years ago

Sure, but what kind of alternative frequency would you like to see? The same factor traded only on Monday, as your initial suggestion?

twiecki commented 6 years ago

yeah

luca-s commented 6 years ago

#250 was merged, so we can close this.