quantopian / alphalens

Performance analysis of predictive (alpha) stock factors
http://quantopian.github.io/alphalens
Apache License 2.0

Add option for "zero aware" quantiles/bins #281

luca-s closed 6 years ago

luca-s commented 6 years ago

Sometimes a user might want to analyze a factor where positive and negative values are treated separately. Since the number of positive values might differ from the number of negative ones (the factor is biased), the quantiles option wouldn't do a good job of separating positive from negative values: a quantile boundary can land anywhere, so a single quantile may mix both signs. Currently the only way to separate positive from negative values is to use the bins option and provide custom bin ranges. Is this enough, or do we want to improve it?
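To make the problem concrete, here is a minimal sketch in plain Python rather than a call into alphalens. The helpers `quantile_buckets` and `bin_buckets` and the sample `factor` values are hypothetical, made up for illustration. The quantile helper assigns equal-count buckets by rank, which is a simplification of what a value-based quantile cut does when there are ties.

```python
from bisect import bisect_left


def quantile_buckets(values, q):
    """Assign each value to one of q equal-count buckets (1..q) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, i in enumerate(order):
        buckets[i] = rank * q // len(values) + 1
    return buckets


def bin_buckets(values, edges):
    """Assign each value to a right-closed bin (edges[k-1], edges[k]]."""
    return [bisect_left(edges, v) for v in values]


# Hypothetical biased factor: 2 negative values and 8 positive ones.
factor = [-0.5, -0.1, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

# Two plain quantiles: the lower bucket holds both negatives AND positives.
by_quantile = quantile_buckets(factor, 2)
mixed = [v for v, b in zip(factor, by_quantile) if b == 1 and v > 0]

# Workaround: custom bin edges with zero as an explicit boundary.
by_bin = bin_buckets(factor, [-1.0, 0.0, 1.0])

print(by_quantile)  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
print(mixed)        # [0.1, 0.2, 0.3]
print(by_bin)       # [1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```

With custom bins the signs are cleanly separated, but the user has to know the value range in advance and the buckets no longer have equal counts.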

twiecki commented 6 years ago

I think an option to use the quantile method separately for positive and negative values would be a useful feature.

luca-s commented 6 years ago

Do you have a suggestion for option name and description? :)

twiecki commented 6 years ago

split_pos_neg?

twiecki commented 6 years ago

"Compute quantile boundaries separately for positive and negative signal values. This is useful if your signal is centered and zero is the boundary between short and long signals."
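A rough sketch of what that description could mean in practice, written in plain Python and not taken from the alphalens code base. The function names are hypothetical, and two choices here are assumptions the thread does not settle: zero is grouped with the positive side, and the requested number of quantiles must be even so it can be split evenly between the two signs.

```python
def quantile_buckets(values, q):
    """Assign each value to one of q equal-count buckets (1..q) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, i in enumerate(order):
        buckets[i] = rank * q // len(values) + 1
    return buckets


def zero_aware_quantile_buckets(values, q):
    """Quantize negative and non-negative values separately.

    Negatives get buckets 1..q/2, non-negatives get q/2+1..q.
    Assumption: zero counts as non-negative; q must be even.
    """
    if q % 2:
        raise ValueError("q must be even to split between the two signs")
    half = q // 2
    neg_idx = [i for i, v in enumerate(values) if v < 0]
    pos_idx = [i for i, v in enumerate(values) if v >= 0]

    buckets = [0] * len(values)
    neg = quantile_buckets([values[i] for i in neg_idx], half)
    pos = quantile_buckets([values[i] for i in pos_idx], half)
    for i, b in zip(neg_idx, neg):
        buckets[i] = b
    for i, b in zip(pos_idx, pos):
        buckets[i] = b + half  # shift into the upper half of the labels
    return buckets


# Hypothetical biased factor: 2 negative values and 8 positive ones.
factor = [-0.5, -0.1, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
result = zero_aware_quantile_buckets(factor, 4)
print(result)  # [1, 2, 3, 3, 3, 3, 4, 4, 4, 4]
```

No bucket mixes signs, although the buckets on the two sides no longer hold the same number of values when the factor is biased.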

twiecki commented 6 years ago

@luca-s Did you plan to add that in?

luca-s commented 6 years ago

I am not planning on working on this soon. The change is trivial, but there are API changes and unit test updates involved, which take time. Anybody who feels like working on this is welcome.

eigenfoo commented 6 years ago

Hi! I can take this over. Perhaps by_sign would be a better name? It looks like we already have a by_group parameter, which does something similar: it computes quantile buckets separately for each group. We'd like to compute quantile buckets separately for positive and negative values, so it seems logical to have a similar name.

@twiecki thoughts?

twiecki commented 6 years ago

Hmm, I like the idea, I'm just not sure that logic would be apparent to users. zero_aware feels a bit more intuitive to me. Thoughts, @luca-s?

eigenfoo commented 6 years ago

Ah, actually, reading the code clarified this for me a lot. I take back what I said: I didn't realize that zero_aware was only to be used for quantiles, not bins. I think it's a good name. We just need to document that the parameter will be ignored if the user is using bins.

luca-s commented 6 years ago

zero_aware seems good to me too

luca-s commented 6 years ago

Closed by #306