CamDavidsonPilon / lifetimes

Lifetime value in Python
MIT License

My clv is coming wrong. It comes negative for many customers. Can you please suggest. #146

Closed jasminesethi11 closed 5 years ago

jasminesethi11 commented 6 years ago
from lifetimes import GammaGammaFitter, ParetoNBDFitter

ggf = GammaGammaFitter(penalizer_coef=0)
ggf.fit(df['x'], df['Margin'])
df['clv'] = 0.0
results = {}
results_pnbd_p = {}
for i in prodname:
    # Fit separate models for each product line; copy to avoid writing into a view of df
    temp = df[df["PROD_MODEL"] == i].copy()
    ggf = GammaGammaFitter(penalizer_coef=0)
    ggf.fit(temp['x'], temp['Margin'])
    pnbd = ParetoNBDFitter()
    mod = pnbd.fit(temp['x'], temp['t_x'], temp['T'])
    results[i] = mod.params_.values()

    try:
        temp['clv'] = ggf.customer_lifetime_value(
            mod,  # the model to use to predict the number of future transactions
            temp['x'], temp['t_x'], temp['T'], temp['Margin'],
            time=12,  # months
            discount_rate=0.1)
    except Exception:
        # Skip product lines where the CLV computation fails
        continue
    print(i)
    print(results)
    # Write this product line's CLV back to the full frame (df.ix is deprecated)
    df.loc[temp.index, 'clv'] = temp['clv']

CamDavidsonPilon commented 6 years ago

Are you able to provide the dataset as well?

stochastic1 commented 6 years ago

Is there anything in common for the customers evaluated with a negative CLV? I've run into problems when Tenure = Recency on the BGF model.

jasminesethi11 commented 6 years ago

@CamDavidsonPilon

I was trying it on this dataset: clv_error.xlsx

I fit parameters for each product line (PL) and then use them to compute CLV for the customers who bought that PL.

jasminesethi11 commented 6 years ago

@CamDavidsonPilon any suggestions?

stochastic1 commented 6 years ago

@jasminesethi11, your data is very irregular. What are you using for your time periods?

stochastic1 commented 6 years ago

@jasminesethi11, the GammaGammaFitter and other parts of the codebase have since been updated. Can you try again with the updated codebase?

Trollgeir commented 5 years ago

I'm encountering the same problem, although for me it's for all customers with frequency = 0.

It occurs when the GammaGammaFitter's fitted parameter q < 1, causing the population_mean to be negative.

<lifetimes.GammaGammaFitter: fitted with 86725 subjects, p: 15.20, q: 0.58, v: 17.24>

population_mean = v * p / (q - 1)

-631.1499946276026
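
For reference, the arithmetic behind that negative value can be reproduced by hand. The sketch below plugs the rounded parameters from the fitter summary into the same formula. `population_mean` here is a stand-alone helper written for illustration, not the library's function, and because the printed parameters are rounded to two decimals the result only approximates the figure above.

```python
def population_mean(p, q, v):
    """Unconditional mean spend implied by Gamma-Gamma parameters p, q, v."""
    return v * p / (q - 1)


# Rounded parameters copied from the fitter summary above.
p, q, v = 15.20, 0.58, 17.24

mean_spend = population_mean(p, q, v)
print(mean_spend)  # negative, because q - 1 < 0 while v * p > 0

# With q above 1 (an arbitrary illustrative value) the same formula is positive.
print(population_mean(p, 1.58, v))
```

The sign flips purely because of the denominator, which is why every customer whose estimate falls back on the population mean is affected in the same way.
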
vruvora commented 5 years ago

I am having a similar issue. What's the intuition behind q < 1 breaking the Gamma-Gamma model? Have people figured out a fix?

CamDavidsonPilon commented 5 years ago

Though I don't know a fix right now, I'll make some changes to be added to #220 v0.10.0

CamDavidsonPilon commented 5 years ago

@vruvora are you able to supply a dataset I can test against?

CamDavidsonPilon commented 5 years ago

So looking into the derivation, technically the inverse gamma has no mean for q less than 1 - so I don't really know how to interpret this yet.
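
The step behind that remark can be written out. This assumes the standard Gamma-Gamma parametrisation, in which a transaction value $Z$ has conditional mean $p/\nu$ and $\nu \sim \text{Gamma}(q, \gamma)$ with shape $q$ and rate $\gamma$. Treating the fitter's `v` as $\gamma$ is an assumption based on the `population_mean` formula quoted earlier in the thread.

```latex
\begin{aligned}
E[Z] &= p\,E\!\left[\frac{1}{\nu}\right]
      = p \int_0^\infty \frac{1}{\nu}\,
        \frac{\gamma^{q}\,\nu^{q-1} e^{-\gamma\nu}}{\Gamma(q)}\,d\nu \\
     &= p\,\frac{\gamma^{q}}{\Gamma(q)}\cdot\frac{\Gamma(q-1)}{\gamma^{q-1}}
      = \frac{p\,\gamma}{q-1}, \qquad q > 1 .
\end{aligned}
```

For $q \le 1$ the integrand behaves like $\nu^{q-2}$ near zero and the integral diverges, so $p\gamma/(q-1)$ still evaluates to a number but is no longer a mean.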

vruvora commented 5 years ago

@CamDavidsonPilon Unfortunately not. But since scipy.optimize.minimize accepts constraint arguments, would it make sense to add a constraint for q > 1?

CamDavidsonPilon commented 5 years ago

Maybe? That means inference elsewhere would suffer. That may be the solution I go with though. I'll do some testing.

vruvora commented 5 years ago

@CamDavidsonPilon The way the scipy API is designed, the constraints and bounds used by the constraint-based or bound-based algorithms are arguments of minimize itself (with defaults), so when you pass them as kwargs, which get fed into options, you get the following error. Is there a way to solve this without changing the scipy source code? Note: the second image shows me hacking the scipy source code, which is not recommended :\

image

image

CamDavidsonPilon commented 5 years ago

Yea, we can make the negative log-likelihood some incredibly large number when q < 1. We already do something similar: https://github.com/CamDavidsonPilon/lifetimes/blob/master/lifetimes/fitters/gamma_gamma_fitter.py#L55

vruvora commented 5 years ago

Interesting. So there shouldn't be a reason to have this issue? Or should we not be hand-coding constraints to fix this? For my problem, adding this constraint fixes it.

CamDavidsonPilon commented 5 years ago

> We already do something similar: https://github.com/CamDavidsonPilon/lifetimes/blob/master/lifetimes/fitters/gamma_gamma_fitter.py#L55

I mean, we already check for non-positive values, not for values less than 1. We would need to add a check for q < 1 (assuming we think it's a good idea).
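
A minimal sketch of the penalty idea described here, not the library's actual code: the objective is wrapped so that infeasible parameters return a very large value. `toy_negative_log_likelihood`, `penalized`, and `LARGE_PENALTY` are hypothetical names, the toy objective stands in for the real Gamma-Gamma likelihood, and the `q_constraint` flag simply mirrors the name that appears later in this thread.

```python
LARGE_PENALTY = 1e18  # hypothetical value; the library may use a different one


def toy_negative_log_likelihood(params):
    """Stand-in objective whose unconstrained minimum sits at q = 0.5."""
    p, q, v = params
    return (p - 2.0) ** 2 + (q - 0.5) ** 2 + (v - 3.0) ** 2


def penalized(nll, q_constraint=True):
    """Wrap an objective so infeasible parameters look extremely costly."""
    def wrapped(params):
        p, q, v = params
        # Existing kind of check mentioned above: reject non-positive values.
        if min(p, q, v) <= 0:
            return LARGE_PENALTY
        # Proposed extra check: also reject q < 1 when the flag is set.
        if q_constraint and q < 1:
            return LARGE_PENALTY
        return nll(params)
    return wrapped


objective = penalized(toy_negative_log_likelihood)
print(objective((2.0, 0.5, 3.0)))  # penalised: q < 1
print(objective((2.0, 1.2, 3.0)))  # feasible: returns the toy value
```

One consequence of this design is that the objective has a cliff at q = 1, so an optimizer may stop very close to, or exactly on, that boundary.
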

vruvora commented 5 years ago

Do you think there is a more elegant way to design the API so that you can just hardcode the constraints as kwargs?

CamDavidsonPilon commented 5 years ago

Do you want to sometimes allow q < 1 and other times not, setting that flag in fit?

vruvora commented 5 years ago

Actually, looking more into it, I think giving hard constraints might be a bit harder because you have to fiddle with the scipy source code, so I like the idea of blowing up the negative log-likelihood if q < 1. I just updated and it gives sensible results.

CamDavidsonPilon commented 5 years ago

Feel free to PR to branch v0.10.0 if you want it in the latest release - otherwise I'll add something later this week.

vruvora commented 5 years ago

@CamDavidsonPilon Apologies for the delay. Here it is: https://github.com/CamDavidsonPilon/lifetimes/pull/236

eromoe commented 2 years ago

After adding q_constraint=True, I got:

gamma_gamma_fitter.py:156: RuntimeWarning: divide by zero encountered in double_scalars
  population_mean = v * p / (q - 1)
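
That warning is what NumPy emits when a float scalar is divided by zero, which in this formula means q was exactly 1 when `population_mean` was computed. A minimal sketch of a defensive check follows. `safe_population_mean` is a hypothetical helper, not part of lifetimes, and returning NaN is only one possible choice.

```python
import math


def safe_population_mean(p, q, v):
    """Return v * p / (q - 1) only where the mean exists (q > 1), else NaN."""
    if q <= 1:
        # Avoid dividing by zero (q == 1) or returning a negative
        # "mean" (q < 1); both cases have no finite mean.
        return math.nan
    return v * p / (q - 1)


print(safe_population_mean(15.20, 1.0, 17.24))  # nan rather than a division by zero
print(safe_population_mean(15.20, 2.0, 17.24))  # finite and positive
```

Plain Python floats would raise ZeroDivisionError at q = 1 where NumPy scalars only warn, so an explicit check like this makes the boundary case visible in either setting.
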