Closed timmy-ops closed 2 years ago
pls show a minimal copy pastable and reproducible example w/o any external dependencies
Hi jreback,
Yes I am sorry and I tried to produce one, but the problem is the whole model cannot work without this bigger dataset.
It will be difficult to determine whether there is a true bug here without a more minimal example: https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
Hey @timmy-ops,
This is not an issue with pandas, but rather with the lifetimes library. Please repost this issue in the lifetimes repository.
The scipy.special.hyp2f1 function in the final line of your error trace is a lifetimes dependency that expects to receive numpy arrays as inputs. When using any of the lifetimes modeling methods, it is important to always use the df['COL_NAME'].values syntax for all of the arguments; otherwise hyp2f1 will receive slices of a pandas DataFrame and produce the unstable behavior you are seeing.
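To illustrate the df['COL_NAME'].values advice, here is a minimal sketch. The toy DataFrame, its column names, and the argument values passed to hyp2f1 are assumptions made up for this example; they are not taken from the original report or from the lifetimes source.

```python
import numpy as np
import pandas as pd
from scipy.special import hyp2f1

# Hypothetical toy data; the column names are illustrative only.
summary = pd.DataFrame(
    {"frequency": [1.0, 2.0, 3.0], "recency": [0.5, 1.5, 2.5]}
)

# Extract plain numpy arrays so no pandas object reaches the ufunc.
freq = summary["frequency"].values
rec = summary["recency"].values

# Keep z strictly inside (0, 1) so the hypergeometric series converges.
z = 0.5 / (1.0 + rec)

# With ndarray inputs, hyp2f1 broadcasts elementwise and returns an ndarray.
result = hyp2f1(freq, 1.0, freq + 1.0, z)
print(type(result), result.shape)
```

On recent pandas versions, summary["frequency"].to_numpy() is the documented equivalent of .values and should work the same way here.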
Unfortunately, in the case of the lifetimes.GammaGammaFitter.customer_lifetime_value method, pandas slices are being used in the internal operations. It's an easy fix, but the lifetimes project is no longer being actively maintained. Some other contributors and I are planning a Zoom meeting in a few weeks to discuss taking over development of this library. If you wish to contribute, please let us know in this issue link:
@timmy-ops did you find any solution for this? [error] Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series input
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
reproducible example
NotImplementedError                       Traceback (most recent call last)
 in ()
     58     summary['monetary_value'],
     59     time=time_months,
---> 60     discount_rate=discount_rate)
     61
     62

6 frames
/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/gamma_gamma_fitter.py in customer_lifetime_value(self, transaction_prediction_model, frequency, recency, T, monetary_value, time, discount_rate, freq)
    294
    295         return _customer_lifetime_value(
--> 296             transaction_prediction_model, frequency, recency, T, adjusted_monetary_value, time, discount_rate, freq=freq
    297         )

/usr/local/lib/python3.7/dist-packages/lifetimes/utils.py in _customer_lifetime_value(transaction_prediction_model, frequency, recency, T, monetary_value, time, discount_rate, freq)
    496         # since the prediction of number of transactions is cumulative, we have to subtract off the previous periods
    497         expected_number_of_transactions = transaction_prediction_model.predict(
--> 498             i, frequency, recency, T
    499         ) - transaction_prediction_model.predict(i - factor, frequency, recency, T)
    500         # sum up the CLV estimates of all of the periods and apply discounted cash flow

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in conditional_expected_number_of_purchases_up_to_time(self, t, frequency, recency, T)
    277         r, alpha, s, beta = params
    278
--> 279         likelihood = self._conditional_log_likelihood(params, x, t_x, T)
    280         first_term = (
    281             gammaln(r + x) - gammaln(r) + r * log(alpha) + s * log(beta) - (r + x) * log(alpha + T) - s * log(beta + T)

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in _conditional_log_likelihood(params, freq, rec, T)
    212
    213         A_1 = gammaln(r + x) - gammaln(r) + r * log(alpha) + s * log(beta)
--> 214         log_A_0 = ParetoNBDFitter._log_A_0(params, x, rec, T)
    215
    216         A_2 = logaddexp(-(r + x) * log(alpha + T) - s * log(beta + T), log(s) + log_A_0 - log(r_s_x))

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in _log_A_0(params, freq, recency, age)
    179
    180         rsf = r + s + freq
--> 181         p_1 = hyp2f1(rsf, t, rsf + 1.0, abs_alpha_beta / (max_of_alpha_beta + recency))
    182         q_1 = max_of_alpha_beta + recency
    183         p_2 = hyp2f1(rsf, t, rsf + 1.0, abs_alpha_beta / (max_of_alpha_beta + age))

/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
   2030         self, ufunc: np.ufunc, method: str, *inputs: Any, **kwargs: Any
   2031     ):
-> 2032         return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
   2033
   2034     # ideally we would define this to avoid the getattr checks, but

/usr/local/lib/python3.7/dist-packages/pandas/core/arraylike.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
    292             raise NotImplementedError(
    293                 "Cannot apply ufunc {} to mixed DataFrame and Series "
--> 294                 "inputs.".format(ufunc)
    295             )
    296     axes = self.axes

NotImplementedError: Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series inputs.
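For reference, the final error in the traceback can probably be reproduced without lifetimes or the larger dataset. The sketch below is an assumption about the triggering condition (one DataFrame and one Series passed to the same hyp2f1 call); the toy values are invented and this is not the reporter's code.

```python
import pandas as pd
from scipy.special import hyp2f1

# Hypothetical inputs: one DataFrame and one Series in the same ufunc call.
df = pd.DataFrame({"a": [1.0, 2.0]})
s = pd.Series([0.1, 0.2])

try:
    hyp2f1(df, 1.0, 2.0, s)
    message = None
except NotImplementedError as exc:
    # pandas refuses to align mixed DataFrame and Series ufunc inputs.
    message = str(exc)

print(message)

# Passing plain numpy arrays instead avoids the pandas ufunc dispatch.
ok = hyp2f1(df["a"].to_numpy(), 1.0, 2.0, s.to_numpy())
```

If this reproduces the message on your pandas version, it suggests that somewhere in the call chain one argument arrives as a DataFrame while another arrives as a Series.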