Closed: animenon closed this issue 10 months ago
Error I see:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/indexes/base.py:3653, in Index.get_loc(self, key)
3652 try:
-> 3653 return self._engine.get_loc(casted_key)
3654 except KeyError as err:
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/_libs/index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/_libs/index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'customer_id'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
File ~/.pyenv/versions/anaconda3-2023.09-0/envs/marketing_env/lib/python3.11/site-packages/pymc_marketing/clv/models/beta_geo.py:116, in BetaGeoModel.__init__(self, data, model_config, sampler_config)
115 try:
--> 116 self.customer_id = data["customer_id"]
117 except KeyError:
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/frame.py:3761, in DataFrame.__getitem__(self, key)
3760 return self._getitem_multilevel(key)
-> 3761 indexer = self.columns.get_loc(key)
3762 if is_integer(indexer):
File ~/.pyenv/versions/anaconda3-2023.09-0/lib/python3.11/site-packages/pandas/core/indexes/base.py:3655, in Index.get_loc(self, key)
3654 except KeyError as err:
-> 3655 raise KeyError(key) from err
3656 except TypeError:
3657 # If we have a listlike key, _check_indexing_error will raise
3658 # InvalidIndexError. Otherwise we fall through and re-raise
3659 # the TypeError.
KeyError: 'customer_id'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
Cell In[6], line 1
----> 1 beta_geo_model = clv.BetaGeoModel(data = data)
File ~/.pyenv/versions/anaconda3-2023.09-0/envs/marketing_env/lib/python3.11/site-packages/pymc_marketing/clv/models/beta_geo.py:118, in BetaGeoModel.__init__(self, data, model_config, sampler_config)
116 self.customer_id = data["customer_id"]
117 except KeyError:
--> 118 raise KeyError("customer_id column is missing from data")
119 try:
120 self.frequency = data["frequency"]
KeyError: 'customer_id column is missing from data'
Error in short: KeyError: 'customer_id column is missing from data'
It seems this dataset doesn't have the "customer_id" column, which is required by the Beta Geo model.
Copying the index into a customer_id column should fix the issue, since the model only needs a unique identifier per customer:
data['customer_id'] = data.index
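For anyone hitting the same error, here is a minimal sketch of that workaround in isolation. The toy DataFrame and the helper name ensure_customer_id are made up for illustration (neither comes from pymc_marketing or the quickstart), and it assumes only that pandas is installed.

```python
import pandas as pd


def ensure_customer_id(data: pd.DataFrame) -> pd.DataFrame:
    """Return `data` with a `customer_id` column, built from the index if missing.

    Hypothetical helper for illustration; not part of pymc_marketing.
    """
    if "customer_id" in data.columns:
        # Nothing to do: the column the model asks for is already there.
        return data
    out = data.copy()  # avoid mutating the caller's frame
    out["customer_id"] = out.index  # the index serves as a unique identifier
    return out


# Toy frame standing in for the quickstart dataset (values are made up).
data = pd.DataFrame({"frequency": [2, 0, 5]})
data = ensure_customer_id(data)
```

With the column in place, clv.BetaGeoModel(data=data) should get past the customer_id check. The traceback shows the constructor looking up "frequency" next, so the other columns it expects still need to be present.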
Do you want to open a pull request? :)
Will submit a PR.
Closed via #440
CLV Quickstart example fails at the function call:
beta_geo_model = clv.BetaGeoModel(data = data)
Not sure what I am missing here. I am on a Mac M1 and using conda to run the code from IPython.
On a side note, why doesn't the package have a pip-installable version? I am not a conda user, so I had to install conda just to check out the package.