Attempting to process some data, I passed my column names into the ZRP constructor, but then got this error:
####################################
Processing rows: 0:25000
####################################
Data is loaded
[Start] Validating input data
Traceback (most recent call last):
[[ boring stack frames from my app elided for brevity ]]
File "/home/egnor/source/rcv/ea_race/ea_zrp.py", line 58, in main
output_df = predictor.transform(ea_df)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/egnor/source/rcv/python_venv/lib/python3.11/site-packages/zrp/zrp.py", line 172, in transform
prepared_data_chunk = z_prepare.transform(data_chunk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/egnor/source/rcv/python_venv/lib/python3.11/site-packages/zrp/prepare/prepare.py", line 84, in transform
gen_process.fit(data)
File "/home/egnor/source/rcv/python_venv/lib/python3.11/site-packages/zrp/prepare/preprocessing.py", line 392, in fit
raise ValueError(f" Missing required data {val_na}")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Missing required data ['AddressLine1', 'First', 'Last', 'Mid', 'State/Province', 'Zip4', 'house_number']
Most of those columns DID exist in the original dataframe, but they were renamed by rename_data_columns. However, the ZRP class saves the original kwargs in self.params_dict. After the rename, that params dict USED to be cleared, but that call was commented out because not all params are column definitions. As a result, validation appears to still look for my original column names, which no longer exist after the rename.
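To illustrate the failure mode I suspect, here is a minimal toy sketch. It does not use ZRP at all: the "dataframe" is a plain dict of columns, and every name in it (params, rename_columns, missing_columns, process, use_stale_names, and the first_name/last_name/zip_code keys) is hypothetical and chosen only for illustration, not taken from ZRP's code.

```python
# Toy model only: a "dataframe" is a dict of column name -> list of values.
# All names here are hypothetical and are NOT ZRP's real API.

# Maps an internal column name to the caller's column name,
# standing in for the kwargs saved on the predictor.
params = {"first_name": "First", "last_name": "Last", "zip_code": "Zip4"}

table = {"First": ["Ada"], "Last": ["Lovelace"], "Zip4": ["1234"]}


def rename_columns(table, params):
    """Rename the caller's columns to the internal names."""
    user_to_internal = {user: internal for internal, user in params.items()}
    return {user_to_internal.get(name, name): values
            for name, values in table.items()}


def missing_columns(table, required):
    """Return the required column names that are absent from the table."""
    return sorted(name for name in required if name not in table)


def process(table, params, use_stale_names):
    renamed = rename_columns(table, params)
    # With stale names, validation looks for the caller's original column
    # names, which the rename has just replaced with the internal ones.
    required = params.values() if use_stale_names else params.keys()
    missing = missing_columns(renamed, required)
    if missing:
        raise ValueError(f" Missing required data {missing}")
    return renamed
```

In this sketch, process(table, params, use_stale_names=False) succeeds, while process(table, params, use_stale_names=True) raises a ValueError listing the original column names, even though those columns were present in the input before the rename.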
Steps To Reproduce
Use ZRP with nonstandard column names, passing those names into the ZRP() constructor, then attempt to make predictions.
What browsers are you seeing the problem on?
No response
Environment
- OS: Linux (Ubuntu 23.10)
- Python: 3.11.8
- ZRP: git main as of this bug report
Anything else?
No response
Code of Conduct
[X] I've read the Code of Conduct and understand my responsibilities as a member of the Virtual Coffee community