vanderschaarlab / temporai

TemporAI: ML-centric Toolkit for Medical Time Series
https://www.temporai.vanderschaar-lab.com/
Apache License 2.0

[Bug] <static data imputation issue> #113

Open EvanWu19 opened 9 months ago

EvanWu19 commented 9 months ago

Describe the bug: I created a TemporalPredictionDataset according to the tutorial. However, when I try to impute the static data, an error is always reported. I tried different static_imputer values ("mean", "MissForest", and "HyperImpute"), but they all gave me the same error message. I followed your imputation tutorial, with the following code:

from tempor import plugin_loader

dataset = my_datasource(with_missing=True, random_state=42).load()
print(dataset)

model = plugin_loader.get("preprocessing.imputation.static.static_tabular_imputer", static_imputer="mean")
print(model)

# Note missingness in static data.

print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore

dataset.static

# Note no more missingness in static data.

dataset = model.fit_transform(dataset)  # Or call fit() then transform().

print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore

dataset.static
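Before calling the imputer, it can help to look at per-column dtypes and missing counts, since the total missing count alone hides which columns are categorical. The snippet below is a minimal sketch in plain pandas. `summarize_static` is a hypothetical helper (not part of TemporAI), and `static_df` with its column names is an illustrative stand-in for `dataset.static.dataframe()`.

```python
import numpy as np
import pandas as pd


def summarize_static(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype and number of missing values."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_missing": df.isnull().sum(),
        }
    )


# Hypothetical stand-in for dataset.static.dataframe(); column names are made up.
static_df = pd.DataFrame(
    {
        "income": [101563.0, np.nan, 50220.0],
        "diagnosis": pd.Categorical(["x", "y", "x"]),
    }
)

summary = summarize_static(static_df)
print(summary)
```

Columns reported with dtype `category` are the ones worth checking first if the imputer raises a Categorical-related error.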


TypeError                                 Traceback (most recent call last)
Cell In[59], line 3
      1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset)  # Or call fit() then transform().
      5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore
      7 dataset.static

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
     53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
     54 
     55 Args:
   (...)
     61     dataset.BaseDataset: The transformed dataset.
     62 """
     63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
     31 """Transforms the given data.
     32 
     33 Args:
   (...)
     39     Any: The transformed data.
     40 """
     41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
     44 return transformed_data

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
     80 if data.static is not None:
     81     static_data = data.static.dataframe()
---> 82     imputed_static_data = self.imputer.transform(static_data)
     83     imputed_static_data.columns = static_data.columns
     84     imputed_static_data.index = static_data.index

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
     85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86     return self._model.transform(X)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
    127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128     return self.model.fit_transform(X)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
    915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
    916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
    917     # Run imputation
--> 918     X = self._setup(X)
    920     Xt_init = self._initial_imputation(X)
    921     Xt_init.columns = X.columns

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
    700         existing_vals = X[col][X[col].notnull()]
    702         le = LabelEncoder()
--> 703         X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
    704             int
    705         )
    706         self.encoders[col] = le
    708 self.limits = {}

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:885, in _LocationIndexer.__setitem__(self, key, value)
    882 self._has_valid_setitem_indexer(key)
    884 iloc = self if self.name == "iloc" else self.obj.iloc
--> 885 iloc._setitem_with_indexer(indexer, value, self.name)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1893, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1890 # align and set the values
   1891 if take_split_path:
   1892     # We have to operate column-wise
-> 1893     self._setitem_with_indexer_split_path(indexer, value, name)
   1894 else:
   1895     self._setitem_single_block(indexer, value, name)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1937, in _iLocIndexer._setitem_with_indexer_split_path(self, indexer, value, name)
   1933     self._setitem_with_indexer_2d_value(indexer, value)
   1935 elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
   1936     # We are setting multiple rows in a single column.
-> 1937     self._setitem_single_column(ilocs[0], value, pi)
   1939 elif len(ilocs) == 1 and 0 != lplane_indexer != len(value):
   1940     # We are trying to set N values into M entries of a single
   1941     # column, which is invalid for N != M
   1942     # Exclude zero-len for e.g. boolean masking that is all-false
   1944     if len(value) == 1 and not is_integer(info_axis):
   1945         # This is a case like df.iloc[:3, [1]] = [0]
   1946         # where we treat as df.iloc[:3, 1] = 0

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:2095, in _iLocIndexer._setitem_single_column(self, loc, value, plane_indexer)
   2091     self.obj.isetitem(loc, value)
   2092 else:
   2093     # set value into the column (first attempting to operate inplace, then
   2094     # falling back to casting if necessary)
-> 2095     self.obj._mgr.column_setitem(loc, plane_indexer, value)
   2097 self.obj._clear_item_cache()

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:1308, in BlockManager.column_setitem(self, loc, idx, value, inplace_only)
   1306     col_mgr.setitem_inplace(idx, value)
   1307 else:
-> 1308     new_mgr = col_mgr.setitem((idx,), value)
   1309     self.iset(loc, new_mgr._block.values, inplace=True)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:399, in BaseBlockManager.setitem(self, indexer, value)
    395     # No need to split if we either set all columns or on a single block
    396     # manager
    397     self = self.copy()
--> 399 return self.apply("setitem", indexer=indexer, value=value)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    352     applied = b.apply(f, **kwargs)
    353 else:
--> 354     applied = getattr(b, f)(**kwargs)
    355 result_blocks = extend_blocks(applied, result_blocks)
    357 out = type(self).from_blocks(result_blocks, self.axes)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\blocks.py:1758, in EABackedBlock.setitem(self, indexer, value, using_cow)
   1755 check_setitem_lengths(indexer, value, values)
   1757 try:
-> 1758     values[indexer] = value
   1759 except (ValueError, TypeError) as err:
   1760     _catch_deprecated_value_error(err)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\_mixins.py:253, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
    251 def __setitem__(self, key, value) -> None:
    252     key = check_array_indexer(self, key)
--> 253     value = self._validate_setitem_value(value)
    254     self._ndarray[key] = value

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\categorical.py:1560, in Categorical._validate_setitem_value(self, value)
   1557 def _validate_setitem_value(self, value):
   1558     if not is_hashable(value):
   1559         # wrap scalars and hashable-listlikes in list
-> 1560         return self._validate_listlike(value)
   1561     else:
   1562         return self._validate_scalar(value)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\categorical.py:2277, in Categorical._validate_listlike(self, value)
   2274 # no assignments of values not in categories, but it's always ok to set
   2275 # something to np.nan
   2276 if len(to_add) and not isna(to_add).all():
-> 2277     raise TypeError(
   2278         "Cannot setitem on a Categorical with a new "
   2279         "category, set the categories first"
   2280     )
   2282 codes = self.categories.get_indexer(value)
   2283 return codes.astype(self._ndarray.dtype, copy=False)

TypeError: Cannot setitem on a Categorical with a new category, set the categories first
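The last frames of this traceback can be reproduced with pandas alone, which suggests the failure depends on the column having the `category` dtype rather than on the choice of imputer. The sketch below assigns integer labels (what a label encoder would produce) into a categorical column through `.loc`, mirroring the `_setup` line shown above. It then shows one possible workaround, casting the column to `object` first. Whether doing this before building the TemporAI dataset fully avoids the issue is an assumption and has not been verified here. The frame and the column name `dx` are made up for illustration.

```python
import numpy as np
import pandas as pd


def make_frame() -> pd.DataFrame:
    # Illustrative data: one categorical column with a single missing value.
    return pd.DataFrame({"dx": pd.Categorical(["a", "b", None, "a"])})


# Integer labels for the three non-missing rows, as a label encoder would return.
codes = np.array([0, 1, 0])

# 1) Reproduce: integers are not among the categories {"a", "b"}, so pandas refuses.
X = make_frame()
mask = X["dx"].notnull()
try:
    X.loc[mask, "dx"] = codes
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)

# 2) Possible workaround (assumption): drop the categorical dtype before assigning.
X_obj = make_frame()
X_obj["dx"] = X_obj["dx"].astype(object)
X_obj.loc[mask, "dx"] = codes
print(X_obj)
```

With an `object` column the same assignment goes through, because there is no fixed set of allowed categories to validate against.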

Desktop (please complete the following information):

Please help me understand how to solve this. I appreciate your hard work on developing such a great tool. I hope I can use it in my work.

DrShushen commented 9 months ago

Thank you for reporting this issue! We will look into fixing this.

In the first instance, could you please try installing temporai in a conda environment with python 3.10 (rather than 3.11, which we haven't tested fully yet)?

https://conda.io/projects/conda/en/latest/user-guide/install/windows.html

then:

conda create -n temporai-env python=3.10 -y
conda activate temporai-env
pip install temporai

And let us know if the problem still happens.

EvanWu19 commented 8 months ago

Thank you for your feedback. I have now used a conda environment with Python 3.10 and reran my code, but the issue still exists. I tried to figure out what happened by shrinking the static part of the data frame to one column. If the column is numeric:

My code to show the data

missing_in_rows = dataset.static.dataframe().isnull().any(axis=1)
has_missing_values = missing_in_rows.any()
num_rows_with_missing = missing_in_rows.sum()
rows_with_missing = dataset.static.dataframe()[missing_in_rows]
print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore
print("Missing value rows:", rows_with_missing)
dataset.static

The result showed as below:

Missing value count: 16
Missing value rows:
sample_idx  MedianIncomePerACS
706         NaN
794         NaN
756         NaN
764         NaN
740         NaN
774         NaN
666         NaN
778         NaN
714         NaN
772         NaN
728         NaN
788         NaN
772         NaN
798         NaN
718         NaN
728         NaN

StaticSamples with data:

sample_idx  MedianIncomePerACS
714         101563.0
722         51122.0
762         50220.0
714         72724.0
772         54835.0
...         ...
726         163170.0
734         75256.0
736         48750.0
710         99697.0
704         73377.0

828 rows × 1 columns

model = plugin_loader.get("preprocessing.imputation.static.static_tabular_imputer", static_imputer="HyperImpute")
dataset = model.fit_transform(dataset)  # Or call fit() then transform().

print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore

dataset.static

I got the following error:

2024-01-23 13:30:12 | INFO     | hyperimpute.logger:log_and_print:65 |   > HyperImpute using inner optimization
2024-01-23 13:30:12 | INFO     | hyperimpute.logger:log_and_print:65 |   > Imputation iter 0
2024-01-23 13:30:12 | ERROR    | hyperimpute.logger:log_and_print:65 |       >>> MedianIncomePerACS:linear_regression:2 folds eval failed at least one array or dtype is required
2024-01-23 13:30:12 | ERROR    | hyperimpute.logger:log_and_print:65 |       >>> MedianIncomePerACS:linear_regression:1 folds eval failed at least one array or dtype is required
2024-01-23 13:30:12 | INFO     | hyperimpute.logger:log_and_print:65 |      >>> Column MedianIncomePerACS <-- score -9999999 <-- Model linear_regression
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[18], line 3
      1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset)  # Or call fit() then transform().
      5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore
      7 dataset.static

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
     53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
     54 
     55 Args:
   (...)
     61     dataset.BaseDataset: The transformed dataset.
     62 """
     63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
     31 """Transforms the given data.
     32 
     33 Args:
   (...)
     39     Any: The transformed data.
     40 """
     41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
     44 return transformed_data

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
     80 if data.static is not None:
     81     static_data = data.static.dataframe()
---> 82     imputed_static_data = self.imputer.transform(static_data)
     83     imputed_static_data.columns = static_data.columns
     84     imputed_static_data.index = static_data.index

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
     85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86     return self._model.transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
    127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128     return self.model.fit_transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:925, in IterativeErrorCorrection.fit_transform(self, X)
    921 Xt_init.columns = X.columns
    923 Xt = Xt_init.copy()
--> 925 Xt = self._fit_transform_inner_optimization(Xt)
    927 return self._tear_down(Xt)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:894, in IterativeErrorCorrection._fit_transform_inner_optimization(self, X)
    892 for col in cols:
    893     obj_score += self._optimize_model_for_column(X, col)
--> 894     X = self._impute_single_column(X.copy(), col, True)
    896 inf_norm = np.linalg.norm(X - X_prev, ord=np.inf, axis=None)
    897 if inf_norm < INNER_TOL and it > self.n_min_inner_iter:

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:833, in IterativeErrorCorrection._impute_single_column(self, X, col, train)
    830 est = self.column_to_model[col]
    832 if train:
--> 833     est.fit(X_train, y_train)
    835 X[col][self.mask[col]] = est.predict(covs[self.mask[col]]).values.squeeze()
    837 col_min, col_max = self.limits[col]

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\prediction\regression\base.py:49, in RegressionPlugin.fit(self, X, *args, **kwargs)
     46     raise ValueError("Invalid input for fit. Expecting X and Y.")
     48 X = cast.to_dataframe(X)
---> 49 self._fit(X, *args, **kwargs)
     51 return self

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\prediction\regression\plugin_linear_regression.py:59, in LinearRegressionPlugin._fit(self, X, *args, **kwargs)
     56 def _fit(
     57     self, X: pd.DataFrame, *args: Any, **kwargs: Any
     58 ) -> "LinearRegressionPlugin":
---> 59     self.model.fit(X, *args, **kwargs)
     60     return self

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\base.py:1351, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
   1344     estimator._validate_params()
   1346 with config_context(
   1347     skip_parameter_validation=(
   1348         prefer_skip_nested_validation or global_skip_validation
   1349     )
   1350 ):
-> 1351     return fit_method(estimator, *args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\linear_model\_ridge.py:1153, in Ridge.fit(self, X, y, sample_weight)
   1133 """Fit Ridge regression model.
   1134 
   1135 Parameters
   (...)
   1150     Fitted estimator.
   1151 """
   1152 _accept_sparse = _get_valid_accept_sparse(sparse.issparse(X), self.solver)
-> 1153 X, y = self._validate_data(
   1154     X,
   1155     y,
   1156     accept_sparse=_accept_sparse,
   1157     dtype=[np.float64, np.float32],
   1158     multi_output=True,
   1159     y_numeric=True,
   1160 )
   1161 return super().fit(X, y, sample_weight=sample_weight)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\base.py:650, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
    648         y = check_array(y, input_name="y", **check_y_params)
    649     else:
--> 650         X, y = check_X_y(X, y, **check_params)
    651     out = X, y
    653 if not no_val_X and check_params.get("ensure_2d", True):

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\utils\validation.py:1192, in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
   1187         estimator_name = _check_estimator_name(estimator)
   1188     raise ValueError(
   1189         f"{estimator_name} requires y to be passed, but the target y is None"
   1190     )
-> 1192 X = check_array(
   1193     X,
   1194     accept_sparse=accept_sparse,
   1195     accept_large_sparse=accept_large_sparse,
   1196     dtype=dtype,
   1197     order=order,
   1198     copy=copy,
   1199     force_all_finite=force_all_finite,
   1200     ensure_2d=ensure_2d,
   1201     allow_nd=allow_nd,
   1202     ensure_min_samples=ensure_min_samples,
   1203     ensure_min_features=ensure_min_features,
   1204     estimator=estimator,
   1205     input_name="X",
   1206 )
   1208 y = _check_y(y, multi_output=multi_output, y_numeric=y_numeric, estimator=estimator)
   1210 check_consistent_length(X, y)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\utils\validation.py:833, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
    829 pandas_requires_conversion = any(
    830     _pandas_dtype_needs_early_conversion(i) for i in dtypes_orig
    831 )
    832 if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
--> 833     dtype_orig = np.result_type(*dtypes_orig)
    834 elif pandas_requires_conversion and any(d == object for d in dtypes_orig):
    835     # Force object if any of the dtypes is an object
    836     dtype_orig = object

ValueError: at least one array or dtype is required
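A plausible reading of this traceback, stated as an assumption: when the static frame has only one column, the regression used to impute that column is fitted on the remaining columns, and there are none. The sketch below shows, with pandas and NumPy only, that dropping the single target column leaves a frame with zero columns, and that calling `np.result_type` on its (empty) list of dtypes raises the same message as the final frame. The frame is an illustrative stand-in for `dataset.static.dataframe()`.

```python
import numpy as np
import pandas as pd

# Illustrative single-column static frame with one missing value.
static_df = pd.DataFrame({"MedianIncomePerACS": [101563.0, np.nan, 50220.0]})
target = "MedianIncomePerACS"

# Covariates for imputing the target are "all other columns", which here is nothing.
covariates = static_df.drop(columns=[target])
print(covariates.shape)  # (3, 0)

# The failing call in the traceback unpacks the column dtypes into np.result_type.
try:
    np.result_type(*covariates.dtypes)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If this reading is right, the numeric single-column case fails for a different reason than the categorical case, and keeping at least two static columns would be the thing to test next.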

However, when I tried to add a categorical column to the static data, without any missing values:

Missing value count: 0
Missing value rows: Empty DataFrame
Columns: [A_Patient_dx]
Index: []

I got the following error when I tried to impute it:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[26], line 3
      1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset)  # Or call fit() then transform().
      5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore
      7 dataset.static

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
     53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
     54 
     55 Args:
   (...)
     61     dataset.BaseDataset: The transformed dataset.
     62 """
     63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
     31 """Transforms the given data.
     32 
     33 Args:
   (...)
     39     Any: The transformed data.
     40 """
     41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
     44 return transformed_data

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
     80 if data.static is not None:
     81     static_data = data.static.dataframe()
---> 82     imputed_static_data = self.imputer.transform(static_data)
     83     imputed_static_data.columns = static_data.columns
     84     imputed_static_data.index = static_data.index

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
     85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86     return self._model.transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
    127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128     return self.model.fit_transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
    915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
    916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
    917     # Run imputation
--> 918     X = self._setup(X)
    920     Xt_init = self._initial_imputation(X)
    921     Xt_init.columns = X.columns

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
    700         existing_vals = X[col][X[col].notnull()]
    702         le = LabelEncoder()
--> 703         X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
    704             int
    705         )
    706         self.encoders[col] = le
    708 self.limits = {}

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:912, in _LocationIndexer.__setitem__(self, key, value)
    909 self._has_valid_setitem_indexer(key)
    911 iloc = self if self.name == "iloc" else self.obj.iloc
--> 912 iloc._setitem_with_indexer(indexer, value, self.name)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1948, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1946     self._setitem_with_indexer_split_path(indexer, value, name)
   1947 else:
-> 1948     self._setitem_single_block(indexer, value, name)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:2211, in _iLocIndexer._setitem_single_block(self, indexer, value, name)
   2208 self.obj._check_is_chained_assignment_possible()
   2210 # actually do the set
-> 2211 self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
   2212 self.obj._maybe_update_cacher(clear=True, inplace=True)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:416, in BaseBlockManager.setitem(self, indexer, value, warn)
    412     # No need to split if we either set all columns or on a single block
    413     # manager
    414     self = self.copy()
--> 416 return self.apply("setitem", indexer=indexer, value=value)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:364, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    362         applied = b.apply(f, **kwargs)
    363     else:
--> 364         applied = getattr(b, f)(**kwargs)
    365     result_blocks = extend_blocks(applied, result_blocks)
    367 out = type(self).from_blocks(result_blocks, self.axes)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\blocks.py:2056, in EABackedBlock.setitem(self, indexer, value, using_cow)
   2053 check_setitem_lengths(indexer, value, values)
   2055 try:
-> 2056     values[indexer] = value
   2057 except (ValueError, TypeError):
   2058     if isinstance(self.dtype, IntervalDtype):
   2059         # see TestSetitemFloatIntervalWithIntIntervalValues

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\_mixins.py:261, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
    259 def __setitem__(self, key, value) -> None:
    260     key = check_array_indexer(self, key)
--> 261     value = self._validate_setitem_value(value)
    262     self._ndarray[key] = value

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:1587, in Categorical._validate_setitem_value(self, value)
   1584 def _validate_setitem_value(self, value):
   1585     if not is_hashable(value):
   1586         # wrap scalars and hashable-listlikes in list
-> 1587         return self._validate_listlike(value)
   1588     else:
   1589         return self._validate_scalar(value)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:2309, in Categorical._validate_listlike(self, value)
   2306 # no assignments of values not in categories, but it's always ok to set
   2307 # something to np.nan
   2308 if len(to_add) and not isna(to_add).all():
-> 2309     raise TypeError(
   2310         "Cannot setitem on a Categorical with a new "
   2311         "category, set the categories first"
   2312     )
   2314 codes = self.categories.get_indexer(value)
   2315 return codes.astype(self._ndarray.dtype, copy=False)

TypeError: Cannot setitem on a Categorical with a new category, set the categories first
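
The final frames of the traceback point at a pandas rule rather than at anything specific to the imputer: a column with `category` dtype only accepts values that are already among its categories, and line 703 writes integer label codes into such a column. The following is a minimal sketch of that behaviour using only pandas and NumPy. The column name `sex`, its values, and the hand-written `codes` array (standing in for the label encoder output) are illustrative assumptions, not data from this issue.

```python
import numpy as np
import pandas as pd

# Hypothetical static feature stored with pandas "category" dtype, one value missing.
X = pd.DataFrame({"sex": pd.Categorical(["F", "M", None, "F"])})
observed = X["sex"].notnull()

# Stand-in for the integer labels a label encoder would produce
# for the three observed values.
codes = np.array([0, 1, 0])

raised = False
try:
    # Same pattern as line 703 in the traceback: write integer codes back into
    # the categorical column. 0 and 1 are not among the categories {"F", "M"}.
    X.loc[observed, "sex"] = codes
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Possible workaround (an assumption, not a confirmed fix for this issue):
# drop the categorical dtype first so the column accepts arbitrary values.
X_obj = X.astype({"sex": object})
X_obj.loc[observed, "sex"] = codes
print(X_obj["sex"].tolist())
```

If this is indeed the cause, converting categorical static columns to `object` (or to plain numeric codes) before building the dataset may avoid the error; that has not been verified against TemporAI here.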

When I combined these two columns into the new static part, the error was very similar to the one above:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[35], line 3
      1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset)  # Or call fit() then transform().
      5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum())  # type: ignore
      7 dataset.static

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
     53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
     54 
     55 Args:
   (...)
     61     dataset.BaseDataset: The transformed dataset.
     62 """
     63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
     31 """Transforms the given data.
     32 
     33 Args:
   (...)
     39     Any: The transformed data.
     40 """
     41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
     44 return transformed_data

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
     80 if data.static is not None:
     81     static_data = data.static.dataframe()
---> 82     imputed_static_data = self.imputer.transform(static_data)
     83     imputed_static_data.columns = static_data.columns
     84     imputed_static_data.index = static_data.index

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
     85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86     return self._model.transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
    130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
    131     X = cast.to_dataframe(X)
--> 132     return pd.DataFrame(self._transform(X))

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
    127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128     return self.model.fit_transform(X)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
    915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
    916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
    917     # Run imputation
--> 918     X = self._setup(X)
    920     Xt_init = self._initial_imputation(X)
    921     Xt_init.columns = X.columns

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
     53 @wraps(_func)
     54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55     return vd.call(*args, **kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
    148 def call(self, *args: Any, **kwargs: Any) -> Any:
    149     m = self.init_model_instance(*args, **kwargs)
--> 150     return self.execute(m)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
    220     return self.raw_function(*args_, **kwargs, **var_kwargs)
    221 else:
--> 222     return self.raw_function(**d, **var_kwargs)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
    700         existing_vals = X[col][X[col].notnull()]
    702         le = LabelEncoder()
--> 703         X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
    704             int
    705         )
    706         self.encoders[col] = le
    708 self.limits = {}

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:912, in _LocationIndexer.__setitem__(self, key, value)
    909 self._has_valid_setitem_indexer(key)
    911 iloc = self if self.name == "iloc" else self.obj.iloc
--> 912 iloc._setitem_with_indexer(indexer, value, self.name)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1946, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1943 # align and set the values
   1944 if take_split_path:
   1945     # We have to operate column-wise
-> 1946     self._setitem_with_indexer_split_path(indexer, value, name)
   1947 else:
   1948     self._setitem_single_block(indexer, value, name)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1990, in _iLocIndexer._setitem_with_indexer_split_path(self, indexer, value, name)
   1986     self._setitem_with_indexer_2d_value(indexer, value)
   1988 elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
   1989     # We are setting multiple rows in a single column.
-> 1990     self._setitem_single_column(ilocs[0], value, pi)
   1992 elif len(ilocs) == 1 and 0 != lplane_indexer != len(value):
   1993     # We are trying to set N values into M entries of a single
   1994     #  column, which is invalid for N != M
   1995     # Exclude zero-len for e.g. boolean masking that is all-false
   1997     if len(value) == 1 and not is_integer(info_axis):
   1998         # This is a case like df.iloc[:3, [1]] = [0]
   1999         #  where we treat as df.iloc[:3, 1] = 0

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:2168, in _iLocIndexer._setitem_single_column(self, loc, value, plane_indexer)
   2164         self.obj.isetitem(loc, value)
   2165 else:
   2166     # set value into the column (first attempting to operate inplace, then
   2167     #  falling back to casting if necessary)
-> 2168     self.obj._mgr.column_setitem(loc, plane_indexer, value)
   2170 self.obj._clear_item_cache()

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:1338, in BlockManager.column_setitem(self, loc, idx, value, inplace_only)
   1336     col_mgr.setitem_inplace(idx, value)
   1337 else:
-> 1338     new_mgr = col_mgr.setitem((idx,), value)
   1339     self.iset(loc, new_mgr._block.values, inplace=True)
   1341 if needs_to_warn:

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:416, in BaseBlockManager.setitem(self, indexer, value, warn)
    412     # No need to split if we either set all columns or on a single block
    413     # manager
    414     self = self.copy()
--> 416 return self.apply("setitem", indexer=indexer, value=value)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:364, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    362         applied = b.apply(f, **kwargs)
    363     else:
--> 364         applied = getattr(b, f)(**kwargs)
    365     result_blocks = extend_blocks(applied, result_blocks)
    367 out = type(self).from_blocks(result_blocks, self.axes)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\blocks.py:2056, in EABackedBlock.setitem(self, indexer, value, using_cow)
   2053 check_setitem_lengths(indexer, value, values)
   2055 try:
-> 2056     values[indexer] = value
   2057 except (ValueError, TypeError):
   2058     if isinstance(self.dtype, IntervalDtype):
   2059         # see TestSetitemFloatIntervalWithIntIntervalValues

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\_mixins.py:261, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
    259 def __setitem__(self, key, value) -> None:
    260     key = check_array_indexer(self, key)
--> 261     value = self._validate_setitem_value(value)
    262     self._ndarray[key] = value

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:1587, in Categorical._validate_setitem_value(self, value)
   1584 def _validate_setitem_value(self, value):
   1585     if not is_hashable(value):
   1586         # wrap scalars and hashable-listlikes in list
-> 1587         return self._validate_listlike(value)
   1588     else:
   1589         return self._validate_scalar(value)

File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:2309, in Categorical._validate_listlike(self, value)
   2306 # no assignments of values not in categories, but it's always ok to set
   2307 # something to np.nan
   2308 if len(to_add) and not isna(to_add).all():
-> 2309     raise TypeError(
   2310         "Cannot setitem on a Categorical with a new "
   2311         "category, set the categories first"
   2312     )
   2314 codes = self.categories.get_indexer(value)
   2315 return codes.astype(self._ndarray.dtype, copy=False)

TypeError: Cannot setitem on a Categorical with a new category, set the categories first

EvanWu19 commented 8 months ago

Any suggestions on how I should debug my code?
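
One way to start debugging, sketched under assumptions: inspect the dtypes of the static dataframe (in the code above it comes from `dataset.static.dataframe()`) and check whether any column is stored as `category`, since the traceback ends inside `pandas/core/arrays/categorical.py`. The helper names `categorical_columns` and `decategorize` and the toy frame below are hypothetical. Whether rebuilding the dataset from the converted frame resolves the error is untested.

```python
import pandas as pd


def categorical_columns(df: pd.DataFrame) -> list:
    """Return the names of columns stored with pandas "category" dtype."""
    return [
        col for col in df.columns
        if isinstance(df[col].dtype, pd.CategoricalDtype)
    ]


def decategorize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which every categorical column is cast to object.

    Missing values stay missing; only the dtype changes.
    """
    out = df.copy()
    for col in categorical_columns(out):
        out[col] = out[col].astype(object)
    return out


# Toy stand-in for the real static dataframe (values are made up).
static_df = pd.DataFrame(
    {
        "age": [63.0, None, 47.0],
        "sex": pd.Categorical(["F", None, "M"]),
    },
    index=["p1", "p2", "p3"],
)

print(static_df.dtypes)
print("Categorical columns:", categorical_columns(static_df))

clean_df = decategorize(static_df)
print(clean_df.dtypes)
```

Casting to `object` keeps the original labels readable; encoding the labels as numbers yourself before imputation would be an alternative design if the imputer is expected to see numeric input.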

DrShushen commented 8 months ago

Not sure quite yet, but will look into this over the next week or so and hopefully have this solved!