Open EvanWu19 opened 9 months ago
Thank you for reporting this issue! We will look into fixing this.
In the first instance, could you please try installing temporai in a conda environment with python 3.10 (rather than 3.11, which we haven't tested fully yet)?
https://conda.io/projects/conda/en/latest/user-guide/install/windows.html
then:
conda create -n temporai-env python=3.10 -y
conda activate temporai-env
pip install temporai
And let us know if the problem still happens.
Thank you for your feedback. I have now installed temporai in a conda environment with Python 3.10 and reran my code, but the issue persists. To narrow it down, I shrank the static part of the data frame to a single column. If the column is numeric:
missing_in_rows = dataset.static.dataframe().isnull().any(axis=1)
has_missing_values = missing_in_rows.any()
num_rows_with_missing = missing_in_rows.sum()
rows_with_missing = dataset.static.dataframe()[missing_in_rows]
print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
print("Missing value rows:", rows_with_missing)
dataset.static
Missing value count: 16
Missing value rows: MedianIncomePerACS
sample_idx
706 NaN
794 NaN
756 NaN
764 NaN
740 NaN
774 NaN
666 NaN
778 NaN
714 NaN
772 NaN
728 NaN
788 NaN
772 NaN
798 NaN
718 NaN
728 NaN
StaticSamples with data:
sample_idx  MedianIncomePerACS
714         101563.0
722         51122.0
762         50220.0
714         72724.0
772         54835.0
...         ...
726         163170.0
734         75256.0
736         48750.0
710         99697.0
704         73377.0
828 rows × 1 columns
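When isolating columns one at a time, it may help to list each static column's dtype and missing count first, since the numeric and categorical cases fail in different places. The following is a minimal sketch using pandas only; `summarize_columns` is a hypothetical helper, and the toy frame reuses the column names from this report with made-up values.

```python
import numpy as np
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, and a numeric flag for each column."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_missing": df.isnull().sum(),
            "is_numeric": [pd.api.types.is_numeric_dtype(t) for t in df.dtypes],
        }
    )


# Toy stand-in for dataset.static.dataframe(); the values are invented.
toy = pd.DataFrame(
    {
        "MedianIncomePerACS": [101563.0, np.nan, 50220.0, np.nan],
        "A_Patient_dx": pd.Categorical(["a", "b", "a", "b"]),
    }
)

summary = summarize_columns(toy)
print(summary)
```

On the real data, the same helper could presumably be called as `summarize_columns(dataset.static.dataframe())`.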
model = plugin_loader.get("preprocessing.imputation.static.static_tabular_imputer", static_imputer="HyperImpute")
dataset = model.fit_transform(dataset) # Or call fit() then transform().
print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
dataset.static
I got the following error:
2024-01-23 13:30:12 | INFO | hyperimpute.logger:log_and_print:65 | > HyperImpute using inner optimization
2024-01-23 13:30:12 | INFO | hyperimpute.logger:log_and_print:65 | > Imputation iter 0
2024-01-23 13:30:12 | ERROR | hyperimpute.logger:log_and_print:65 | >>> MedianIncomePerACS:linear_regression:2 folds eval failed at least one array or dtype is required
2024-01-23 13:30:12 | ERROR | hyperimpute.logger:log_and_print:65 | >>> MedianIncomePerACS:linear_regression:1 folds eval failed at least one array or dtype is required
2024-01-23 13:30:12 | INFO | hyperimpute.logger:log_and_print:65 | >>> Column MedianIncomePerACS <-- score -9999999 <-- Model linear_regression
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[18], line 3
1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset) # Or call fit() then transform().
5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
7 dataset.static
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
54
55 Args:
(...)
61 dataset.BaseDataset: The transformed dataset.
62 """
63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
31 """Transforms the given data.
32
33 Args:
(...)
39 Any: The transformed data.
40 """
41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
44 return transformed_data
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
80 if data.static is not None:
81 static_data = data.static.dataframe()
---> 82 imputed_static_data = self.imputer.transform(static_data)
83 imputed_static_data.columns = static_data.columns
84 imputed_static_data.index = static_data.index
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86 return self._model.transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128 return self.model.fit_transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:925, in IterativeErrorCorrection.fit_transform(self, X)
921 Xt_init.columns = X.columns
923 Xt = Xt_init.copy()
--> 925 Xt = self._fit_transform_inner_optimization(Xt)
927 return self._tear_down(Xt)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:894, in IterativeErrorCorrection._fit_transform_inner_optimization(self, X)
892 for col in cols:
893 obj_score += self._optimize_model_for_column(X, col)
--> 894 X = self._impute_single_column(X.copy(), col, True)
896 inf_norm = np.linalg.norm(X - X_prev, ord=np.inf, axis=None)
897 if inf_norm < INNER_TOL and it > self.n_min_inner_iter:
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:833, in IterativeErrorCorrection._impute_single_column(self, X, col, train)
830 est = self.column_to_model[col]
832 if train:
--> 833 est.fit(X_train, y_train)
835 X[col][self.mask[col]] = est.predict(covs[self.mask[col]]).values.squeeze()
837 col_min, col_max = self.limits[col]
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\prediction\regression\base.py:49, in RegressionPlugin.fit(self, X, *args, **kwargs)
46 raise ValueError("Invalid input for fit. Expecting X and Y.")
48 X = cast.to_dataframe(X)
---> 49 self._fit(X, *args, **kwargs)
51 return self
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\prediction\regression\plugin_linear_regression.py:59, in LinearRegressionPlugin._fit(self, X, *args, **kwargs)
56 def _fit(
57 self, X: pd.DataFrame, *args: Any, **kwargs: Any
58 ) -> "LinearRegressionPlugin":
---> 59 self.model.fit(X, *args, **kwargs)
60 return self
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\base.py:1351, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
1344 estimator._validate_params()
1346 with config_context(
1347 skip_parameter_validation=(
1348 prefer_skip_nested_validation or global_skip_validation
1349 )
1350 ):
-> 1351 return fit_method(estimator, *args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\linear_model\_ridge.py:1153, in Ridge.fit(self, X, y, sample_weight)
1133 """Fit Ridge regression model.
1134
1135 Parameters
(...)
1150 Fitted estimator.
1151 """
1152 _accept_sparse = _get_valid_accept_sparse(sparse.issparse(X), self.solver)
-> 1153 X, y = self._validate_data(
1154 X,
1155 y,
1156 accept_sparse=_accept_sparse,
1157 dtype=[np.float64, np.float32],
1158 multi_output=True,
1159 y_numeric=True,
1160 )
1161 return super().fit(X, y, sample_weight=sample_weight)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\base.py:650, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
648 y = check_array(y, input_name="y", **check_y_params)
649 else:
--> 650 X, y = check_X_y(X, y, **check_params)
651 out = X, y
653 if not no_val_X and check_params.get("ensure_2d", True):
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\utils\validation.py:1192, in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
1187 estimator_name = _check_estimator_name(estimator)
1188 raise ValueError(
1189 f"{estimator_name} requires y to be passed, but the target y is None"
1190 )
-> 1192 X = check_array(
1193 X,
1194 accept_sparse=accept_sparse,
1195 accept_large_sparse=accept_large_sparse,
1196 dtype=dtype,
1197 order=order,
1198 copy=copy,
1199 force_all_finite=force_all_finite,
1200 ensure_2d=ensure_2d,
1201 allow_nd=allow_nd,
1202 ensure_min_samples=ensure_min_samples,
1203 ensure_min_features=ensure_min_features,
1204 estimator=estimator,
1205 input_name="X",
1206 )
1208 y = _check_y(y, multi_output=multi_output, y_numeric=y_numeric, estimator=estimator)
1210 check_consistent_length(X, y)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\sklearn\utils\validation.py:833, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
829 pandas_requires_conversion = any(
830 _pandas_dtype_needs_early_conversion(i) for i in dtypes_orig
831 )
832 if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
--> 833 dtype_orig = np.result_type(*dtypes_orig)
834 elif pandas_requires_conversion and any(d == object for d in dtypes_orig):
835 # Force object if any of the dtypes is an object
836 dtype_orig = object
ValueError: at least one array or dtype is required
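One plausible reading of this traceback (an inference, not confirmed by the maintainers): with a single static column, the regressor for that column is left with no covariate columns, so scikit-learn's `check_array` ends up calling `np.result_type` with no arguments. The sketch below reproduces only that last step with NumPy and pandas; it does not call hyperimpute, and the toy frame is made up.

```python
import numpy as np
import pandas as pd

# Toy one-column static frame, similar in shape to the one in this report.
static = pd.DataFrame({"MedianIncomePerACS": [101563.0, np.nan, 50220.0]})

# Assumption: the covariates are "every column except the target",
# which leaves a frame with rows but zero columns.
covariates = static.drop(columns=["MedianIncomePerACS"])
print(covariates.shape)

# With no columns there are no dtypes to combine, so the call fails.
message = None
try:
    np.result_type(*covariates.dtypes)
except ValueError as exc:
    message = str(exc)

print(message)
```

If this reading is right, the failure would not appear once the static data has at least two columns, which would be a quick thing to check.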
Missing value count: 0
Missing value rows: Empty DataFrame
Columns: [A_Patient_dx]
Index: []
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[26], line 3
1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset) # Or call fit() then transform().
5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
7 dataset.static
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
54
55 Args:
(...)
61 dataset.BaseDataset: The transformed dataset.
62 """
63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
31 """Transforms the given data.
32
33 Args:
(...)
39 Any: The transformed data.
40 """
41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
44 return transformed_data
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
80 if data.static is not None:
81 static_data = data.static.dataframe()
---> 82 imputed_static_data = self.imputer.transform(static_data)
83 imputed_static_data.columns = static_data.columns
84 imputed_static_data.index = static_data.index
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86 return self._model.transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128 return self.model.fit_transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
917 # Run imputation
--> 918 X = self._setup(X)
920 Xt_init = self._initial_imputation(X)
921 Xt_init.columns = X.columns
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
700 existing_vals = X[col][X[col].notnull()]
702 le = LabelEncoder()
--> 703 X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
704 int
705 )
706 self.encoders[col] = le
708 self.limits = {}
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:912, in _LocationIndexer.__setitem__(self, key, value)
909 self._has_valid_setitem_indexer(key)
911 iloc = self if self.name == "iloc" else self.obj.iloc
--> 912 iloc._setitem_with_indexer(indexer, value, self.name)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1948, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
1946 self._setitem_with_indexer_split_path(indexer, value, name)
1947 else:
-> 1948 self._setitem_single_block(indexer, value, name)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:2211, in _iLocIndexer._setitem_single_block(self, indexer, value, name)
2208 self.obj._check_is_chained_assignment_possible()
2210 # actually do the set
-> 2211 self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
2212 self.obj._maybe_update_cacher(clear=True, inplace=True)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:416, in BaseBlockManager.setitem(self, indexer, value, warn)
412 # No need to split if we either set all columns or on a single block
413 # manager
414 self = self.copy()
--> 416 return self.apply("setitem", indexer=indexer, value=value)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:364, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
362 applied = b.apply(f, **kwargs)
363 else:
--> 364 applied = getattr(b, f)(**kwargs)
365 result_blocks = extend_blocks(applied, result_blocks)
367 out = type(self).from_blocks(result_blocks, self.axes)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\blocks.py:2056, in EABackedBlock.setitem(self, indexer, value, using_cow)
2053 check_setitem_lengths(indexer, value, values)
2055 try:
-> 2056 values[indexer] = value
2057 except (ValueError, TypeError):
2058 if isinstance(self.dtype, IntervalDtype):
2059 # see TestSetitemFloatIntervalWithIntIntervalValues
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\_mixins.py:261, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
259 def __setitem__(self, key, value) -> None:
260 key = check_array_indexer(self, key)
--> 261 value = self._validate_setitem_value(value)
262 self._ndarray[key] = value
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:1587, in Categorical._validate_setitem_value(self, value)
1584 def _validate_setitem_value(self, value):
1585 if not is_hashable(value):
1586 # wrap scalars and hashable-listlikes in list
-> 1587 return self._validate_listlike(value)
1588 else:
1589 return self._validate_scalar(value)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:2309, in Categorical._validate_listlike(self, value)
2306 # no assignments of values not in categories, but it's always ok to set
2307 # something to np.nan
2308 if len(to_add) and not isna(to_add).all():
-> 2309 raise TypeError(
2310 "Cannot setitem on a Categorical with a new "
2311 "category, set the categories first"
2312 )
2314 codes = self.categories.get_indexer(value)
2315 return codes.astype(self._ndarray.dtype, copy=False)
TypeError: Cannot setitem on a Categorical with a new category, set the categories first
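The final frames show integer label codes being written back into a column that still has `category` dtype, which pandas rejects because the codes are not among the existing categories. Below is a minimal pandas-only sketch of that behavior; the category values are made up. It also shows that the same assignment is accepted once the column is cast to `object`. Whether casting the column before building the dataset avoids the error inside hyperimpute is an untested assumption.

```python
import numpy as np
import pandas as pd

# Toy categorical column; the category values are invented.
df = pd.DataFrame({"A_Patient_dx": pd.Categorical(["a", "b", None, "a"])})
mask = df["A_Patient_dx"].notnull()
codes = np.array([0, 1, 0])  # stand-in for LabelEncoder output

# Writing integer codes into a categorical column fails.
error = None
try:
    df.loc[mask, "A_Patient_dx"] = codes
except TypeError as exc:
    error = str(exc)
print(error)

# Possible workaround (assumption): cast to object first, then assign.
df_obj = df.astype({"A_Patient_dx": object})
df_obj.loc[mask, "A_Patient_dx"] = codes
print(df_obj)
```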
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[35], line 3
1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset) # Or call fit() then transform().
5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
7 dataset.static
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
53 """Fit the method to the data and transform it. Equivalent to calling ``fit`` and then ``transform``.
54
55 Args:
(...)
61 dataset.BaseDataset: The transformed dataset.
62 """
63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
31 """Transforms the given data.
32
33 Args:
(...)
39 Any: The transformed data.
40 """
41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
44 return transformed_data
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
80 if data.static is not None:
81 static_data = data.static.dataframe()
---> 82 imputed_static_data = self.imputer.transform(static_data)
83 imputed_static_data.columns = static_data.columns
84 imputed_static_data.index = static_data.index
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86 return self._model.transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128 return self.model.fit_transform(X)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
917 # Run imputation
--> 918 X = self._setup(X)
920 Xt_init = self._initial_imputation(X)
921 Xt_init.columns = X.columns
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
700 existing_vals = X[col][X[col].notnull()]
702 le = LabelEncoder()
--> 703 X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
704 int
705 )
706 self.encoders[col] = le
708 self.limits = {}
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:912, in _LocationIndexer.__setitem__(self, key, value)
909 self._has_valid_setitem_indexer(key)
911 iloc = self if self.name == "iloc" else self.obj.iloc
--> 912 iloc._setitem_with_indexer(indexer, value, self.name)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1946, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
1943 # align and set the values
1944 if take_split_path:
1945 # We have to operate column-wise
-> 1946 self._setitem_with_indexer_split_path(indexer, value, name)
1947 else:
1948 self._setitem_single_block(indexer, value, name)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:1990, in _iLocIndexer._setitem_with_indexer_split_path(self, indexer, value, name)
1986 self._setitem_with_indexer_2d_value(indexer, value)
1988 elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
1989 # We are setting multiple rows in a single column.
-> 1990 self._setitem_single_column(ilocs[0], value, pi)
1992 elif len(ilocs) == 1 and 0 != lplane_indexer != len(value):
1993 # We are trying to set N values into M entries of a single
1994 # column, which is invalid for N != M
1995 # Exclude zero-len for e.g. boolean masking that is all-false
1997 if len(value) == 1 and not is_integer(info_axis):
1998 # This is a case like df.iloc[:3, [1]] = [0]
1999 # where we treat as df.iloc[:3, 1] = 0
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\indexing.py:2168, in _iLocIndexer._setitem_single_column(self, loc, value, plane_indexer)
2164 self.obj.isetitem(loc, value)
2165 else:
2166 # set value into the column (first attempting to operate inplace, then
2167 # falling back to casting if necessary)
-> 2168 self.obj._mgr.column_setitem(loc, plane_indexer, value)
2170 self.obj._clear_item_cache()
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:1338, in BlockManager.column_setitem(self, loc, idx, value, inplace_only)
1336 col_mgr.setitem_inplace(idx, value)
1337 else:
-> 1338 new_mgr = col_mgr.setitem((idx,), value)
1339 self.iset(loc, new_mgr._block.values, inplace=True)
1341 if needs_to_warn:
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:416, in BaseBlockManager.setitem(self, indexer, value, warn)
412 # No need to split if we either set all columns or on a single block
413 # manager
414 self = self.copy()
--> 416 return self.apply("setitem", indexer=indexer, value=value)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\managers.py:364, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
362 applied = b.apply(f, **kwargs)
363 else:
--> 364 applied = getattr(b, f)(**kwargs)
365 result_blocks = extend_blocks(applied, result_blocks)
367 out = type(self).from_blocks(result_blocks, self.axes)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\internals\blocks.py:2056, in EABackedBlock.setitem(self, indexer, value, using_cow)
2053 check_setitem_lengths(indexer, value, values)
2055 try:
-> 2056 values[indexer] = value
2057 except (ValueError, TypeError):
2058 if isinstance(self.dtype, IntervalDtype):
2059 # see TestSetitemFloatIntervalWithIntIntervalValues
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\_mixins.py:261, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
259 def __setitem__(self, key, value) -> None:
260 key = check_array_indexer(self, key)
--> 261 value = self._validate_setitem_value(value)
262 self._ndarray[key] = value
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:1587, in Categorical._validate_setitem_value(self, value)
1584 def _validate_setitem_value(self, value):
1585 if not is_hashable(value):
1586 # wrap scalars and hashable-listlikes in list
-> 1587 return self._validate_listlike(value)
1588 else:
1589 return self._validate_scalar(value)
File ~\AppData\Local\miniconda3\envs\temporai-env\lib\site-packages\pandas\core\arrays\categorical.py:2309, in Categorical._validate_listlike(self, value)
2306 # no assignments of values not in categories, but it's always ok to set
2307 # something to np.nan
2308 if len(to_add) and not isna(to_add).all():
-> 2309 raise TypeError(
2310 "Cannot setitem on a Categorical with a new "
2311 "category, set the categories first"
2312 )
2314 codes = self.categories.get_indexer(value)
2315 return codes.astype(self._ndarray.dtype, copy=False)
TypeError: Cannot setitem on a Categorical with a new category, set the categories first
Any suggestions on how I should debug my code?
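One thing that may be worth checking while debugging: the error message mentions a Categorical, so it can help to see whether any static column has the `category` dtype before the data goes into the imputer. The sketch below is only an illustration in plain pandas. The helper names `find_categorical_columns` and `decategorize` and the example frame are hypothetical and not part of temporai or hyperimpute, and it has not been confirmed in this thread that a `category` column is the actual trigger.

```python
import numpy as np
import pandas as pd


def find_categorical_columns(df: pd.DataFrame) -> list:
    """Return the names of columns whose dtype is pandas' category dtype."""
    return [c for c in df.columns if isinstance(df[c].dtype, pd.CategoricalDtype)]


def decategorize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which category columns are cast to plain object dtype.

    Hypothetical workaround: missing values stay missing, only the dtype changes.
    """
    out = df.copy()
    for col in find_categorical_columns(out):
        out[col] = out[col].astype(object)
    return out


# Made-up example data standing in for a static dataframe.
example = pd.DataFrame(
    {
        "income": [101563.0, np.nan, 50220.0],
        "region": pd.Categorical(["north", None, "south"]),
    }
)

print(find_categorical_columns(example))
cleaned = decategorize(example)
print(cleaned.dtypes)
```

If the first print shows any column names for your own static dataframe, casting those columns before building the dataset might be a quick experiment to see whether the error changes.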
Not sure quite yet, but will look into this over the next week or so and hopefully have this solved!
Describe the bug
I created a TemporalPredictionDataset according to the tutorial. However, when I tried to impute the static data, an error was always reported. I tried different static_imputer values ("mean", "MissForest", and "HyperImpute"), but they all gave me the same error message. I followed your imputation tutorial, with the following code:
from tempor import plugin_loader
dataset = my_datasource(with_missing=True, random_state=42).load()
print(dataset)
model = plugin_loader.get("preprocessing.imputation.static.static_tabular_imputer", static_imputer="mean")
print(model)
# Note missingness in static data.
print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
dataset.static
# Note no more missingness in static data.
dataset = model.fit_transform(dataset) # Or call fit() then transform().
print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
dataset.static
TypeError Traceback (most recent call last)
Cell In[59], line 3
1 # Note no more missingness in static data.
----> 3 dataset = model.fit_transform(dataset) # Or call fit() then transform().
5 print("Missing value count:", dataset.static.dataframe().isnull().sum().sum()) # type: ignore
7 dataset.static
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\core\_base_transformer.py:64, in BaseTransformer.fit_transform(self, data, *args, **kwargs)
53 """Fit the method to the data and transform it. Equivalent to calling fit and then transform.
54
55 Args:
(...)
61 dataset.BaseDataset: The transformed dataset.
62 """
63 self.fit(data, *args, **kwargs)
---> 64 return self.transform(data, *args, **kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\core\_base_transformer.py:42, in BaseTransformer.transform(self, data, *args, **kwargs)
31 """Transforms the given data.
32
33 Args:
(...)
39 Any: The transformed data.
40 """
41 logger.debug(f"Calling _transform() implementation on {self.__class__.__name__}")
---> 42 transformed_data = self._transform(data, *args, **kwargs)
44 return transformed_data
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tempor\methods\preprocessing\imputation\static\plugin_static_tabular_imputer.py:82, in StaticTabularImputer._transform(self, data, *args, **kwargs)
80 if data.static is not None:
81 static_data = data.static.dataframe()
---> 82 imputed_static_data = self.imputer.transform(static_data)
83 imputed_static_data.columns = static_data.columns
84 imputed_static_data.index = static_data.index
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\plugin_ice.py:86, in IterativeChainedEquationsPlugin._transform(self, X)
85 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
---> 86 return self._model.transform(X)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\core\base_plugin.py:132, in Plugin.transform(self, X)
130 def transform(self, X: pd.DataFrame) -> pd.DataFrame:
131 X = cast.to_dataframe(X)
--> 132 return pd.DataFrame(self._transform(X))
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\plugin_hyperimpute.py:128, in HyperImputePlugin._transform(self, X)
127 def _transform(self, X: pd.DataFrame) -> pd.DataFrame:
--> 128 return self.model.fit_transform(X)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:918, in IterativeErrorCorrection.fit_transform(self, X)
915 @validate_arguments(config=dict(arbitrary_types_allowed=True))
916 def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
917 # Run imputation
--> 918 X = self._setup(X)
920 Xt_init = self._initial_imputation(X)
921 Xt_init.columns = X.columns
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:55, in validate_arguments.<locals>.validate.<locals>.wrapper_function(*args, **kwargs)
53 @wraps(_func)
54 def wrapper_function(*args: Any, **kwargs: Any) -> Any:
---> 55 return vd.call(*args, **kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:150, in ValidatedFunction.call(self, *args, **kwargs)
148 def call(self, *args: Any, **kwargs: Any) -> Any:
149 m = self.init_model_instance(*args, **kwargs)
--> 150 return self.execute(m)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\deprecated\decorator.py:222, in ValidatedFunction.execute(self, m)
220 return self.raw_function(*args_, **kwargs, **var_kwargs)
221 else:
--> 222 return self.raw_function(**d, **var_kwargs)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\hyperimpute\plugins\imputers\_hyperimpute_internals.py:703, in IterativeErrorCorrection._setup(self, X)
700 existing_vals = X[col][X[col].notnull()]
702 le = LabelEncoder()
--> 703 X.loc[X[col].notnull(), col] = le.fit_transform(existing_vals).astype(
704 int
705 )
706 self.encoders[col] = le
708 self.limits = {}
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:885, in _LocationIndexer.__setitem__(self, key, value)
882 self._has_valid_setitem_indexer(key)
884 iloc = self if self.name == "iloc" else self.obj.iloc
--> 885 iloc._setitem_with_indexer(indexer, value, self.name)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1893, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
1890 # align and set the values
1891 if take_split_path:
1892 # We have to operate column-wise
-> 1893 self._setitem_with_indexer_split_path(indexer, value, name)
1894 else:
1895 self._setitem_single_block(indexer, value, name)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1937, in _iLocIndexer._setitem_with_indexer_split_path(self, indexer, value, name)
1933 self._setitem_with_indexer_2d_value(indexer, value)
1935 elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
1936 # We are setting multiple rows in a single column.
-> 1937 self._setitem_single_column(ilocs[0], value, pi)
1939 elif len(ilocs) == 1 and 0 != lplane_indexer != len(value):
1940 # We are trying to set N values into M entries of a single
1941 # column, which is invalid for N != M
1942 # Exclude zero-len for e.g. boolean masking that is all-false
1944 if len(value) == 1 and not is_integer(info_axis):
1945 # This is a case like df.iloc[:3, [1]] = [0]
1946 # where we treat as df.iloc[:3, 1] = 0
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:2095, in _iLocIndexer._setitem_single_column(self, loc, value, plane_indexer)
2091 self.obj.isetitem(loc, value)
2092 else:
2093 # set value into the column (first attempting to operate inplace, then
2094 # falling back to casting if necessary)
-> 2095 self.obj._mgr.column_setitem(loc, plane_indexer, value)
2097 self.obj._clear_item_cache()
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:1308, in BlockManager.column_setitem(self, loc, idx, value, inplace_only)
1306 col_mgr.setitem_inplace(idx, value)
1307 else:
-> 1308 new_mgr = col_mgr.setitem((idx,), value)
1309 self.iset(loc, new_mgr._block.values, inplace=True)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:399, in BaseBlockManager.setitem(self, indexer, value)
395 # No need to split if we either set all columns or on a single block
396 # manager
397 self = self.copy()
--> 399 return self.apply("setitem", indexer=indexer, value=value)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
352 applied = b.apply(f, **kwargs)
353 else:
--> 354 applied = getattr(b, f)(**kwargs)
355 result_blocks = extend_blocks(applied, result_blocks)
357 out = type(self).from_blocks(result_blocks, self.axes)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\internals\blocks.py:1758, in EABackedBlock.setitem(self, indexer, value, using_cow)
1755 check_setitem_lengths(indexer, value, values)
1757 try:
-> 1758 values[indexer] = value
1759 except (ValueError, TypeError) as err:
1760 _catch_deprecated_value_error(err)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\_mixins.py:253, in NDArrayBackedExtensionArray.__setitem__(self, key, value)
251 def __setitem__(self, key, value) -> None:
252 key = check_array_indexer(self, key)
--> 253 value = self._validate_setitem_value(value)
254 self._ndarray[key] = value
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\categorical.py:1560, in Categorical._validate_setitem_value(self, value)
1557 def _validate_setitem_value(self, value):
1558 if not is_hashable(value):
1559 # wrap scalars and hashable-listlikes in list
-> 1560 return self._validate_listlike(value)
1561 else:
1562 return self._validate_scalar(value)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\arrays\categorical.py:2277, in Categorical._validate_listlike(self, value)
2274 # no assignments of values not in categories, but it's always ok to set
2275 # something to np.nan
2276 if len(to_add) and not isna(to_add).all():
-> 2277 raise TypeError(
2278 "Cannot setitem on a Categorical with a new "
2279 "category, set the categories first"
2280 )
2282 codes = self.categories.get_indexer(value)
2283 return codes.astype(self._ndarray.dtype, copy=False)
TypeError: Cannot setitem on a Categorical with a new category, set the categories first
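For context, the last frames of the traceback can be reproduced with pandas alone. The following is a minimal sketch, assuming (this is not confirmed in the thread) that the affected static column has the `category` dtype: writing integer label codes into the non-missing rows of such a column through `.loc`, similar to what line 703 of the traceback does, fails because the integers are not among the column's existing categories. The column name and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a static column; the category dtype is an assumption.
df = pd.DataFrame({"col": pd.Categorical(["a", "b", None, "a"])})

# Rows that currently hold a value (everything except the missing entry).
mask = df["col"].notnull()

# Integer labels, similar in spirit to what a label encoder would produce.
codes = np.array([0, 1, 0])

try:
    # The integers 0 and 1 are not among the categories "a" and "b".
    df.loc[mask, "col"] = codes
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

If this snippet fails in your environment with the same message as above, the problem is probably related to the column dtype rather than to the specific imputer you selected.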
Please help me understand how to solve this. I appreciate your hard work on developing such a great tool. I hope I can use it in my work.