I use the newly generated synthetic data for the fully joined table to patch an original table; the resulting dataframe is patch_v1. For simplicity, assume everything is patched, so patch_v1 is equal to the synthetic data.
Ideally, I want to keep several columns frozen while predicting values for several other columns. To keep things simple, though, say I want to predict a single column, so I pass it via the target_col argument of model.predict(). That call fails with the following traceback:
File C:\A3\envs\k2view\lib\site-packages\realtabformer\data_utils.py:538, in process_data(df, numeric_max_len, numeric_precision, numeric_nparts, first_col_type, col_transform_data, target_col)
523 processed_df = pd.concat(
524 [
525 processed_df,
(...)
533 axis=1,
534 )
536 # Get the different sets of column types
537 cat_cols = processed_df.columns[
--> 538 processed_df.columns.str.contains(ColDataType.CATEGORICAL)
539 ]
540 numeric_cols = processed_df.columns[
541 ~processed_df.columns.str.contains(ColDataType.CATEGORICAL)
542 ]
544 if first_col_type == ColDataType.CATEGORICAL:
File C:\A3\envs\k2view\lib\site-packages\pandas\core\accessor.py:224, in CachedAccessor.__get__(self, obj, cls)
221 if obj is None:
222 # we're accessing the attribute of the class, i.e., Dataset.geo
223 return self._accessor
--> 224 accessor_obj = self._accessor(obj)
225 # Replace the property with the accessor object. Inspired by:
226 # https://www.pydanny.com/cached-property.html
227 # We need to use object.__setattr__ because we overwrite __setattr__ on
228 # NDFrame
229 object.__setattr__(obj, self._name, accessor_obj)
File C:\A3\envs\k2view\lib\site-packages\pandas\core\strings\accessor.py:245, in StringMethods._validate(data)
242 inferred_dtype = lib.infer_dtype(values, skipna=True)
244 if inferred_dtype not in allowed_types:
--> 245 raise AttributeError("Can only use .str accessor with string values!")
246 return inferred_dtype
AttributeError: Can only use .str accessor with string values!
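Reading the traceback, the exception is raised when .str is accessed on processed_df.columns, and pandas only allows that accessor when the labels are inferred as strings (or a few closely related types). The sketch below is plain pandas, not REaLTabFormer code, and it is not a confirmed diagnosis of this issue. It only shows, on a hypothetical frame whose column labels are integers, how the same exception arises and how to inspect the inferred label type.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical frame (not from the issue): no column names are given,
# so pandas assigns the integer labels 0 and 1.
df = pd.DataFrame([[31, "a"], [45, "b"]])

# Inspect what pandas infers for the column labels themselves.
print("inferred label type:", infer_dtype(df.columns))

try:
    df.columns.str.contains("0")
    raised = False
except AttributeError as exc:
    # Same exception class as in the traceback above.
    raised = True
    print("AttributeError:", exc)

# Casting the labels to strings makes the .str accessor usable.
df.columns = df.columns.astype(str)
mask = df.columns.str.contains("0")
print(list(df.columns[mask]))
```

If a similar check on the frame passed to model.predict() shows non-string labels, that would be one candidate cause worth ruling out; whether it applies here is an assumption.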
Given a trained model and a dataframe with the same schema as the training data, what is the correct way to predict one column, or several columns simultaneously?
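Since the question assumes the new dataframe has the same schema as the training data, it may be worth verifying that assumption before calling predict. The following is a minimal sketch in plain pandas; schema_diff, train_df, and the sample values are hypothetical names and data introduced only for illustration, and nothing here is part of the REaLTabFormer API.

```python
import pandas as pd


def schema_diff(train_df: pd.DataFrame, new_df: pd.DataFrame) -> dict:
    """Report column-name and dtype differences between two frames."""
    train_cols, new_cols = list(train_df.columns), list(new_df.columns)
    shared = [c for c in train_cols if c in new_cols]
    return {
        "missing": [c for c in train_cols if c not in new_cols],
        "extra": [c for c in new_cols if c not in train_cols],
        "dtype_mismatch": {
            c: (str(train_df[c].dtype), str(new_df[c].dtype))
            for c in shared
            if train_df[c].dtype != new_df[c].dtype
        },
        "same_order": train_cols == new_cols,
    }


# Hypothetical data: same column names, but "amount" is float in one
# frame and integer in the other.
train_df = pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Rome"], "amount": [9.5, 3.0]})
patch_v1 = pd.DataFrame({"id": [3, 4], "city": ["Lima", "Kyiv"], "amount": [7, 1]})

report = schema_diff(train_df, patch_v1)
print(report)
```

An empty "missing", "extra", and "dtype_mismatch" together with "same_order" being True would support the identical-schema premise; any other result points at a difference to resolve first.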
I'm using Jupyter Notebook 6.5.7 with REaLTabFormer 0.1.7, pandas 2.2.2, and numpy 1.26.3 on Windows 11 23H2.
Background: the table is the result of fully joining a SQL schema. I train a tabular model on it and generate some synthetic data. The AttributeError above is raised from process_data in realtabformer.data_utils.