Cauchemare / Light_FAMD

Light version Factor analysis for Mixed Data
BSD 2-Clause "Simplified" License

TypeError: SparseDataFrame() takes no arguments #4

Status: Closed (tylernwatson closed this issue 2 years ago)

tylernwatson commented 4 years ago

I tried to use FAMD with my own mixed type data set, and I received the above error. To verify I wasn't doing something wrong, I tried again with the code provided in the documentation.

import numpy as np
import pandas as pd
from light_famd import FAMD

X_n = pd.DataFrame(data=np.random.randint(0, 100, size=(10, 2)), columns=list('AB'))
X_c = pd.DataFrame(np.random.choice(list('abcde'), size=(10, 4), replace=True), columns=list('CDEF'))
test = pd.concat([X_n, X_c], axis=1)

famd = FAMD(n_components=2)
famd.fit(test)

This is giving me the following error:

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/one_hot.py:25: FutureWarning:

The SparseDataFrame class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-84-fb02d3ed7fbe> in <module>
      1 famd = FAMD(n_components=2)
----> 2 famd.fit(test)

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/famd.py in fit(self, X, y)
     43                              "you only have numerical data; you should consider using PCA")
     44 
---> 45         return super().fit(X)
     46 
     47     def fit_transform(self,X,y=None):

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/mfa.py in fit(self, X, y)
     66                     engine=self.engine
     67                 )
---> 68             self.partial_factor_analysis_[name] = fa.fit(X.loc[:, cols])
     69 
     70         # Fit the global PCA

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/mca.py in fit(self, X, y)
     20         self.one_hot_ = one_hot.OneHotEncoder().fit(X)
     21 
---> 22         _X_t=  self.one_hot_.transform(X)
     23 
     24         _0_freq_serie= (_X_t == 0).sum(axis=0)/ len(_X_t)

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/one_hot.py in transform(self, X)
     27             columns=self.column_names_,
     28             index=X.index if isinstance(X, pd.DataFrame) else None,
---> 29             default_fill_value=0
     30         )

TypeError: SparseDataFrame() takes no arguments

I am currently using Pandas 1.1.1. When I try the above code with Pandas 0.25.3, I instead get this error:

MCA PROCESS ELIMINATED 0  COLUMNS SINCE THEIR MISS_RATES >= 99%
/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/one_hot.py:29: FutureWarning:

SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/sparse/frame.py:257: FutureWarning:

SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/frame.py:3471: FutureWarning:

SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/ops/__init__.py:1641: FutureWarning:

SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/sparse/frame.py:339: FutureWarning:

SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/generic.py:6289: FutureWarning:

SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/generic.py:5884: FutureWarning:

SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/pandas/core/sparse/frame.py:785: FutureWarning:

SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-84-fb02d3ed7fbe> in <module>
      1 famd = FAMD(n_components=2)
----> 2 famd.fit(test)

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/famd.py in fit(self, X, y)
     43                              "you only have numerical data; you should consider using PCA")
     44 
---> 45         return super().fit(X)
     46 
     47     def fit_transform(self,X,y=None):

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/mfa.py in fit(self, X, y)
     66                     engine=self.engine
     67                 )
---> 68             self.partial_factor_analysis_[name] = fa.fit(X.loc[:, cols])
     69 
     70         # Fit the global PCA

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/mca.py in fit(self, X, y)
     30         self.total_inertia_ = (n_new_columns - n_initial_columns) / n_initial_columns
     31         # Apply CA to the indicator matrix
---> 32         super().fit(_X_t.loc[:,self._usecols])
     33 
     34         return self

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/ca.py in fit(self, X, y)
     25         # Check input
     26         if self.check_input:
---> 27             utils.check_array(X)
     28 
     29         # Check all values are positive

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     71                           FutureWarning)
     72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73         return f(**kwargs)
     74     return inner_f
     75 

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    577                                       dtype=dtype, copy=copy,
    578                                       force_all_finite=force_all_finite,
--> 579                                       accept_large_sparse=accept_large_sparse)
    580     else:
    581         # If np.array(..) gives ComplexWarning, then we convert the warning

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/sklearn/utils/validation.py in _ensure_sparse_format(spmatrix, accept_sparse, dtype, copy, force_all_finite, accept_large_sparse)
    352 
    353     if accept_sparse is False:
--> 354         raise TypeError('A sparse matrix was passed, but dense '
    355                         'data is required. Use X.toarray() to '
    356                         'convert to a dense numpy array.')

TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.

Adding test = test.to_numpy() before fitting the famd instance produces a different error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-86-fb02d3ed7fbe> in <module>
      1 famd = FAMD(n_components=2)
----> 2 famd.fit(test)

/anaconda3/envs/store_embeddings/lib/python3.7/site-packages/light_famd/famd.py in fit(self, X, y)
     35             self.groups['Numerical'] = num_cols
     36         else:
---> 37             raise ValueError("FAMD works with categorical and numerical data but " +
     38                              "you only have categorical data; you should consider using MCA")
     39         if len(cat_cols):

ValueError: FAMD works with categorical and numerical data but you only have categorical data; you should consider using MCA

I'd appreciate any guidance you can provide on how to resolve this issue.
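
A possible explanation for that last ValueError, sketched under the assumption that light_famd separates numerical from categorical columns by dtype (the thread does not confirm this): calling to_numpy() on a mixed-type frame yields a single object-dtype array, so once it is wrapped back into a DataFrame no column counts as numeric. The names as_array and rewrapped below are illustrative only.

```python
import numpy as np
import pandas as pd

# Same shape of test data as in the report above.
X_n = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)), columns=list("AB"))
X_c = pd.DataFrame(np.random.choice(list("abcde"), size=(10, 4)), columns=list("CDEF"))
test = pd.concat([X_n, X_c], axis=1)

# A frame mixing integers and strings collapses to one object-dtype array.
as_array = test.to_numpy()
print(as_array.dtype)  # object

# Wrapped back into a DataFrame, the former integer columns are no longer numeric.
rewrapped = pd.DataFrame(as_array, columns=test.columns)
n_numeric_before = test.select_dtypes(include=np.number).shape[1]
n_numeric_after = rewrapped.select_dtypes(include=np.number).shape[1]
```

If that assumption holds, passing the original DataFrame (not the array) keeps the numeric dtypes intact, which is why to_numpy() does not help here.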

logisticregress commented 4 years ago

I'm having the same issue. It would be great if there were a solution to this; I'm encountering the same issue with Prince.

dreamiter commented 3 years ago

In newer versions of Pandas, SparseDataFrame() is no longer available, which is documented here. Can someone address this, please? Thanks.
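
For reference, a minimal sketch of the replacement pattern the pandas deprecation warnings point to: a regular DataFrame whose columns are sparse, built with pd.DataFrame.sparse.from_spmatrix. This is not the actual light_famd fix; the indicator matrix and column_names below are hypothetical stand-ins for what one_hot.py builds.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical stand-in for a one-hot indicator matrix.
indicator = sparse.csr_matrix(np.array([[1, 0, 0],
                                        [0, 1, 0],
                                        [0, 0, 1],
                                        [1, 0, 0]]))
column_names = ["C_a", "C_b", "C_c"]  # illustrative names only

# Regular DataFrame with sparse columns, replacing pd.SparseDataFrame(...).
X_sparse = pd.DataFrame.sparse.from_spmatrix(indicator, columns=column_names)

# Densify before handing the data to validators that reject sparse input,
# such as the sklearn check_array call in the second traceback.
X_dense = X_sparse.sparse.to_dense()
```

The .sparse accessor requires pandas 0.25 or later. Densifying trades memory for compatibility, so it may not suit very wide indicator matrices.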

Cauchemare commented 2 years ago

We have fixed this problem; please try again after updating your light_famd package.