BinPro / CONCOCT

Clustering cONtigs with COverage and ComposiTion

Error with scikit-learn 1.2 : TypeError: Feature names #321

Open nickp60 opened 1 year ago

nickp60 commented 1 year ago

Not sure if this has been noted before, but I tried to install CONCOCT and it pulled in scikit-learn 1.2. When running it, I got the following error:

Traceback (most recent call last):
  File "/opt/conda/bin/concoct", line 92, in <module>
    results = main(args)
  File "/opt/conda/bin/concoct", line 39, in main
    transform_filter, pca = perform_pca(
  File "/opt/conda/lib/python3.10/site-packages/concoct/transform.py", line 5, in perform_pca
    pca_object = PCA(n_components=nc, random_state=seed).fit(d)
  File "/opt/conda/lib/python3.10/site-packages/sklearn/decomposition/_pca.py", line 435, in fit
    self._fit(X)
  File "/opt/conda/lib/python3.10/site-packages/sklearn/decomposition/_pca.py", line 485, in _fit
    X = self._validate_data(
  File "/opt/conda/lib/python3.10/site-packages/sklearn/base.py", line 518, in _validate_data
    self._check_feature_names(X, reset=reset)
  File "/opt/conda/lib/python3.10/site-packages/sklearn/base.py", line 385, in _check_feature_names
    feature_names_in = _get_feature_names(X)
  File "/opt/conda/lib/python3.10/site-packages/sklearn/utils/validation.py", line 1893, in _get_feature_names
    raise TypeError(
TypeError: Feature names are only supported if all input features have string names, but your input has ['int', 'str'] as feature name / column name types. If you want feature names to be stored and validated, you must convert them all to strings, by using X.columns = X.columns.astype(str) for example. Otherwise you can remove feature / column names from your input data, or convert them all to a non-string data type.

This appears to have been a warning in earlier releases that became an error in version 1.2.

Replacing scikit-learn 1.2 with scikit-learn 1.1 resolved the issue for me.
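
For readers who want to see what the error message is complaining about, here is a minimal sketch in plain Python. It only imitates the idea of the check (collect the types of the column labels and refuse a mix that includes strings). It is not scikit-learn's actual code, and the helper names and the example labels are hypothetical. For a pandas DataFrame, the equivalent workaround is the one quoted in the error message itself, `X.columns = X.columns.astype(str)`.

```python
def feature_name_types(columns):
    """Return the sorted type names of the column labels."""
    return sorted({type(c).__qualname__ for c in columns})


def check_feature_names(columns):
    """Simplified imitation of the check that fails in the traceback above."""
    types = feature_name_types(columns)
    if len(types) > 1 and "str" in types:
        raise TypeError(f"mixed feature name types: {types}")
    return list(columns)


# Hypothetical column labels: integer indices mixed with one string label.
mixed = [0, 1, 2, "length"]
print(feature_name_types(mixed))        # ['int', 'str']

# The workaround suggested by the error message, applied to a plain list:
# convert every label to a string so only one type remains.
as_strings = [str(c) for c in mixed]
print(check_feature_names(as_strings))  # ['0', '1', '2', 'length']
```

Converting all labels to strings (or dropping the labels entirely) leaves a single label type, which is what the message asks for.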

tanaes commented 1 year ago

Thanks for this, I just encountered the problem as well.

tamburinif commented 1 year ago

I'm running into this issue as well. I tried downgrading scikit-learn to 1.1 but this gives a warning:

OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.

nickp60 commented 1 year ago

I haven't tested it fully, but I think I resolved this. The key, I think, is using conda to install a libopenblas build that was compiled with OpenMP. The Dockerfile is here: https://github.com/vdblab/dockerfiles/blob/main/metawrap/Dockerfile

fallinwind commented 1 year ago

same

vinisalazar commented 1 year ago

I tried both of these fixes (downgrading sklearn + using the libopenblas openmp build) and they worked. Perhaps it would be worth pinning scikit-learn as scikit-learn<=1.1,>=0.14.1 in the requirements file?

https://github.com/BinPro/CONCOCT/blob/823dcd670bc42f6ea4622881c2484cda6c253a76/requirements.txt#L11
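
One detail about the proposed pin may be worth checking: under pip's version ordering, `<=1.1` admits 1.1 (that is, 1.1.0) but not later patch releases such as 1.1.3, whereas `<1.2` admits the whole 1.1 series. The sketch below illustrates the difference with a simplified comparison. The `parse` helper and the `<1.2` alternative are assumptions for illustration, not something proposed in this thread, and the helper only handles purely numeric versions (no `rc` or `dev` suffixes).

```python
def parse(version, width=3):
    """Turn '1.1' into (1, 1, 0) so short and long versions compare consistently."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def allowed_by_le_1_1(version):
    """Approximates the proposed pin: >=0.14.1,<=1.1."""
    return parse("0.14.1") <= parse(version) <= parse("1.1")


def allowed_by_lt_1_2(version):
    """Approximates a hypothetical alternative pin: >=0.14.1,<1.2."""
    return parse("0.14.1") <= parse(version) < parse("1.2")


for v in ("1.1.0", "1.1.3", "1.2.0"):
    print(v, allowed_by_le_1_1(v), allowed_by_lt_1_2(v))
# 1.1.0 True True
# 1.1.3 False True
# 1.2.0 False False
```

Either form keeps 1.2 out. They differ only in whether the later 1.1.x patch releases are allowed.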

fallinwind commented 1 year ago

Hi Vini @vinisalazar, may I ask how to use the libopenblas OpenMP build? Looking forward to your reply.

vinisalazar commented 1 year ago

Hi @fallinwind, I'm installing concoct by creating a conda environment from a yaml file. Here's what it looks like:

channels:
  - conda-forge
  - bioconda
dependencies:
  - concoct=1.1.0
  - libopenblas=*=openmp*
  - mkl
  - python>=3
  - samtools>=1.9
  - scikit-learn=1.1.*
variables:
  USE_OPENMP: 1

You can create the environment from such a file with the conda env create command.

Best, V

marsfro commented 1 year ago

I solved it by installing scikit-learn from the intel channel inside concoct_env: conda install -c intel scikit-learn

99qaz commented 11 months ago

I'm running into this issue as well. I tried downgrading scikit-learn to 1.1, but this gives a warning:

OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.

Hello, I have encountered the same issue. Have you managed to resolve this problem?