ContinuumIO / anaconda-issues

Anaconda issue tracking

Gensim-4.3.* incorrect package definition #13384

Open filip-komarzyniec opened 6 months ago

filip-komarzyniec commented 6 months ago

What happened?

According to the changelog in gensim's official GitHub repository, the FuzzyTM dependency has been removed because it was unused.

The Anaconda build of gensim still declares it as a requirement. This causes problems when, for example, you recreate a conda environment that already has gensim installed and then pip install a project that also depends on gensim.

In that situation pip finds gensim already installed but treats it as incomplete, because FuzzyTM and all of its dependencies are missing.
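One way to confirm what the installed package declares is to read its metadata directly. The snippet below is a minimal sketch using only the standard library; the helper names (requirement_name, declares, installed_requirements) are made up for illustration, and the name parsing is deliberately simplistic rather than a full requirement parser.

```python
import re
from importlib import metadata


def requirement_name(req: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    if match is None:
        raise ValueError(f"cannot parse requirement: {req!r}")
    # Normalize so that 'smart_open' and 'smart-open' compare equal.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def declares(requirements, project: str) -> bool:
    """True if any requirement string in the list names the given project."""
    target = re.sub(r"[-_.]+", "-", project).lower()
    return any(requirement_name(r) == target for r in requirements)


def installed_requirements(dist_name: str):
    """Requirement strings declared by an installed distribution ([] if absent)."""
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    reqs = installed_requirements("gensim")
    print("gensim declares FuzzyTM:", declares(reqs, "FuzzyTM"))
```

Run inside the affected environment, this should report whether the installed gensim metadata still lists FuzzyTM, independently of what pip decides to download.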

To reproduce the issue, run the following steps:

1) Create a new environment with gensim:

conda create --name issue_demo 'python=3.11' 'pip' 'gensim=4.3'

2) pip install gensim in that environment:

pip install 'gensim==4.3.*'

3) Notice that it tries installing FuzzyTM:

pip install 'gensim==4.3.*'
Requirement already satisfied: gensim==4.3.* in /root/miniconda/envs/issue_demo/lib/python3.11/site-packages (4.3.0)
Requirement already satisfied: numpy>=1.18.5 in /root/miniconda/envs/issue_demo/lib/python3.11/site-packages (from gensim==4.3.*) (1.26.0)
Requirement already satisfied: scipy>=1.7.0 in /root/miniconda/envs/issue_demo/lib/python3.11/site-packages (from gensim==4.3.*) (1.11.3)
Requirement already satisfied: smart-open>=1.8.1 in /root/miniconda/envs/issue_demo/lib/python3.11/site-packages (from gensim==4.3.*) (5.2.1)
Collecting FuzzyTM>=0.4.0 (from gensim==4.3.*)
  Downloading FuzzyTM-2.0.9-py3-none-any.whl.metadata (7.9 kB)
Collecting pandas (from FuzzyTM>=0.4.0->gensim==4.3.*)
  Downloading pandas-2.2.2.tar.gz (4.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 14.3 MB/s eta 0:00:00
  Installing build dependencies ... error
  error: subprocess-exited-with-error

My command exited with an error because I was working on a ppc64le platform, which was not able to build pandas from source. This should not have happened: gensim installed from the conda channels should be complete, so pip should have nothing left to install.
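The mismatch can also be seen without triggering any download, by comparing a package's declared requirements against what is actually installed. The following is a rough sketch with hypothetical helper names (missing_requirements, is_installed); it ignores version constraints and skips any requirement carrying an environment marker, so it is much cruder than what pip itself does (pip check is the real tool for this).

```python
import re
from importlib import metadata


def is_installed(name: str) -> bool:
    """True if a distribution with this name has metadata in the environment."""
    try:
        metadata.version(name)
        return True
    except metadata.PackageNotFoundError:
        return False


def missing_requirements(requirements, installed=is_installed):
    """Names of unconditional requirements that are not installed.

    Simplification: requirements with a marker (anything after ';')
    are skipped, and version specifiers are not evaluated.
    """
    missing = []
    for req in requirements:
        if ";" in req:
            continue
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match and not installed(match.group(1)):
            missing.append(match.group(1))
    return missing


if __name__ == "__main__":
    try:
        declared = metadata.requires("gensim") or []
    except metadata.PackageNotFoundError:
        declared = []
    print("missing:", missing_requirements(declared))
```

In the environment described above, a check of this kind would be expected to list FuzzyTM, since conda installed gensim without it.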

Expected behavior or outcome

Pip installing a project that depends on gensim in the environment described above should finish successfully.

Conda info

     active environment : issue_demo
    active env location : /root/miniconda/envs/issue_demo
            shell level : 15
       user config file : /root/.condarc
 populated config files : /root/.condarc
          conda version : 23.10.0
    conda-build version : not installed
         python version : 3.11.5.final.0
       virtual packages : __archspec=1=power8le
                          __glibc=2.28=0
                          __linux=4.18.0=0
                          __unix=0=0
       base environment : /root/miniconda  (writable)
      conda av data dir : /root/miniconda/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-ppc64le
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-ppc64le
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /root/miniconda/pkgs
                          /root/.conda/pkgs
       envs directories : /root/miniconda/envs
                          /root/.conda/envs
               platform : linux-ppc64le
             user-agent : conda/23.10.0 requests/2.31.0 CPython/3.11.5 Linux/4.18.0-477.27.1.el8_8.ppc64le rhel/8.8 glibc/2.28 solver/libmamba conda-libmamba-solver/23.11.1 libmambapy/1.5.3
                UID:GID : 0:0
             netrc file : None
           offline mode : False

Conda config

==> /root/.condarc <==
changeps1: False
always_yes: True

Conda list

only default channels

Additional information

Although I'm posting outputs from a ppc64le platform, I have reproduced everything on x86 architecture and the results are the same: gensim still requires the FuzzyTM library.