conda-forge / dagster-feedstock

A conda-smithy repository for dagster.
BSD 3-Clause "New" or "Revised" License

PyPI and conda-forge versions for many dagster packages do not match #301

Closed zaneselvans closed 1 year ago

zaneselvans commented 1 year ago

Solution to issue cannot be found in the documentation.

Issue

As I mentioned in this discussion, it appears that the immature library packages (currently v0.21.6) are getting assigned the mature package version (currently v1.5.6) somehow in the conda-forge packaging, which leads to a big version discrepancy between the PyPI and conda-forge versions for the same package, e.g. dagster-postgres. Full text of the above discussion:

We are using conda-lock to create a reproducible environment that we run Dagster in. It reads dependencies and version constraints from pyproject.toml, solves the environment using mamba, and produces a complete environment specification in a lockfile. We're in the process of switching to using Postgres instead of SQLite for the EventLogs, since SQLite has been locking up on us due to concurrency. However, when trying to add the new dependency I discovered that the PyPI and conda-forge versioning for the dagster-postgres package seem to be completely unrelated to each other.

Is it supposed to be this way? Typically conda packages track the same versions as their PyPI counterparts, and dagster-postgres seems to be an official package. It looks like the 0.21.6 version is probably the "libs" version, while 1.5.5 is the overall Dagster version.

Having these two versioning schemes be out of sync makes it difficult to specify the environment in a straightforward way, because the tools parsing pyproject.toml could ultimately seek the package either from PyPI or conda-forge.

I notice that this seems to be a common, but not universal, situation, across many of the dagster-* packages. Searching conda-forge packages it looks like everything has a most recent version of 1.5.5:

mamba search 'dagster-*==1.5.5'

Loading channels: done
# Name                       Version           Build  Channel
dagster-airbyte                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-airflow                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-aws                    1.5.5    pyhd8ed1ab_0  conda-forge
dagster-celery                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-celery-docker          1.5.5    pyhd8ed1ab_0  conda-forge
dagster-celery-k8s             1.5.5    pyhd8ed1ab_0  conda-forge
dagster-census                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-dask                   1.5.5    pyhd8ed1ab_0  conda-forge
dagster-datadog                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-dbt                    1.5.5    pyhd8ed1ab_0  conda-forge
dagster-docker                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-duckdb                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-duckdb-pandas          1.5.5    pyhd8ed1ab_0  conda-forge
dagster-duckdb-polars          1.5.5    pyhd8ed1ab_0  conda-forge
dagster-duckdb-pyspark         1.5.5    pyhd8ed1ab_0  conda-forge
dagster-fivetran               1.5.5    pyhd8ed1ab_0  conda-forge
dagster-gcp                    1.5.5    pyhd8ed1ab_0  conda-forge
dagster-gcp-pandas             1.5.5    pyhd8ed1ab_0  conda-forge
dagster-gcp-pyspark            1.5.5    pyhd8ed1ab_0  conda-forge
dagster-ge                     1.5.5    pyhd8ed1ab_0  conda-forge
dagster-github                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-graphql                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-k8s                    1.5.5    pyhd8ed1ab_0  conda-forge
dagster-managed-elements       1.5.5    pyhd8ed1ab_0  conda-forge
dagster-mlflow                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-msteams                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-mysql                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-pagerduty              1.5.5    pyhd8ed1ab_0  conda-forge
dagster-pandas                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-pandera                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-papertrail             1.5.5    pyhd8ed1ab_0  conda-forge
dagster-pipes                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-postgres               1.5.5    pyhd8ed1ab_0  conda-forge
dagster-prometheus             1.5.5    pyhd8ed1ab_0  conda-forge
dagster-pyspark                1.5.5    pyhd8ed1ab_0  conda-forge
dagster-shell                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-slack                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-snowflake              1.5.5    pyhd8ed1ab_0  conda-forge
dagster-snowflake-pandas       1.5.5    pyhd8ed1ab_0  conda-forge
dagster-snowflake-pyspark      1.5.5    pyhd8ed1ab_0  conda-forge
dagster-spark                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-ssh                    1.5.5    pyhd8ed1ab_0  conda-forge
dagster-twilio                 1.5.5    pyhd8ed1ab_0  conda-forge
dagster-wandb                  1.5.5    pyhd8ed1ab_0  conda-forge
dagster-webserver              1.5.5    pyhd8ed1ab_0  conda-forge

Meanwhile, on PyPI many (but not all) of the same dagster-* packages have the libs version 0.21.6, while others carry the core version 1.5.6:

dagster-airbyte: 0.21.6
dagster-dbt: 0.21.6
dagster-mysql: 0.21.6
dagster-papertrail: 0.21.6
dagster-spark: 0.21.6

dagster-airflow: 1.5.6
dagster-graphql: 1.5.6
dagster-pipes: 1.5.6
dagster-webserver: 1.5.6
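
To make the comparison above repeatable, here is a minimal sketch in Python. It assumes the versions have already been collected by hand into two dictionaries (the values below are copied from the listings in this issue), and the helper names `release_line` and `find_mismatches` are hypothetical. It compares only the major.minor "release line", so a patch-level lag such as 1.5.5 vs. 1.5.6 is not reported as a mismatch.

```python
def release_line(version: str) -> tuple[int, int]:
    """Return (major, minor) for a dotted version string such as '1.5.6'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def find_mismatches(conda_versions: dict, pypi_versions: dict) -> dict:
    """Map package name -> (conda version, PyPI version) where release lines differ."""
    mismatches = {}
    # Only compare packages that appear in both listings.
    for name in sorted(conda_versions.keys() & pypi_versions.keys()):
        conda_v, pypi_v = conda_versions[name], pypi_versions[name]
        if release_line(conda_v) != release_line(pypi_v):
            mismatches[name] = (conda_v, pypi_v)
    return mismatches


# Versions copied by hand from the conda-forge and PyPI listings in this issue.
CONDA_FORGE = {
    "dagster-airbyte": "1.5.5",
    "dagster-dbt": "1.5.5",
    "dagster-mysql": "1.5.5",
    "dagster-papertrail": "1.5.5",
    "dagster-spark": "1.5.5",
    "dagster-airflow": "1.5.5",
    "dagster-graphql": "1.5.5",
    "dagster-pipes": "1.5.5",
    "dagster-webserver": "1.5.5",
}
PYPI = {
    "dagster-airbyte": "0.21.6",
    "dagster-dbt": "0.21.6",
    "dagster-mysql": "0.21.6",
    "dagster-papertrail": "0.21.6",
    "dagster-spark": "0.21.6",
    "dagster-airflow": "1.5.6",
    "dagster-graphql": "1.5.6",
    "dagster-pipes": "1.5.6",
    "dagster-webserver": "1.5.6",
}

for name, (conda_v, pypi_v) in find_mismatches(CONDA_FORGE, PYPI).items():
    print(f"{name}: conda-forge {conda_v} vs. PyPI {pypi_v}")
```

With the data above, only the five library packages are flagged; the four packages that follow the core version on PyPI are not.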

Looking at the conda-forge/dagster-feedstock which produces these packages, it seems like the intention is for the immature library packages to have the immature library version lib_version (currently 0.21.6), but for some reason that's not working:

  - name: dagster-postgres
    build:
      noarch: python
      version: {{ lib_version }}
      script: cd dagster-postgres && {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation

    requirements:
      host:
        - pip
        - python >=3.8
      run:
        - {{ pin_subpackage("dagster", max_pin="x.x.x") }}
        - psycopg2-binary
        - python >=3.8

    test:
      imports:
        - dagster_postgres
      commands:
        - pip check
      requires:
        - python >=3.8
        - pip

    about:
      home: https://github.com/dagster-io/dagster/tree/master/python_modules/libraries/dagster-postgres
      license: Apache-2.0
      license_family: APACHE
      license_file: dagster-postgres/LICENSE
      summary: A Dagster integration for postgres

That also seems to be the intention in the documentation about library versioning.

Looking at the conda build docs on multiple outputs, it does seem like the version key is supposed to allow each of the subpackages to have its own version, but for some reason that's clearly not happening.
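
One thing that might be worth checking, offered as a guess rather than a confirmed diagnosis: in the conda build docs' multiple-outputs examples, each output's version key sits at the same level as name, whereas in the recipe excerpt above it is nested under build. The sketch below is a minimal Python check for that pattern. It assumes the recipe outputs have already been rendered (Jinja resolved) and parsed into plain dictionaries. The helper name `misplaced_version_keys`, the sample data, and the package `dagster-example-lib` are all hypothetical.

```python
def misplaced_version_keys(outputs: list) -> list:
    """Return names of outputs whose 'version' is nested under 'build'
    rather than sitting next to 'name' at the top level of the output."""
    flagged = []
    for output in outputs:
        build = output.get("build") or {}
        if "version" in build and "version" not in output:
            flagged.append(output["name"])
    return flagged


# Hypothetical, already-rendered outputs. The first mirrors the layout of the
# excerpt above; the second (an invented package name) uses a top-level key.
outputs = [
    {"name": "dagster-postgres",
     "build": {"noarch": "python", "version": "0.21.6"}},
    {"name": "dagster-example-lib",
     "version": "0.21.6",
     "build": {"noarch": "python"}},
]

print(misplaced_version_keys(outputs))
```

Whether this nesting is actually what causes the subpackages to inherit the top-level version is an assumption that would need to be confirmed against a real build.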

Installed packages

# packages in environment at /Users/zane/miniforge3:
#
# Name                    Version                   Build  Channel
annotated-types           0.6.0              pyhd8ed1ab_0    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
boltons                   23.0.0             pyhd8ed1ab_0    conda-forge
brotli-python             1.1.0           py310h1253130_0    conda-forge
bzip2                     1.0.8                h3422bc3_4    conda-forge
c-ares                    1.19.1               hb547adb_0    conda-forge
ca-certificates           2023.7.22            hf0a4a13_0    conda-forge
cachecontrol              0.13.1             pyhd8ed1ab_0    conda-forge
cachecontrol-with-filecache 0.13.1             pyhd8ed1ab_0    conda-forge
cachy                     0.3.0              pyhd8ed1ab_1    conda-forge
certifi                   2023.7.22          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310h2399d43_3    conda-forge
charset-normalizer        3.2.0              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
click-default-group       1.2.4              pyhd8ed1ab_0    conda-forge
clikit                    0.6.2              pyhd8ed1ab_2    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
conda                     23.9.0          py310hbe9552e_0    conda-forge
conda-libmamba-solver     23.9.2             pyhd8ed1ab_0    conda-forge
conda-lock                2.4.2              pyhd8ed1ab_0    conda-forge
conda-package-handling    2.2.0              pyh38be061_0    conda-forge
conda-package-streaming   0.9.0              pyhd8ed1ab_0    conda-forge
crashtest                 0.4.1              pyhd8ed1ab_0    conda-forge
cryptography              41.0.3          py310hdd3b5e7_0    conda-forge
distlib                   0.3.7              pyhd8ed1ab_0    conda-forge
ensureconda               1.4.3              pyhd8ed1ab_0    conda-forge
filelock                  3.13.1             pyhd8ed1ab_0    conda-forge
fmt                       10.1.1               h1995070_0    conda-forge
gitdb                     4.0.11             pyhd8ed1ab_0    conda-forge
gitpython                 3.1.40             pyhd8ed1ab_0    conda-forge
html5lib                  1.1                pyh9f0ad1d_0    conda-forge
icu                       73.2                 hc8870d7_0    conda-forge
idna                      3.4                pyhd8ed1ab_0    conda-forge
importlib-metadata        6.8.0              pyha770c72_0    conda-forge
importlib_metadata        6.8.0                hd8ed1ab_0    conda-forge
jaraco.classes            3.3.0              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jsonpatch                 1.32               pyhd8ed1ab_0    conda-forge
jsonpointer               2.0                        py_0    conda-forge
keyring                   24.2.0          py310hbe9552e_1    conda-forge
krb5                      1.21.2               h92f50d5_0    conda-forge
libarchive                3.7.2                h82b9b87_0    conda-forge
libcurl                   8.4.0                h2d989ff_0    conda-forge
libcxx                    16.0.6               h4653b0c_0    conda-forge
libedit                   3.1.20191231         hc8eb9b7_2    conda-forge
libev                     4.33                 h642e427_1    conda-forge
libffi                    3.4.2                h3422bc3_5    conda-forge
libiconv                  1.17                 he4db4b2_0    conda-forge
libmamba                  1.5.3                h0a6dc31_1    conda-forge
libmambapy                1.5.3           py310h3812fd7_1    conda-forge
libnghttp2                1.52.0               hae82a92_0    conda-forge
libsolv                   0.7.24               ha614eb4_3    conda-forge
libsqlite                 3.43.0               hb31c410_0    conda-forge
libssh2                   1.11.0               h7a5bd25_0    conda-forge
libxml2                   2.11.5               h25269f3_1    conda-forge
libzlib                   1.2.13               h53f4e23_5    conda-forge
lz4-c                     1.9.4                hb7217d7_0    conda-forge
lzo                       2.10              h642e427_1000    conda-forge
mamba                     1.5.3           py310ha5d4528_1    conda-forge
markupsafe                2.1.3           py310h2aa6e3c_1    conda-forge
more-itertools            10.1.0             pyhd8ed1ab_0    conda-forge
msgpack-python            1.0.6           py310h38f39d4_0    conda-forge
ncurses                   6.4                  h7ea286d_0    conda-forge
openssl                   3.1.4                h0d3ecfb_0    conda-forge
packaging                 23.1               pyhd8ed1ab_0    conda-forge
pastel                    0.2.1              pyhd8ed1ab_0    conda-forge
pip                       23.2.1             pyhd8ed1ab_0    conda-forge
pkginfo                   1.9.6              pyhd8ed1ab_0    conda-forge
platformdirs              3.11.0             pyhd8ed1ab_0    conda-forge
pluggy                    1.3.0              pyhd8ed1ab_0    conda-forge
pybind11-abi              4                    hd8ed1ab_3    conda-forge
pycosat                   0.6.4           py310h8e9501a_1    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pydantic                  2.4.2              pyhd8ed1ab_1    conda-forge
pydantic-core             2.10.1          py310had9acf8_0    conda-forge
pylev                     1.4.0              pyhd8ed1ab_0    conda-forge
pyopenssl                 23.2.0             pyhd8ed1ab_1    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.12         h01493a6_0_cpython    conda-forge
python_abi                3.10                    3_cp310    conda-forge
pyyaml                    6.0.1           py310h2aa6e3c_1    conda-forge
readline                  8.2                  h92ec313_1    conda-forge
reproc                    14.2.4               h1a8c8d9_0    conda-forge
reproc-cpp                14.2.4               hb7217d7_0    conda-forge
requests                  2.31.0             pyhd8ed1ab_0    conda-forge
ruamel.yaml               0.17.32         py310h2aa6e3c_0    conda-forge
ruamel.yaml.clib          0.2.7           py310h8e9501a_1    conda-forge
setuptools                68.1.2             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smmap                     5.0.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.12               he1e0b03_0    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tomlkit                   0.12.2             pyha770c72_0    conda-forge
toolz                     0.12.0             pyhd8ed1ab_0    conda-forge
tqdm                      4.66.1             pyhd8ed1ab_0    conda-forge
truststore                0.8.0              pyhd8ed1ab_0    conda-forge
typing-extensions         4.8.0                hd8ed1ab_0    conda-forge
typing_extensions         4.8.0              pyha770c72_0    conda-forge
tzdata                    2023c                h71feb2d_0    conda-forge
urllib3                   1.26.18            pyhd8ed1ab_0    conda-forge
virtualenv                20.24.6            pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
wheel                     0.41.2             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h57fd34a_0    conda-forge
yaml                      0.2.5                h3422bc3_2    conda-forge
yaml-cpp                  0.8.0                h13dd4ca_0    conda-forge
zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
zstandard                 0.19.0          py310h8e9501a_0    conda-forge
zstd                      1.5.5                h4f39d0f_0    conda-forge

Environment info

     active environment : base
    active env location : /Users/zane/miniforge3
            shell level : 1
       user config file : /Users/zane/.condarc
 populated config files : /Users/zane/miniforge3/.condarc
                          /Users/zane/.condarc
          conda version : 23.9.0
    conda-build version : not installed
         python version : 3.10.12.final.0
       virtual packages : __archspec=1=arm64
                          __osx=13.6=0
                          __unix=0=0
       base environment : /Users/zane/miniforge3  (writable)
      conda av data dir : /Users/zane/miniforge3/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/osx-arm64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/osx-arm64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/osx-arm64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /Users/zane/miniforge3/pkgs
                          /Users/zane/.conda/pkgs
       envs directories : /Users/zane/miniforge3/envs
                          /Users/zane/.conda/envs
               platform : osx-arm64
             user-agent : conda/23.9.0 requests/2.31.0 CPython/3.10.12 Darwin/22.6.0 OSX/13.6
                UID:GID : 1000:20
             netrc file : None
           offline mode : False

zaneselvans commented 1 year ago

@bollwyvl I'm not familiar with the mechanics of building multiple subpackages in a single recipe, but it seems like maybe something has gone a bit funny here with the mature vs. immature library version numbers that are associated with the conda packages.

bollwyvl commented 1 year ago

Yep, somewhere along the line (ca. 0.15.8), the multi-outputs scheme got too clever for itself and started shipping incorrect, but internally consistent, versions, even though it was able to continuously pull the correct upstream version tarballs, and users who pinned dagster 1.y.z alongside dagster-* ?.y.z were still getting functional solves.

Unless this is some upstream conda-forge bug, probably the only way to fully confidently fix this, and still get some level of automation (critical at this scale, with interrelated pins, etc), is to split this feedstock into two, still multi-outputs feedstocks:

But: as the previous million downloaders haven't complained until now, we're not going to go back and mark all of those broken, and will have to deal with actual conflicts later down the line, e.g. when 1.0.2 rolls around. Hopefully, that will be never and all the versions will at least be monotonically increasing, such that there is never another new upstream dagster-*-1.0.2.tar.gz.

bollwyvl commented 1 year ago

Kicking tires here: https://github.com/conda-forge/staged-recipes/pull/24424

bollwyvl commented 1 year ago

All the linked PRs have been merged, closing.

If something new is broken/inconsistent, let's start on a new issue.

It's... unpredictable what will happen while (1.5|0.21).6 can both solve; we may have to see how x.y.7 works out.

zaneselvans commented 1 year ago

Thank you so much for fixing all this so quickly! I'll keep an eye on our lockfiles and see how it goes.