TileDB-Inc / tiledbsoma-feedstock

A conda-smithy repository for tiledbsoma.
BSD 3-Clause "New" or "Revised" License
3 stars 3 forks source link

TileDB-SOMA 1.2.4 #24

Closed johnkerl closed 1 year ago

johnkerl commented 1 year ago

23, but post-tag.

Note we need to figure out the Python 3.10 solver issue already noted on #23.

Needs to be connected to #22.

johnkerl commented 1 year ago

I just merged in @jdblischak 's jdb/tiledbsoma-release-1.2.4 which also contains #22

jdblischak commented 1 year ago

As we had anticipated, the main blocker is the conda solver error for the Python 3.10 build. The builds for both R 4.1 and 4.2 pass, and so do the other Python builds

jdblischak commented 1 year ago

I copy-pasted the environment specs from one of the failed builds and tried to build locally. I can't use libtiledbsoma 1.2.4.* since that is newly built in the CI job and not available for download. Changing this to libtiledbsoma 1.2.*, mamba/conda was able to solve by installing libtiledbsoma 1.2.1. When I specifically force it to install libtiledbsoma 1.2.2.*, it fails with the following informative error message (I still don't know why these detailed error messages aren't output by conda mambabuild; I investigated last time we had this problem in #20).

Here is the error message:

Could not solve for environment specs
The following packages are incompatible
├─ libabseil 20230125.2* cxx17_h59595ed_2 is requested and can be installed;
├─ libtiledbsoma 1.2.2**  is installable and it requires
│  └─ tiledb >=2.15.1,<2.16.0a0  with the potential options
│     ├─ tiledb [2.15.1|2.15.2] would require
│     │  └─ openssl >=1.1.1t,<1.1.2a , which can be installed;
│     └─ tiledb [2.15.1|2.15.2] would require
│        └─ libgoogle-cloud [>=2.10.0,<2.10.1.0a0 |>=2.8.0,<2.8.1.0a0 |>=2.9.1,<2.9.2.0a0 ], which requires
│           └─ libabseil 20230125 cxx17*, which conflicts with any installable versions previously reported;
└─ openssl 3.1.0* hd590300_3 is uninstallable because it conflicts with any installable versions previously reported.

Here is the command to run locally on a linux-64 machine:

mamba env create --file soma-py10.yaml

Here are the contents of soma-py10.yaml:

name: soma-py10
channels:
  - conda-forge
  - tiledb
  - nodefaults
dependencies:
  - _libgcc_mutex 0.1.* conda_forge
  - ca-certificates 2023.5.7.* hbcca054_0
  - ld_impl_linux-64 2.40.* h41732ed_0
  - libgfortran5 12.2.0.* h337968e_19
  - libstdcxx-ng 12.2.0.* h46fd767_19
  - python_abi 3.10.* 3_cp310
  - tzdata 2023c h71feb2d_0
  - libgfortran-ng 12.2.0.* h69a702a_19
  - libgomp 12.2.0.* h65d4601_19
  - _openmp_mutex 4.5.* 2_gnu
  - libgcc-ng 12.2.0.* h65d4601_19
  - aws-c-common 0.8.19.* hd590300_0
  - bzip2 1.0.8.* h7f98852_4
  - c-ares 1.19.1.* hd590300_0
  - gflags 2.2.2.* he1b5a44_1004
  - keyutils 1.6.1.* h166bdaf_0
  - libabseil 20230125.2.* cxx17_h59595ed_2
  - libbrotlicommon 1.0.9.* h166bdaf_8
  - libcrc32c 1.1.2.* h9c3ff4c_0
  - libev 4.33.* h516909a_1
  - libffi 3.4.2.* h7f98852_5
  - libnsl 2.0.0.* h7f98852_0
  - libopenblas 0.3.21.* pthreads_h78a6416_3
  - libutf8proc 2.8.0.* h166bdaf_0
  - libuuid 2.38.1.* h0b41bf4_0
  - libzlib 1.2.13.* h166bdaf_4
  - lz4-c 1.9.4.* hcb278e6_0
  - ncurses 6.3.* h27087fc_1
  - openssl 3.1.0.* hd590300_3
  - re2 2023.03.02.* h8c504da_0
  - snappy 1.1.10.* h9fff704_0
  - xz 5.2.6.* h166bdaf_0
  - aws-c-cal 0.5.26.* hf677bf3_1
  - aws-c-compression 0.2.16.* hbad4bc6_7
  - aws-c-sdkutils 0.1.9.* hbad4bc6_2
  - aws-checksums 0.1.14.* hbad4bc6_7
  - glog 0.6.0.* h6f12383_0
  - libblas 3.9.0.* 16_linux64_openblas
  - libbrotlidec 1.0.9.* h166bdaf_8
  - libbrotlienc 1.0.9.* h166bdaf_8
  - libedit 3.1.20191231.* he28a2e2_2
  - libevent 2.1.12.* h3358134_0
  - libnghttp2 1.52.0.* h61bc06f_0
  - libprotobuf 3.21.12.* h3eb15da_0
  - libsqlite 3.42.0.* h2797004_0
  - libssh2 1.10.0.* hf14f497_3
  - readline 8.2.* h8228510_1
  - s2n 1.3.44.* h06160fa_0
  - tk 8.6.12.* h27826a3_0
  - zlib 1.2.13.* h166bdaf_4
  - zstd 1.5.2.* h3eb15da_6
  - aws-c-io 0.13.21.* h9fef7b8_5
  - krb5 1.20.1.* h81ceb04_0
  - libcblas 3.9.0.* 16_linux64_openblas
  - libgrpc 1.54.2.* hb20ce57_2
  - liblapack 3.9.0.* 16_linux64_openblas
  - libthrift 0.18.1.* h8fd135c_1
  - orc 1.8.3.* hfdbbad2_0
  - python 3.10.11.* he550d4f_0_cpython
  - aws-c-event-stream 0.2.20.* hb4b372c_7
  - aws-c-http 0.7.7.* h2632f9a_4
  - libcurl 8.1.1.* h409715c_0
  - numpy 1.21.6.* py310h45f3432_0
  - packaging 23.1.* pyhd8ed1ab_0
  - pybind11-global 2.10.4.* py310hdf3cbec_0
  - setuptools 67.7.2.* pyhd8ed1ab_0
  - tomli 2.0.1.* pyhd8ed1ab_0
  - typing_extensions 4.6.0.* pyha770c72_0
  - wheel 0.40.0.* pyhd8ed1ab_0
  - aws-c-auth 0.6.27.* he072965_1
  - aws-c-mqtt 0.8.11.* h2282364_1
  - libgoogle-cloud 2.10.1.* hac9eb74_1
  - pip 23.1.2.* pyhd8ed1ab_0
  - pybind11 2.10.4.* py310hdf3cbec_0
  - typing-extensions 4.6.0.* hd8ed1ab_0
  - aws-c-s3 0.3.0.* hcb5a9b2_2
  - setuptools-scm 7.1.0.* pyhd8ed1ab_0
  - aws-crt-cpp 0.20.1.* he0fdcb3_3
  - setuptools_scm 7.1.0.* hd8ed1ab_0
  - aws-sdk-cpp 1.10.57.* hb0b1f3a_12
  - arrow-cpp 9.0.0.* py310h4e03f60_35_cpu
  - parquet-cpp 1.5.1.* 2
  - pyarrow 9.0.0.* py310he6bfd7f_35_cpu
  - libtiledbsoma 1.2.2.*
jdblischak commented 1 year ago

I think the problem is once again libgoogle-cloud. Building tiledb for libgoogle-cloud 2.10.0 (https://github.com/conda-forge/tiledb-feedstock/pull/188) is how we fixed the previous conda solver error (https://github.com/TileDB-Inc/tiledbsoma-feedstock/pull/20#issuecomment-1533398566)

While I was investigating locally, the solver started working by installing the very recently uploaded tiledb 2.15.3

https://anaconda.org/conda-forge/tiledb/files?version=2.15.3 https://github.com/conda-forge/tiledb-feedstock/pull/189

During the rerendering for the 2.15.3 PR, it also bumped libgoogle-cloud from 2.10.0 to 2.10.1. Once the newly uploaded tiledb 2.15.3 is available across all the anaconda.org mirrors (I restarted an Azure job just now, and it still installed 2.15.2), I think the solver will work

Note that we could backport support for libgoogle-cloud 2.10.1 to tiledb 2.15.2 (by rerendering in a branch of tiledb-feedstock), I assume that all users would be fine with upgrading to 2.15.3, and thus this wouldn't be worth the extra effort

johnkerl commented 1 year ago

Note that we could backport support for libgoogle-cloud 2.10.1 to tiledb 2.15.2 (by rerendering in a branch of tiledb-feedstock), I assume that all users would be fine with upgrading to 2.15.3, and thus this wouldn't be worth the extra effort

Yes -- @Shelnutt2 and I are of the view that 2.15.3 is not "for SOMA" so users should be fine with 2.15.2 or 2.15.3, whichever we can best deliver

Shelnutt2 commented 1 year ago

Yes -- @Shelnutt2 and I are of the view that 2.15.3 is not "for SOMA" so users should be fine with 2.15.2 or 2.15.3, whichever we can best deliver

Well for conda we need to make sure this works with 2.15.3, else we will have conflicts for users. My comment was only in our discussion on if we wanted another SOMA release for 1.2.5 for the wheels.

jdblischak commented 1 year ago

I don't understand why the conda solver is such a moving target. Initially it couldn't solve the Python 3.10 env. Once tiledb 2.15.3 was released, it could build for Python 3.10 but it couldn't build the R environments (eg my Attempt #3 builds from a few days ago in the link below). Then today I restarted them (Attempt #4), and now the R envs build fine, but Python 3.8 and 3.9 are now failing.

https://dev.azure.com/jdblischak/feedstock-builds/_build/results?buildId=298&view=logs&j=47084397-e614-5090-05b8-c04ebacb88b9&t=5529f3d0-7b5a-44b6-b1ff-9443f4817692&s=b15d9194-8f26-5328-b47f-5968c76b37e7

jdblischak commented 1 year ago

Given that today the Python client is able to build (as it did in PR #25), one path forward would be to remove the R client output from the recipe for now. That would allow us to merge this PR and build binaries for libtiledbsoma and tiledbsoma-py 1.2.4. And then we could continue troubleshooting the solver issue for the R client. Thoughts?

And given that this is such a moving target, just in case I also restarted the jobs on my fork. Maybe it will be our lucky day and all the envs will solve, but I'm not getting my hopes up

https://dev.azure.com/jdblischak/feedstock-builds/_build/results?buildId=298&view=results

johnkerl commented 1 year ago

@jdblischak I can see reasons to go either way. Ultimately @Shelnutt2 's call to include/exclude R from this round.

jdblischak commented 1 year ago

So on my fork the Python env is still failing to build. I suspect that #25 passed only because no rerendering happened (so the C and C++ compiler versions weren't bumped). Maybe that's a clue. I have some cycles to dedicate to this problem today. I'll investigate and report back

jdblischak commented 1 year ago

Actually nevermind about dropping R from the recipe. As this is a constantly moving problem, today the R client builds fine. The problem is once again solving the env for the Python client.

The only differences in the env to be solved between this PR (failing) and #25 (passed) were 1) c_compiler_version and cxx_compiler_version, and 2) the tiledb pin.

I just pushed a commit to my fork to revert the changes to c_compiler_version and cxx_compiler_version. For both linux-64 and osx-64, the Python client was able to be built with Python 3.10 but then failed to solve the env for Python 3.9 (as expected, the R client built fine). So this doesn't explain why PR #25 passed.

This suggests the problem is the change in the tiledb pin. But looking at the passed build for #25, the Python clients solve the env by installing either tiledb 2.15.2 (3.8, 3.9) or 2.15.3 (3.7, 3.10). This appears to me to be completely consistent with the updated pin of >=2.15.2,<2.16, so that doesn't really explain it either.

Ah, I figured out why #25 succeeded. During the build phase, both Python 3.8 and Python 3.9 installed the already uploaded tiledb 1.2.2-h25619bb_0 from the tiledb channel because that satisfies the host dependency - libtiledbsoma {{ version }}. Then during the test stage, they are somehow able to install the local version, as required by the strict pinning in the run requirements - {{ pin_subpackage('libtiledbsoma', exact=True) }} (since it can install run but not host, maybe pybind11 is generating the conflict??). Anyways, the conda docs clearly state that jinja2 functions like pin_subpackage() are only supposed to be used in the run and test requirements. But the pyarrow recipe uses pin_subpackage() in the host requirements, so for practical purposes, we probably should as well.

jdblischak commented 1 year ago

Today I've been experimenting with the recipe for version 1.2.2. We know that this recipe can solve for Python 3.8 and 3.9 by installing libtiledbsoma 1.2.2 from the tiledb channel, which then installs tiledb 2.15.2. When I add {{ pin_subpackage('libtiledbsoma', exact=True) }} to the host requirements for 1.2.2, then the envs for Python 3.8 and 3.9 fail again because they are forced to install the locally built libtiledbsoma. Two critical lines of the error message below:

-   - package libtiledbsoma-1.2.2-h52a582c_2 requires tiledb >=2.15.3,<2.16.0a0, but none of the providers can be installed

conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable dependencies for platform osx-64: {MatchSpec("libtiledbsoma==1.2.2=h52a582c_2"), MatchSpec("tiledb[version='>=2.15.3,<2.16.0a0']")}

So the question is, where is the requirement for tiledb >= 2.15.3 coming from? If tiledb 2.15.2 could be installed, then these environments would solve. The libtiledbsoma recipe itself for version 1.2.2 has the pin tiledb 2.15.*, so this restriction isn't explicitly imposed in this recipe. However, the tiledb recipe uses run_exports:

  run_exports:
    # https://abi-laboratory.pro/?view=timeline&l=tiledb
    - {{ pin_subpackage('tiledb', max_pin='x.x') }}

Combining run_exports with pin_subpackage is advanced, so there isn't a ton of documentation. If I understand the docs section Export runtime requirements correctly, this is where the '>=2.15.3,<2.16.0a0' requirement is coming from.

Here is my current understanding of the situation:

So potential solutions to this issue include:

My thoughts on these potential solutions are:

jdblischak commented 1 year ago
  • Build libtiledbsoma against both tiledb 2.15.2 and 2.15.3, so that then each tiledb-py build can install whichever version they need
  • Current favorite (with the caveat that I still need to test it)

Well this isn't an option (at least not a straightforward one). Since tiledb is a dependency of libtiledbsoma, and python is a dependency of tiledbsoma-py, this ends up trying to build for all 4 Python versions against libtiledbsoma built against either 2.15.2 or 2.15.3.

It was informative though. It confirmed that the solver error is caused by the strict run_exports from which tiledb version was used to build libtiledbsoma. Also, interestingly, for once Python 3.7 is the easy case. It's the only Python version that can be co-installed with either tiledb 2.15.2 or 2.15.3.

At this point I'm leaning towards my suggestion of breaking the close link between libtiledbsoma and its clients. I appreciate the conceptual appeal (especially for maintenance and troubleshooting) of requiring a single Python/R client to be used with each release of libtiledbsoma. But does each release of libtiledbsoma really break the Python/R API? I assume not. From a software dependency and release cadence perspective, I think it'd be easier to only release new versions of the Python and/R clients as needed.

johnkerl commented 1 year ago

@Shelnutt2 , thoughts on the above?

jdblischak commented 1 year ago

I pushed an empty commit to my fork to test the effect of bumping the libgoogle-cloud pin for tiledb to 2.11.0 (https://github.com/conda-forge/tiledb-feedstock/pull/192). It failed to solve for Python 3.9, which was the first Python version it tried to build for. This is consistent with my above observation:

Python 3.8 and 3.9 need to install tiledb 2.15.2 and Python 3.7 and 3.10 need to install tiledb 2.15.3

and thus I don't think the libgoogle-cloud bump had a noticeable effect

jdblischak commented 1 year ago

@Shelnutt2 Could you review? Please pay special attention to the separate about: sections that I added to each of the separate outputs. This way the C++ library, Python client, and R client will all have different summaries and descriptions on anaconda.org

johnkerl commented 1 year ago

CI is green, @Shelnutt2 and @jdblischak both approve, abouts-wordings look good -- OK to squash and merge -- ?

johnkerl commented 1 year ago

@Shelnutt2 I'll take your PR accept and @jdblischak 's accept as confirmation ... if you have after-concerns we can revert.

jdblischak commented 1 year ago

For future reference: