stan-dev / httpstan

HTTP interface to Stan, a package for Bayesian inference.
ISC License
39 stars 15 forks source link

Give stan libraries higher priority than system libraries #601

Closed aseyboldt closed 1 year ago

aseyboldt commented 2 years ago

When building the stan service module, the linker can pick up system libraries for tbb, because the library include directory is added to the end of the linker command line instead of to the front. Here I modify the linker command line in distutils to give the stan libraries the highest priority.

An alternative approach would be to add the default distutils library paths to the makefile, so that the stan_service object file uses the system libraries as well.

This might be the issue that makes builds fail on conda-forge: https://github.com/conda-forge/httpstan-feedstock/issues/1

Note: I only tested this on linux

riddell-stan commented 2 years ago

Thanks. This looks interesting. Would be great if it fixed the conda problem.

I might need a bit of time to review this. I need to refresh my memory of how all this stuff works.

riddell-stan commented 2 years ago

@aseyboldt Is there a minimal test case you can devise that shows how the existing version is problematic? Could one, say, create a container image with some old debian / ubuntu that has a system TBB that causes problems?

aseyboldt commented 2 years ago

I don't use docker much, I hope this does too (not as reproducible I'm afraid...): To be sure I'd run this in bash, not some other shell, because conda doesn't always set all env vars in other shells (in my experience anyway)

conda create -n httpstan-lib-order python poetry tbb sundials tbb tbb-devel
conda activate httpstan-lib-order
git clone https://github.com/stan-dev/httpstan httpstan_tmp
cd httpstan_tmp
make -j4
python -m poetry build
pip install dist/*.whl
rm -rf ~/.cache/httpstan/
python -c "import httpstan.models, asyncio; asyncio.run(httpstan.models.build_services_extension_module('data {}'))"
ldd ~/.cache/httpstan/4.7.2/models/*/*.so

This should give an output like this:

(httpstan-lib-order) [adr@adr-desktop httpstan_tmp] ldd ~/.cache/httpstan/4.7.2/models/*/*.so
        linux-vdso.so.1 (0x00007ffd5e4c4000)
        libsundials_cvodes.so.6 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libsundials_cvodes.so.6 (0x00007f84d0ec0000)
        libsundials_idas.so.5 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libsundials_idas.so.5 (0x00007f84d0e5f000)
        libsundials_nvecserial.so.6 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libsundials_nvecserial.so.6 (0x00007f84d0e42000)
        libtbb.so.12 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libtbb.so.12 (0x00007f84d0df2000)
        libstdc++.so.6 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libstdc++.so.6 (0x00007f84d0c3e000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f84d0b19000)
        libgcc_s.so.1 => /home/adr/miniconda3/envs/httpstan-lib-order/lib/libgcc_s.so.1 (0x00007f84d0b00000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f84d08f4000)
        /usr/lib64/ld-linux-x86-64.so.2 (0x00007f84d0fcb000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f84d08ef000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f84d08ea000)

Which means we are loading the (incompatible) sundials libraries, and the conda libtbb. But since the stan_service.o file is compiled against libtbb from httpstan, this will fail to load due to missing symbols from tbb.

riddell-stan commented 2 years ago

Thanks. I can work with this.

aseyboldt commented 2 years ago

@riddell-stan I experimented a bit with the conda package, and I think I got it working (and messy for now) on linux and osx, by using the conda dependencies instead of bundled boost, eigen etc... Windows is still a big mess though. https://github.com/conda-forge/httpstan-feedstock/pull/14

ahartikainen commented 2 years ago

I know there are now unix-sockets on Windows (or what was it we used) but is it already in python? (There was some python issue/pep/their discussion)

edit.

https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/

https://bugs.python.org/issue33408

riddell-stan commented 2 years ago

@aseyboldt I'd recommend fixing conda for mac and linux first before trying anything involving windows. As @ahartikainen mentions, httpstan uses unix sockets for communication between processes.

If windows does indeed support unix sockets we probably want to get it working here first before trying to get things working with conda.

aseyboldt commented 2 years ago

I created a new PR for the conda package. This changes a bit how httpstan is build in conda:

The reason I'm mentioning this here, is that this also influences how we need to build the model shared libraries later. If we distribute using pip, we need to link to the libraries in the httpstan tree, but if we distribute with conda, we need to link to the conda libs. So the order of linker flags needs to be different.

Maybe the best way to accomplish this would be to write necessary compiler flags to a file when we build the package (ie in build.py for pip or in build.sh for conda), and load that file in a function like httpstan._get_compile_flags(), which we then use in httpstan.models?

riddell-stan commented 2 years ago

@aseyboldt FYI: in addition to building those shared libraries, the Makefile does build httpstan/stan_services.o.

riddell-stan commented 2 years ago

@aseyboldt What operating system are you using? I'm trying to reproduce your example in https://github.com/stan-dev/httpstan/pull/601#issuecomment-1159038213 but am failing. I install everything using miniconda but get this when I run ldd:

/home/lra/.cache/httpstan/4.7.2/models/msuyi3yt/stan_services_model_msuyi3yt.cpython-310-x86_64-linux-gnu.so:
    linux-vdso.so.1 (0x00007ffcc3fe2000)
    libstdc++.so.6 => /home/lra/miniconda3/envs/httpstan-lib-order2/lib/libstdc++.so.6 (0x00007fd683a3f000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd683945000)
    libgcc_s.so.1 => /home/lra/miniconda3/envs/httpstan-lib-order2/lib/libgcc_s.so.1 (0x00007fd68392b000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd683703000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fd683cfb000)
aseyboldt commented 2 years ago

@riddell-stan I'm also building the httpstan/stan_services.o file in the build.sh in the conda package PR.

Strange, that you got a different result... Just to make sure, this is without the patch in the PR? The latest release or some other commit? I'm running this on arch linux. conda sets some env variables differently depending on the shell. I think I executed it in bash, but I'm not sure, I can retry tomorrow.

riddell-stan commented 2 years ago

@aseyboldt Thanks. I don't mind trying this on Arch.

I would like to pin down the problem before looking at potential solutions. For example, does this problem also arise when someone has installed tbb using the package manager (i.e., no anaconda involved)?

aseyboldt commented 2 years ago

@riddell-stan If this looks so differently on your machine than I think there is something going on that I really don't understand. I thought this was fairly simple and I understood how this happens, but apparently I don't. Some things that I think could help: Run the code with HTTPSTAN_DEBUG so that we can see the actual command line options of the compiler? Look at the RPATH of the created shared library readelf -d {THE_HTTPSTAN_SHARED_LIB}.

riddell-stan commented 2 years ago

readelf does the trick. I see the order now.

Does setting runtime_library_dirs help at all? It would be nice to stick to the traditional distutils calls.

aseyboldt commented 2 years ago

I think that only adds arguments to the end, but because the search paths that conda adds are at the beginning we can not give our paths a higher priority.

riddell-stan @.***> schrieb am Do., 30. Juni 2022, 12:54:

readelf does the trick. I see the order now.

Does setting runtime_library_dirs help at all? It would be nice to stick to the traditional distutils calls.

— Reply to this email directly, view it on GitHub https://github.com/stan-dev/httpstan/pull/601#issuecomment-1171071573, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOLSHJEVVECI6R5NZXZY2TVRV4HZANCNFSM5Y6L4KOA . You are receiving this because you were mentioned.Message ID: @.***>

riddell-stan commented 2 years ago

Did some additional digging. Looks like this may be a conda-specific issue.

To make sure, I took an Ubuntu 20.04 LTS host and installed libtbb-dev and libsundials-dev. httpstan ignores the system libraries. You get a clean Library runpath. In this case, I see this:

 0x000000000000001d (RUNPATH)            Library runpath: [/home/riddella/tmp/pystan-test2/venv/lib/python3.8/site-packages/httpstan/lib]
stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.