RobertLuptonTheGood / eups

A version manager tracking product dependencies

DYLD_LIBRARY_PATH clobbered on El Capitan #53

Closed natelust closed 8 years ago

natelust commented 8 years ago

I begin by sourcing all the relevant files. The workflow then goes as follows:

nate$ setup lsst_apps
nate$ echo $DYLD_LIBRARY_PATH
/Users/nate/lsstsw/lsstsw/stack/DarwinX86/psfex/2015_10.0-1-geb1eb67+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_extensions_psfex/2015_10.0-2-gce306f0/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/xpa/2.1.15.lsst3/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/obs_lsstSim/2015_10.0-2-g37fe777+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/skymap/2015_10.0-3-g2206176/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/coadd_chisquared/2015_10.0-1-g470bb2d+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/ip_diffim/2015_10.0-1-gc9560c6+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/ip_isr/2015_10.0-4-g8823148/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/astrometry_net/0.50.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_astrom/2015_10.0-1-ge277350+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/shapelet/2015_10.0-1-g7c5d349+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_modelfit/2015_10.0-1-g9cc7517+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/coadd_utils/2015_10.0-3-g9b0088f/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_base/2015_10.0-2-gef1b2e0/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_algorithms/2015_10.0-1-g5e1b086+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/gsl/1.16.lsst3/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/minuit2/5.28.00.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/wcslib/4.14.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/cfitsio/3360.lsst4/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/fftw/3.3.4.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_config/2015_10.0-1-gc006da1+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_policy/2015_10.0-1-gd4eb80a+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_logging/2015_10.0-1-gf6c85b3+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/mysqlclient/5.1.73.lsst3/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/daf_persistence/2015_10.0-5-g4063539/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/daf_base/2015_10.0-3-g4d32b56/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/afw/2015_10.0-8-g4057726/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_exceptions/2015_10.0-1-gf7aee88+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/boost/1.59.lsst5/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/base/2015_10.0-1-g1d153f6+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/utils/2015_10.0-3-g530cd83/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/meas_deblender/2015_10.0-3-ga5a97e7/lib

Next, I move to a package that I have cloned, in this case meas_base:

nate$ setup -j -r .
nate$ echo $DYLD_LIBRARY_PATH
/Users/nate/repos_lsst/meas_base/lib

Reset the path with:

nate$ setup lsst_apps

If I instead run setup without the -j option, I get:

/Users/nate/repos_lsst/meas_base/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/coadd_utils/2015_10.0-3-g9b0088f/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/afw/2015_10.0-8-g4057726/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/gsl/1.16.lsst3/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/minuit2/5.28.00.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/wcslib/4.14.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/cfitsio/3360.lsst4/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/fftw/3.3.4.lsst2/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_config/2015_10.0-1-gc006da1+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_policy/2015_10.0-1-gd4eb80a+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_logging/2015_10.0-1-gf6c85b3+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/mysqlclient/5.1.73.lsst3/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/daf_persistence/2015_10.0-5-g4063539/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/pex_exceptions/2015_10.0-1-gf7aee88+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/boost/1.59.lsst5/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/base/2015_10.0-1-g1d153f6+1/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/utils/2015_10.0-3-g530cd83/lib:/Users/nate/lsstsw/lsstsw/stack/DarwinX86/daf_base/2015_10.0-3-g4d32b56/lib

All the dependencies are set up, but nothing that previously existed in DYLD_LIBRARY_PATH is preserved.

This is all due to the shebang line in eups/current/bin/eups_setup, which uses env. In El Capitan, system utilities such as env strip the dynamic library path variables (DYLD_*) from their environment when they are executed.
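A quick way to check this on a given machine is sketched below. The function name and the marker path are made up for illustration, and the result depends on the OS and on whether SIP is enabled:

```shell
# Returns 0 if a DYLD_LIBRARY_PATH value set for the child survives a
# launch through /usr/bin/env, 1 if it was stripped or changed.
# (dyld_survives_env is an illustrative name, not part of eups.)
dyld_survives_env() {
    probe=/tmp/sip-probe/lib   # arbitrary marker value; the path need not exist
    seen=$(DYLD_LIBRARY_PATH=$probe /usr/bin/env sh -c 'printf %s "${DYLD_LIBRARY_PATH:-}"')
    [ "$seen" = "$probe" ]
}

if dyld_survives_env; then
    echo "DYLD_LIBRARY_PATH survives /usr/bin/env"
else
    echo "DYLD_LIBRARY_PATH was stripped on the way through /usr/bin/env"
fi
```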

One possible solution is to have an alternative environment variable which tracks packages and then copies its value to DYLD_LIBRARY_PATH after execution. Something akin to:

dosetup() { setup "$@" ; export DYLD_LIBRARY_PATH=$LSST_LIBRARY_PATH; }

timj commented 8 years ago

Just a thought, but does it work if you just make the first line of eups_setup blank and use the default POSIX shell? Using env to locate sh is not really required.

RobertLuptonTheGood commented 8 years ago

This was all added to work around problems with using eups on a cluster. I don't remember the exact problem (it should be in the logs).

timj commented 8 years ago

I don't mean "just use python in the shebang"; I mean just have a blank line in place of the shebang.

timj commented 8 years ago

@natelust I can confirm that removing the shebang and just leaving it blank fixes the problem for me.

r-owen commented 8 years ago

This also fixed a problem I was having. I think all OS X 10.11 users will run into this. Could we please increase the priority? Perhaps the file can be written differently on OS X?

mjuric commented 8 years ago

I'm looking at the commit that introduced this -- c0db8a85a3461b08b1c121ce1dbf192cdf5244fb -- it looks like it should be safe to remove the shebang line.

@timj @r-owen Are you saying that DYLD_LIBRARY_PATH is retained if there's no shebang line? That feels like fragile (shell-specific?) behavior?

timj commented 8 years ago

If you read DMTN-001 you will see the surprising result that Apple does not invoke SIP when a default shell is started. Surprising, I know, and we aren't sure whether it's deliberate or not, but for now it works fine and should be portable. It avoids EUPS having to introduce an EUPS_LIBRARY_PATH environment variable that it has to use internally when handling special LD_LIBRARY_PATH-like variables. We can worry about that if Apple close the loophole.

mjuric commented 8 years ago

There's some mildly enlightening discussion of scripts-with-no-shebang here and here.

mjuric commented 8 years ago

Interesting. So is just #!/bin/sh an option?

timj commented 8 years ago

No. You can't specify anything in the first line. If you use #! then SIP comes into play because /bin is a protected directory. Scripts with no shebang are definitely a historical oddity (although I think they help POSIX compliance on Windows).

mjuric commented 8 years ago

Hmm, I worry if this is a lucky side effect of how bash executes scripts w/o a shebang -- probably does the internal equivalent of '{ source foo.sh }', which doesn't trigger SIP.
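One way to probe what a given shell does is sketched here. The temporary file and marker path are arbitrary, and it only reports the behavior of whichever shell runs it:

```shell
# Create an executable script that has no shebang line at all.
script=$(mktemp)
printf '%s\n' 'printf "%s\n" "${DYLD_LIBRARY_PATH:-<stripped>}"' > "$script"
chmod +x "$script"

# Run it directly from the current shell, setting the variable for the child only.
# With no shebang, the shell itself decides how to interpret the file.
result=$(DYLD_LIBRARY_PATH=/tmp/sip-probe/lib "$script")
rm -f "$script"

echo "child saw: $result"
```

Running the same lines from bash, zsh and ksh would show whether the variable only survives in some of them.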

All that aside, looking at the intent of this code I'm actually thinking there's no reason for eups_setup to exist at all. The python invocation can be done directly in the setup alias. That would solve the problem, I think?

timj commented 8 years ago

I'm sure it's a lucky side effect but it's quick and we can worry about it later. The other fix is indeed to remove the entire trampoline and just burn in the python path during install time in the python script shebang itself. I wasn't going to do that myself because of the comment in the file indicating that there is some mysterious reason for the trampoline.

mjuric commented 8 years ago

My point is that I worry about how other shells do it. Could you test whether it works with zsh, ksh and tcsh?

(I'd do it myself, but still working on bootstrapping an El Capitan VM...)

timj commented 8 years ago

You win :smile: It only works on bash. SIP is enabled for all other shells and the trick fails for those: other environment variables survive but DYLD_LIBRARY_PATH is expunged if I run a shebang-less script from any shell other than bash.

RobertLuptonTheGood commented 8 years ago

We need to understand that trampoline! It was asked for to run on some HPC machine, and now's the time to dredge my email and recover the logic.

mjuric commented 8 years ago

@RobertLuptonTheGood

I think this was to properly hardcode the system python path, but I think we could’ve just as easily done it when expanding eups.table.in, i.e., do:

  addAlias(setup,   eval `unset PYTHONSTARTUP; @EUPS_PYTHON@ -S \"${EUPS_DIR}/bin/eups_setup_impl.py\" \"$@\"`);

?

I think this may have been an implementation choice.
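The general pattern (the calling shell evaluates whatever the helper prints, with no wrapper script in between) can be sketched in plain shell. Here fake_setup_impl is a hypothetical stand-in for eups_setup_impl.py, and my_setup is an illustrative name rather than the real setup alias:

```shell
# Hypothetical stand-in for eups_setup_impl.py: it runs as a child process
# and prints the shell command that the caller should evaluate.
fake_setup_impl() {
    sh -c 'printf "export DYLD_LIBRARY_PATH=%s\n" "$1/lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"' sh "$1"
}

# Illustrative wrapper: evaluate the helper's output in the current shell,
# so the change lands in the caller's own environment.
my_setup() {
    eval "$(fake_setup_impl "$1")"
}

# Example: prepend a (made-up) product directory to the existing path.
DYLD_LIBRARY_PATH=/opt/existing/lib
export DYLD_LIBRARY_PATH
my_setup /opt/newproduct
echo "$DYLD_LIBRARY_PATH"
```

This sketch does not quote the emitted value, so product paths containing spaces would need extra care.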

mjuric commented 8 years ago

Someone with El Capitan: could you check whether the version on the el-capitan-setup-impl branch fixes the problem? (You'll need to rerun ./configure ..., as it will regenerate the setups.c?sh scripts.)

mjuric commented 8 years ago

Fixed by PR #61 (reopen if it's still causing trouble).