star-bnl / star-sw

Core software for STAR experiment

Inconsistent values in USE_64BITS when using `starver` to setup 32 and 64 bit environments #573

Closed plexoos closed 9 months ago

plexoos commented 9 months ago

When calling starver with a config we don't expect it to modify the USE_64BITS environment variable. However, USE_64BITS seems to be reset on every other call of starver.

The issue can be reproduced as follows:

$ setup 32b
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
$ setup 64b
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
USE_64BITS: Undefined variable.
$ setup 64b
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
USE_64BITS: Undefined variable.
$ setup 32b
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
$ starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
USE_64BITS: Undefined variable.

genevb commented 9 months ago

I tried this:

[rcas6003] ~/> cat ddd
setup 32b
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
setup 64b
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
setup 64b
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
setup 32b
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS
starver dev config/v0.2.3-rhel7-root5.34.38 && echo $USE_64BITS

[rcas6003] ~/> source ddd
0
0
0
0
0
0
[rcas6003] ~/>

And then I tried with the suffixes as they actually are in DEV right now...

[rcas6003] ~/> cat ddd
setup 32b
starver dev config/v0.2.3-rhel7-root5.34.38-32b && echo $USE_64BITS
setup 64b
starver dev config/v0.2.3-rhel7-root5.34.38-64b && echo $USE_64BITS
setup 64b
starver dev config/v0.2.3-rhel7-root5.34.38-64b && echo $USE_64BITS
starver dev config/v0.2.3-rhel7-root5.34.38-64b && echo $USE_64BITS
setup 32b
starver dev config/v0.2.3-rhel7-root5.34.38-32b && echo $USE_64BITS
starver dev config/v0.2.3-rhel7-root5.34.38-32b && echo $USE_64BITS

[rcas6003] ~/> source ddd
0
1
1
1
0
0
[rcas6003] ~/>

Looks OK to me. -Gene

plexoos commented 9 months ago

If you run it without a suffix, like this:

starver dev config/v0.2.3-rhel7-root5.34.38

make sure the corresponding config file is in your current directory:

ls mgr/config/
v0.2.3-rhel7-root5.34.38-32b.config  v0.2.3-rhel7-root5.34.38-64b.config  v0.2.3-rhel7-root5.34.38.config

The ones with suffixes do seem to work when called sequentially. It still puzzles me why checking the variable inside the script changes it...

genevb commented 9 months ago

Hi, Dmitri

I'm not sure how you set this up, but invoking starver sets the $STAR environment variable to the official AFS (or CVMFS) locations, and I think the config file will then be taken from under $STAR/mgr, not a local mgr.

I tried running a custom version of .starver that used a local mgr and it seemed to work fine with the custom config file that you proposed in #572.


plexoos commented 9 months ago

Hi Gene,

To test the config file I just check out the PR #572 branch locally on an rcas node to make sure I have mgr/config/v0.2.3-rhel7-root5.34.38.config in my current working directory and then run starver dev config/v0.2.3-rhel7-root5.34.38. The logic in $GROUP_DIR/.starver seems to prefer the local file over the one in $STAR if it exists:

$ grep "always search for a local path" -A12 -B6 ${GROUP_DIR}/.starver
# once we have loaded the env, we could check for cf
unsetenv STAR_CONFIG
#echo "Checking $cf"
if ( "$cf" != "") then
    # always search for a local path
    if ( -e "mgr/$cf.config") then
        setenv STAR_CONFIG  "$cf"
        setenv STAR_CONFIGP "`pwd`/mgr/$cf.config"
    else if ( -e "$STAR/mgr/$cf.config" ) then
        setenv STAR_CONFIG  "$cf"
        setenv STAR_CONFIGP "$STAR/mgr/$cf.config"
        # we found nothing and need to revert to default
        set cf=""
        goto CFGCHK

plexoos commented 9 months ago

What did you modify in your custom version of .starver? Could it be that your modifications fix the problem?

genevb commented 9 months ago

I just did a global replace of "$STAR/mgr" with "/star/u/stareco/DDD/mgr" where DDD was the directory I tested in.


On Aug 1, 2023, at 5:46 PM, Dmitri Smirnov @.***> wrote:

What did you modify in your custom version of .starver? Could it be that your modifications fix the problem?

plexoos commented 9 months ago

Ok, I did the same exercise with a local copy of starver and confirm that my original test works as expected. If this exercise resembles the production environment and how starver is used when installed in /afs then I don't see any further blockers for that new config. Let's merge #572 now

genevb commented 9 months ago

I tested after the merge of #572, and here's what I see: every other invocation of stardev causes a problem:

[rcas6003] ~/> stardev
[rcas6003] ~/> stardev
USE_64BITS: Undefined variable.
[rcas6003] ~/> stardev
[rcas6003] ~/> stardev
USE_64BITS: Undefined variable.
[rcas6003] ~/> stardev
[rcas6003] ~/> stardev
USE_64BITS: Undefined variable.
[rcas6003] ~/> stardev
[rcas6003] ~/> stardev
USE_64BITS: Undefined variable.

I'll have to back out that change to the link of the default configuration for now :-(


On Aug 1, 2023, at 11:26 PM, Dmitri Smirnov @.***> wrote:

Ok, I did the same exercise with a local copy of starver and confirm that my original test works as expected. If this exercise resembles the production environment and how starver is used when installed in /afs then I don't see any further blockers for that new config. Let's merge #572 now


genevb commented 9 months ago

A little digging shows that the environment variable $USE_64BITS must be unset inside this:

eval `/usr/bin/modulecmd tcsh purge`

...which is part of "module purge". Replacing with just:

/usr/bin/modulecmd tcsh purge

...does not cause the environment variable to be unset, but does result in printing the following:

setenv LD_LIBRARY_PATH .sl73_gcc485/LIB:.sl73_gcc485/lib:/afs/ ;setenv MANPATH /opt/star/sl73_gcc485/man:/afs/ ;setenv PATH /opt/star/sl73_gcc485/bin:/star/u/genevb/bin:/usr/lib/jvm/jre-openjdk/bin:/afs/ ;unsetenv ACLOCAL_PATH;unsetenv BOOST_ROOT;unsetenv CMAKE_PREFIX_PATH;unsetenv CPATH;unsetenv LOADEDMODULES;unsetenv PKG_CONFIG_PATH;unsetenv ROOTSYS;unsetenv ROOT_INCLUDE_PATH;unsetenv ROOT_VERSION;unsetenv USE_64BITS;unsetenv Vc_DIR;unsetenv XLOCALEDIR;

One can see that near the end is the call to unsetenv USE_64BITS, so calling eval on this output effectively unsets that variable (and many others).
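To check this on a given node without touching the current environment, the output of modulecmd can be filtered instead of evaluated. This is only a rough sketch: it assumes /usr/bin/modulecmd is the same binary that the module alias wraps, and that at least one module setting USE_64BITS is currently loaded (otherwise nothing is printed).

```tcsh
# Print, but do not evaluate, the commands that "module purge" would run,
# one command per line, and keep only those that mention USE_64BITS.
/usr/bin/modulecmd tcsh purge | tr ';' '\n' | grep USE_64BITS
```

Since nothing is passed to eval here, the variables in the calling shell stay as they were.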

A potential solution may be to alter the default config from...

module purge
if ( $USE_64BITS == 1 ) then
    module unuse /cvmfs/
    module use /cvmfs/
else
    module unuse /cvmfs/
    module use /cvmfs/
endif

...to...

if ( $USE_64BITS == 1 ) then
    module purge
    module unuse /cvmfs/
    module use /cvmfs/
else
    module purge
    module unuse /cvmfs/
    module use /cvmfs/
endif

Seems to work for me in this regard, but I'm unsure of whether there are other consequences.
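Another possibility, in case testing the variable before the purge turns out not to be enough, might be to save the value across the purge and restore it afterwards. This is an untested sketch, not what is in the config; the shell variable name saved_use_64bits is made up for illustration.

```tcsh
# Remember the current value, if the variable is defined at all.
if ( $?USE_64BITS ) then
    set saved_use_64bits = "$USE_64BITS"
endif

# This may unset USE_64BITS along with the other module-provided variables.
module purge

# Put the saved value back so later tests of $USE_64BITS do not abort
# with "Undefined variable".
if ( $?saved_use_64bits ) then
    setenv USE_64BITS "$saved_use_64bits"
    unset saved_use_64bits
endif
```

The $?name form evaluates to 1 when the variable is set and 0 otherwise, so neither test fails when USE_64BITS is undefined.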


plexoos commented 9 months ago

Good catch! This likely explains the undefined USE_64BITS on every other invocation. The fix looks fine, let's submit it.

plexoos commented 9 months ago

Just confirmed that after #574 the following works as expected for me on rcas:

setup 32b
setup 64b