Bright-Computing / bic

Bright-Illumina collaboration
GNU General Public License v2.0

LMOD_CACHED_LOADS seems to trigger a bug when migrating 7.7.8 -> 7.7.14 on centos7 but not centos6 #107

Closed. fgeorgatos closed this issue 6 years ago

fgeorgatos commented 6 years ago

Interestingly, this is not visible under centos6; other than that, everything works fine. (This could be an upstream Lmod issue, but I have not cornered it yet - TBD.)

$ ssh node005
Last login: Fri Jan 19 03:26:35 2018 from somewhere.cm.cluster

Lmod has detected the following error:  Unable to load module:
     /etc/site/modules/settarg/7.7.8.lua: Empty or non-existant file

While processing the following module(s):
    Module fullname  Module Filename
    ---------------  ---------------
    settarg/7.7.8    /etc/site/modules/settarg/7.7.8.lua

rtmclay commented 6 years ago

This is not a bug. If you use LMOD_PIN_VERSIONS=yes, then you'll see this issue. The settarg module is versioned and comes with the source. So if you install Lmod 7.7.14, then the modulefiles lmod/7.7.14 and settarg/7.7.14 are installed. Since you have replaced 7.7.8 with 7.7.14, the old versions associated with 7.7.8 are GONE! That is, the modulefile settarg/7.7.8 is gone.

Now maybe the lmod and settarg modules shouldn't be versioned, since there is only one active version of each. If a future version of Lmod changed settarg and lmod to not be versioned, you'd still get this message about not being able to find settarg/7.7.8, but you'd only get it once, during the transition from versioned to non-versioned settarg modules.

fgeorgatos commented 6 years ago

@rtmclay: alright, this is an area I don't fully understand (how the software bits are put together); but then again, isn't it strange that the bug only shows up under centos7 but NOT under centos6? I was caught by surprise by the differing behavior...

fgeorgatos commented 6 years ago

@rtmclay: I've tried the same on a system with LMOD_PIN_VERSIONS=no (and also yes) and it made no difference; however, disabling LMOD_CACHED_LOADS seems to do the trick.
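
For reference, a minimal sketch of that workaround. It assumes the variable is exported before Lmod is initialized for the login shell (for example from a site profile script; where exactly your site sets Lmod variables is an assumption here, not something stated in this thread).

```shell
# Workaround sketch: tell Lmod not to rely on the spider cache when loading
# modules. LMOD_CACHED_LOADS takes yes/no; "no" disables cached loads.
# Assumption: this line runs before Lmod's own init script is sourced.
export LMOD_CACHED_LOADS=no
```

This only sidesteps the stale cache; it does not refresh it.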

rtmclay commented 6 years ago

I gave the wrong explanation, but it is still a user bug and not an Lmod bug. When you upgrade Lmod, you also have to update the cache. The cache only knows about settarg/7.7.8, which doesn't exist after the upgrade.
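
One way to see this kind of mismatch is to check whether the modulefile paths a cache refers to still exist on disk. The sketch below is an illustration only: `find_stale_modulefiles` is a hypothetical helper, not part of Lmod, and it assumes you already have a plain list of modulefile paths, one per line (extracting those paths from Lmod's cache files is not shown).

```shell
# Hypothetical helper (not part of Lmod): read modulefile paths on stdin,
# one per line, and print those that are missing or empty on disk.
find_stale_modulefiles() {
  while IFS= read -r modfile; do
    # Skip blank lines in the input.
    [ -n "$modfile" ] || continue
    # -s is true only if the file exists and is non-empty, which mirrors
    # the "Empty or non-existant file" condition in the error message.
    [ -s "$modfile" ] || printf '%s\n' "$modfile"
  done
}

# Example: check the path named in the error message from this issue.
printf '%s\n' /etc/site/modules/settarg/7.7.8.lua | find_stale_modulefiles
```

Any path this prints is one that a stale cache could still point at. The actual fix remains rebuilding the cache after the upgrade, using whatever cache-update procedure your Lmod installation documents.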

fgeorgatos commented 6 years ago

OK, I think I understand better now what is going on here; so, if we let the cache expire, say after 24 hours, the issue goes away? Interesting experiment to try then! (But I'm still puzzled by the centos6 vs centos7 difference... we'll see.)

fgeorgatos commented 6 years ago

OK, resolving this, since it was related to a caching effect across nodes; but yes, it would be a good idea not to hardwire the version as a parameter to settarg!