TACC / Lmod

Lmod: An Environment Module System based on Lua, Reads TCL Modules, Supports a Software Hierarchy
http://lmod.readthedocs.org

LMOD_TMOD_FIND_FIRST makes Lmod error when hidden module of the same name is in a higher path #713

Closed AcerP-py closed 2 months ago

AcerP-py commented 2 months ago

Describe the bug: With LMOD_TMOD_FIND_FIRST set, Lmod reports an error when a hidden module of the same name sits in a higher (earlier) MODULEPATH directory. Take, for example, the following module layout.

.
├── bottom
│   ├── A
│   │   └── 1.2.3.lua
│   └── B
│       └── 9.8.7.lua
└── top
    ├── A
    │   └── 1.2.3-mpi.lua
    └── B
        └── .9.8.7-iueworo.lua

Output of module avail:

---------------------- /tmp/module_example/top ----------------------
   A/1.2.3-mpi

--------------------- /tmp/module_example/bottom ---------------------
   A/1.2.3    B/9.8.7

Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

To Reproduce: Given the above setup, run module load B:

$ ml load B
Lmod has detected the following error:  These module(s) or extension(s) exist but cannot be loaded as requested: "B"
   Try: "module spider B" to see how to load the module(s).

However a module load B/9.8.7 works as expected.
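
For anyone who wants to recreate the layout, here is a minimal sketch. It assumes that empty .lua files are enough for Lmod to list and load them, that the script is sourced from a shell where Lmod is already initialised, and that MODULE_EXAMPLE_ROOT (a hypothetical variable, not part of Lmod) may override the /tmp/module_example location shown above.

```shell
#!/bin/sh
# Recreate the example module tree from this report.
# Assumption: empty modulefiles are sufficient for the reproduction.
# MODULE_EXAMPLE_ROOT is a hypothetical override, not an Lmod variable.
root="${MODULE_EXAMPLE_ROOT:-/tmp/module_example}"

mkdir -p "$root/bottom/A" "$root/bottom/B" "$root/top/A" "$root/top/B"

: > "$root/bottom/A/1.2.3.lua"
: > "$root/bottom/B/9.8.7.lua"
: > "$root/top/A/1.2.3-mpi.lua"
: > "$root/top/B/.9.8.7-iueworo.lua"   # leading dot marks this version as hidden

# "top" is listed first so that it is searched before "bottom".
export MODULEPATH="$root/top:$root/bottom"
export LMOD_TMOD_FIND_FIRST=yes
```

After sourcing the script, module avail and module load B should be enough to compare against the output in this report.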

Expected behavior: When LMOD_TMOD_FIND_FIRST is set, we would expect Lmod to traverse the MODULEPATH until it finds a module that it can load, without having to specify the version.

Debug Info: The attached debug output (out.txt) should cover all the needed info.

Additional context: The problem seems to be that Lmod sees the hidden module in the top directory and then ignores all other directories before it has checked that the module in the first directory is actually loadable.

rtmclay commented 2 months ago

Thanks for the bug report! I was able to reproduce this issue and fix it. Please test the Lmod branch "IS713-tmod-hidden" to see if it works for you.

This bug has been around for a long time, probably since Lmod 7 was introduced. The fix is to let the routine l_find_highest_by_key() in src/MName.lua search all possible choices, rather than only the first directory that contains the module name being searched for. Thanks again for the bug report!
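
The difference between the two behaviours can be illustrated with a toy model in shell. This is only a sketch of the idea, not Lmod's Lua code: the function names visible_versions, find_first_buggy and find_first_fixed are hypothetical, versions are compared with a plain lexical sort, and "hidden" is taken to mean simply a leading dot in the file name.

```shell
#!/bin/sh
# Toy model of the find-first search; NOT Lmod's actual implementation.
# All function names here are hypothetical.

# Build the example tree in a scratch directory.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/top/A" "$root/top/B" "$root/bottom/A" "$root/bottom/B"
: > "$root/top/A/1.2.3-mpi.lua"
: > "$root/top/B/.9.8.7-iueworo.lua"   # hidden version
: > "$root/bottom/A/1.2.3.lua"
: > "$root/bottom/B/9.8.7.lua"

# Print the non-hidden versions of module $2 in directory $1.
# The "*" glob does not match dotfiles, so hidden versions are skipped.
visible_versions() {
    for f in "$1/$2"/*.lua; do
        [ -e "$f" ] || continue
        basename "$f" .lua
    done
}

# Old behaviour (as described in this issue): commit to the first
# directory that contains the name, even if nothing there is loadable.
find_first_buggy() {
    name=$1; shift
    for dir in "$@"; do
        [ -d "$dir/$name" ] || continue
        best=$(visible_versions "$dir" "$name" | sort | tail -n 1)
        if [ -n "$best" ]; then
            printf '%s/%s\n' "$name" "$best"
            return 0
        fi
        return 1   # gives up here instead of trying the next directory
    done
    return 1
}

# Fixed behaviour: keep walking until a directory offers a visible version.
find_first_fixed() {
    name=$1; shift
    for dir in "$@"; do
        best=$(visible_versions "$dir" "$name" | sort | tail -n 1)
        if [ -n "$best" ]; then
            printf '%s/%s\n' "$name" "$best"
            return 0
        fi
    done
    return 1
}

echo "buggy: $(find_first_buggy B "$root/top" "$root/bottom")"
echo "fixed: $(find_first_fixed B "$root/top" "$root/bottom")"
```

In this model the buggy variant returns nothing for B because top/B exists but holds only a hidden file, while the fixed variant moves on to bottom and resolves B/9.8.7.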

AcerP-py commented 2 months ago

It works for me too. Thanks!

rtmclay commented 2 months ago

This fix is now live in Lmod 8.7.45. Closing this issue.