reframe-hpc / reframe

A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems.
https://reframe-hpc.readthedocs.org
BSD 3-Clause "New" or "Revised" License
222 stars 103 forks source link

[enhancement] Skip modules system sanity checking when module conflict resolution is off #3054

Closed ekouts closed 11 months ago

ekouts commented 11 months ago

Not sure if this PR is more suitable for master or develop. I can change it in any case.

pep8speaks commented 11 months ago

Hello @ekouts, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request!Do see the ReFrame Coding Style Guide

Comment last updated at 2023-12-07 23:33:03 UTC
codecov[bot] commented 11 months ago

Codecov Report

Attention: 22 lines in your changes are missing coverage. Please review.

Comparison is base (82ae99e) 86.70% compared to head (1866e37) 86.61%.

Files Patch % Lines
reframe/core/modules.py 59.25% 22 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #3054 +/- ## =========================================== - Coverage 86.70% 86.61% -0.09% =========================================== Files 61 61 Lines 12002 12035 +33 =========================================== + Hits 10406 10424 +18 - Misses 1596 1611 +15 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ekouts commented 11 months ago

I made he changes that you proposed and added a note about the change in the docs. About:

Also what is the behaviour in the following scenario:

  1. Define a system with a modules system that is not present

You mean with RFM_RESOLVE_MODULE_CONFLICTS=0? In theory Reframe should just emit the commands in the script, not crash. I will also look into the other scenarios that you mentioned and add a unittest.

vkarak commented 11 months ago

For the docs I would not add this explanation in the environment variable or configuration option, but rather in the system's modules_system: https://reframe-hpc.readthedocs.io/en/stable/config_reference.html#config.systems.modules_system, where we should also say that ReFrame will try to verify that the requested modules system is installed unless the RFM_RESOLVE_MODULE_CONFLICTS is unset.

As for the unit tests, we'd need a unit test to verify that no exception is thrown whenever we try to initialize a backend with validate=False.

As for the failure scenario that I mentioned, here it is:

RFM_RESOLVE_MODULE_CONFLICTS=0 ./bin/reframe -C config/machines.py --module foo -c unittests/resources/checks/hellocheck.py -r
[ReFrame Setup]
  version:           4.5.0-dev.1+b62838cc
  command:           './bin/reframe -C config/machines.py --module foo -c unittests/resources/checks/hellocheck.py -r'
  launched by:       karakasv@tresa.local
  working directory: '/Users/karakasv/Repositories/reframe'
  settings files:    '<builtin>', 'config/machines.py'
  check search path: '/Users/karakasv/Repositories/reframe/unittests/resources/checks/hellocheck.py'
  stage directory:   '/Users/karakasv/Repositories/reframe/stage'
  output directory:  '/Users/karakasv/Repositories/reframe/output'
  log files:         '/Users/karakasv/Repositories/reframe/reframe.log'

WARNING: skipping test 'SkipTest': unsupported
ERROR: run session stopped: file not found error: [Errno 2] No such file or directory: 'modulecmd'
Log file(s) saved in '/Users/karakasv/Repositories/reframe/reframe.log'

In the config/machines.py I have set modules_system to tmod. I think we should somehow treat this case and force the validation of the modules system before this point. Maybe, the RFM_RESOLVE_MODULE_CONFLICTS=0 should delay the validation instead of cancelling it, because the message here is confusing.

ekouts commented 11 months ago

I think we should somehow treat this case and force the validation of the modules system before this point. Maybe, the RFM_RESOLVE_MODULE_CONFLICTS=0 should delay the validation instead of cancelling it, because the message here is confusing.

Ok, I moved the validation in the execute function of the module system. We still want Reframe to "crash" but with a more understandable error, right?