reframe-hpc / reframe

A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems.
https://reframe-hpc.readthedocs.org
BSD 3-Clause "New" or "Revised" License
222 stars 103 forks source link

Fixture fails when more than one instance of the same class are specified #3285

Open tom91136 opened 4 weeks ago

tom91136 commented 4 weeks ago

It seems that we can't specify two fixtures of the same class even with different variables. Here's a minimal reproducer:

import reframe as rfm
from reframe.core.builtins import variable

class task1(rfm.RunOnlyRegressionTest):
    executable = "echo"
    myvar = variable(str)

    @run_before("setup")
    def after_setup(self):
        print(f"build {self.myvar}")

    @sanity_function
    def validate(self):
        return True

@rfm.simple_test
class task2(rfm.RunOnlyRegressionTest):
    valid_systems = ["*"]
    valid_prog_environs = ["*"]
    executable = "echo"

    dep1 = fixture(task1, scope="environment", variables={"myvar": "a"})
    dep2 = fixture(task1, scope="environment", variables={"myvar": "b"})

    @run_after("setup")
    def after_setup(self):
        print(f"run {self.dep1.myvar}")
        print(f"run {self.dep2.myvar}")

    @sanity_function
    def validate(self):
        return True

And we get:

> reframe  -c src/perf/check.py  -r  -v
[ReFrame Setup]
  version:           4.6.2
  command:           '/home/tom/.local/bin/reframe -c src/perf/check.py -r -v'
  launched by:       tom@soraws-uk
  working directory: '/home/tom/reframe_test'
  settings files:    '<builtin>'
  check search path: '/home/tom/reframe_test/src/perf/check.py'
  stage directory:   '/home/tom/reframe_test/stage'
  output directory:  '/home/tom/reframe_test/output'
  log files:         '/tmp/rfm-kdeqh8wz.log'

Loaded 3 test(s)
Generated 3 test case(s)
Filtering test cases(s) by name: 1 remaining
Filtering test cases(s) by tags: 1 remaining
Filtering test cases(s) by other attributes: 1 remaining
Final number of test cases: 3
[==========] Running 3 check(s)
[==========] Started on Sun Oct 20 04:52:26 2024+0100

[----------] start processing checks
[ RUN      ] task1 %myvar=b ~generic:default+builtin /307d74c0 @generic:default+builtin
build b
[ RUN      ] task1 %myvar=a ~generic:default+builtin /d78dd560 @generic:default+builtin
build a
[       OK ] (1/3) task1 %myvar=b ~generic:default+builtin /307d74c0 @generic:default+builtin
==> setup: 0.009s compile: 0.011s run: 0.254s sanity: 0.003s performance: 0.002s total: 0.302s
[       OK ] (2/3) task1 %myvar=a ~generic:default+builtin /d78dd560 @generic:default+builtin
==> setup: 0.008s compile: 0.011s run: 0.254s sanity: 0.003s performance: 0.002s total: 0.288s
[ RUN      ] task2 %dep2.myvar=b %dep1.myvar=a /bf1743ec @generic:default+builtin
[     FAIL ] (3/3) task2 %dep2.myvar=b %dep1.myvar=a /bf1743ec @generic:default+builtin
==> test failed during 'setup': test staged in '/home/tom/reframe_test/stage/generic/default/builtin/task2_bf1743ec'
==> setup: 0.008s compile: n/a run: n/a sanity: n/a performance: n/a total: 0.008s
[----------] all spawned checks have finished

[  FAILED  ] Ran 3/3 test case(s) from 3 check(s) (1 failure(s), 0 skipped, 0 aborted)
[==========] Finished on Sun Oct 20 04:52:26 2024+0100
============================================================================================================================================================================================================================================================================================================================
SUMMARY OF FAILURES
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
FAILURE INFO for task2 %dep2.myvar=b %dep1.myvar=a (run: 1/1)
  * Description: 
  * System partition: generic:default
  * Environment: builtin
  * Stage directory: /home/tom/reframe_test/stage/generic/default/builtin/task2_bf1743ec
  * Node list: 
  * Job type: local (id=None)
  * Dependencies (conceptual): ['task1_987e86a2', 'task1_36243591']
  * Dependencies (actual): [('task1_987e86a2', 'generic:default', 'builtin'), ('task1_36243591', 'generic:default', 'builtin')]
  * Maintainers: []
  * Failing phase: setup
  * Rerun with '-n /bf1743ec -p builtin --system generic:default -r'
  * Reason: pipeline error: fixture 'dep2' has more than one instances
Traceback (most recent call last):
  File "/home/tom/.local/lib/python3.12/site-packages/reframe/frontend/executors/__init__.py", line 317, in _safe_call
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/tom/.local/lib/python3.12/site-packages/reframe/core/hooks.py", line 109, in _fn
    func(obj, *args, **kwargs)
  File "/home/tom/.local/lib/python3.12/site-packages/reframe/core/pipeline.py", line 2571, in setup
    self._resolve_fixtures()
  File "/home/tom/.local/lib/python3.12/site-packages/reframe/core/pipeline.py", line 1648, in _resolve_fixtures
    raise PipelineError(
reframe.core.exceptions.PipelineError: fixture 'dep2' has more than one instances

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Log file(s) saved in '/tmp/rfm-kdeqh8wz.log'

Looking at the error, it seems that it's resolving the fixture purely with class names. If we add the following extra check:

if fixt_data.variables != f.variables:
    continue

To the resolution logic: https://github.com/reframe-hpc/reframe/blob/a00fa32991a71c84eed73a3d90731311082ea74e/reframe/core/pipeline.py#L1642-L1655 Then the reproducer works with the correct fixtures being executed.

However, I imagine this raises the question on variable identity so will probably need more testing. I'm happy to open a quick PR if this is indeed the correct way to solve this.

tom91136 commented 6 days ago

@vkarak Should I open a PR?

vkarak commented 5 days ago

Hi, sorry for the late reply. This seems indeed to be a bug, but I'm not sure yet if that's the correct fix or if it's just hiding a more fundamental reason. Feel free to open a PR with your fix and we'll see if it breaks other unit tests.