ICRAR / daliuge

The DALiuGE Execution Engine
GNU Lesser General Public License v2.1
24 stars 7 forks source link

LIU-396: Add integration test for end-to-end partitioning and deployment on multiple nodes #281

Closed myxie closed 2 weeks ago

myxie commented 2 months ago

JIRA Ticket

Note: This needs to wait until #283 is merged before this can be merged.

LIU-396

Type

We have recently had issues running on Setonix, which were caused by some port issues that weren't properly tested. The tests currently have two node managers running, but we don't test the complete graph-submissiong through to deployment in the unit tests.

Solution

This test loads a graph, partitions it, and then deploys it to two different Node Managers, each running on different ports. This serves to test both the new configurable ports, as well as ensures we have test coverage for # Checklist - [X] Unittests added - ~[ ] Documentation added~ - Will be addressed in follow-up PR [LIU-396]: https://icrar.atlassian.net/browse/LIU-396?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ ## Summary by Sourcery Add an integration test to verify the deployment of a partitioned graph across multiple node managers, addressing recent issues with port configurations. Refactor test utilities to improve code reuse and maintainability. New Features: - Introduce integration test for deploying a partitioned graph across multiple node managers. Enhancements: - Refactor test utilities by consolidating common functions into a new module, improving code reuse and maintainability. Tests: - Add a new integration test to verify the deployment of a partitioned graph across multiple node managers, ensuring proper functionality of configurable ports.
coveralls commented 4 weeks ago

Coverage Status

coverage: 79.661%. first build when pulling c727883fc79e8237d2274c74f7f239739be9d0dd on LIU-396-Tests into 4be953f904836c16b80ca4319e4e5f8e439dc77a on master.

sourcery-ai[bot] commented 2 weeks ago

Reviewer's Guide by Sourcery

This pull request adds an integration test for multiple nodes and a partitioned graph. It involves significant refactoring of test utilities, moving common test code into shared modules, and updating import statements across multiple files. The changes aim to improve test coverage for configurable ports and multi-node deployments.

Sequence diagram for graph deployment across multiple Node Managers

sequenceDiagram
    participant Test as Test Suite
    participant DM1 as NodeManager 1
    participant DM2 as NodeManager 2
    participant DIM as DataIslandManager

    Test->>DM1: createSession(sessionId)
    Test->>DM1: addGraphSpec(sessionId, g1)
    Test->>DM1: add_node_subscriptions(sessionId, {nm_conninfo(1): rels})
    Test->>DM1: deploySession(sessionId)

    Test->>DM2: createSession(sessionId)
    Test->>DM2: addGraphSpec(sessionId, g2)
    Test->>DM2: add_node_subscriptions(sessionId, {nm_conninfo(0): rels})
    Test->>DM2: deploySession(sessionId)

    Test->>DIM: createSession("TestSession")
    Test->>DIM: addGraphSpec("TestSession", pg_spec)
    Test->>DIM: deploySession("TestSession", completedDrops=roots)

    DIM-->>Test: getGraphStatus("TestSession")

Class diagram for updated test utilities

classDiagram
    class DROPManagerUtils {
        +nm_conninfo(n, return_tuple)
        +add_test_reprodata(graph)
        +quickDeploy(nm, sessionId, graphSpec, node_subscriptions)
    }

    class NMTestsMixIn {
        -_dms: dict
        -use_processes: bool
        +_start_dm(threads, kwargs)
        +tearDown()
        +host_names: list
        +start_hosts(num_hosts, threads)
        +_test_runGraphInTwoNMs(g1, g2, rels, root_data, leaf_data, root_oids, leaf_oid, expected_failures, sessionId, node_managers, threads)
    }

    DROPManagerUtils <|-- NMTestsMixIn
    NMTestsMixIn o-- NodeManager
    NMTestsMixIn o-- DROPManagerUtils

File-Level Changes

Change Details Files
Added a new integration test for multiple nodes and a partitioned graph
  • Created a new test file 'test_graph_to_manager.py'
  • Implemented TestGraphLoaderToNodeManager class with setup, teardown, and test methods
  • Added logic to partition a graph using METIS algorithm
  • Implemented test case to deploy and verify graph execution across multiple Node Managers
daliuge-engine/test/deploy/test_graph_to_manager.py
Refactored test utilities and constants into separate modules
  • Created new files for test utilities (dlg_engine_testutils.py) and constants (dlg_engine_testconstants.py)
  • Moved common test code like NMTestsMixIn, DROPManagerUtils, and RESTTestUtils into dlg_engine_testutils.py
  • Defined DEFAULT_TEST_REPRO and DEFAULT_TEST_GRAPH_REPRO in dlg_engine_testconstants.py
daliuge-engine/test/dlg_engine_testutils.py
daliuge-engine/test/dlg_engine_testconstants.py
Updated import statements and removed duplicate code across multiple test files
  • Modified import statements to use the new test utility modules
  • Removed redundant code that was moved to the shared utility modules
  • Updated class inheritance to use the refactored NMTestsMixIn
daliuge-engine/test/manager/test_dm.py
daliuge-engine/test/manager/test_dim.py
daliuge-engine/test/manager/test_rest.py
daliuge-engine/test/manager/test_scalability.py
daliuge-engine/test/deploy/test_common.py
daliuge-engine/test/test_logs.py
daliuge-engine/test/apps/test_dynlib.py
daliuge-engine/test/apps/test_pyfunc.py
daliuge-engine/test/apps/test_dynlib2.py
Modified existing manager and utility classes to support the new test structure
  • Updated NodeManager initialization in NMTestsMixIn
  • Modified ManagerStarter class to accept additional port parameters
  • Updated CompositeManagerRestServer to use the renamed addGraphSpec method
daliuge-engine/dlg/testutils.py
daliuge-engine/dlg/manager/composite_manager.py
daliuge-engine/dlg/manager/rest.py
Added a pull request template
  • Created a new pull request template file with sections for JIRA ticket, type of change, problem description, solution, and checklist
.github/pull_request_template.md

Tips and commands #### Interacting with Sourcery - **Trigger a new review:** Comment `@sourcery-ai review` on the pull request. - **Continue discussions:** Reply directly to Sourcery's review comments. - **Generate a GitHub issue from a review comment:** Ask Sourcery to create an issue from a review comment by replying to it. - **Generate a pull request title:** Write `@sourcery-ai` anywhere in the pull request title to generate a title at any time. - **Generate a pull request summary:** Write `@sourcery-ai summary` anywhere in the pull request body to generate a PR summary at any time. You can also use this command to specify where the summary should be inserted. #### Customizing Your Experience Access your [dashboard](https://app.sourcery.ai) to: - Enable or disable review features such as the Sourcery-generated pull request summary, the reviewer's guide, and others. - Change the review language. - Add, remove or edit custom review instructions. - Adjust other review settings. #### Getting Help - [Contact our support team](mailto:support@sourcery.ai) for questions or feedback. - Visit our [documentation](https://docs.sourcery.ai) for detailed guides and information. - Keep in touch with the Sourcery team by following us on [X/Twitter](https://x.com/SourceryAI), [LinkedIn](https://www.linkedin.com/company/sourcery-ai/) or [GitHub](https://github.com/sourcery-ai).
myxie commented 2 weeks ago

@awicenec this PR adds the unit tests we discussed a few months ago (higher priority work came up in between). I believe there's no major changes to the actual features of this PR, just the test infrastructure.