interuss / monitoring

InterUSS Platform USS monitoring tools for federated UTM, including automated testing.
Apache License 2.0

[scenario failure] configurations.dev.message_signing fails #693

Open Shastick opened 5 months ago

Shastick commented 5 months ago

Describe the bug

When run locally with the message_signing configuration, the uss_qualifier runs into a failed check.

To reproduce

This happens locally from main, on an M2 MacBook Pro.

Steps to reproduce the behavior:

  1. make restart-all
  2. cd monitoring/uss_qualifier
  3. ./run_locally.sh configurations.dev.message_signing

Difference from expected behavior

The message_signing test suite is expected to complete successfully, as it does in CI.

Possible solution

Possibly related to the flight intent configurations around start/end times, or to the recent update by @mickmis of some APIs and assumptions around the mock_uss. It's a wild guess though.

System on which behavior was encountered

OS X, M2 MacBook Pro.

Codebase information

% git log -n 1
commit 2f5ca5b4965432a53be5426a4b8bb44024b3d5da (HEAD -> main, origin/main, origin/HEAD)
Author: Julien Perrochet <julien.perrochet@orbitalize.com>
Date:   Tue May 14 20:08:22 2024 +0200

    [uss_qualifier] new scenario to check for forbidden OIR state transitions (#676)

    [uss_qualifier] expand SCD scenario to check for forbidden OIR state transisionts

Output of git status:

% git status
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean

Additional context

Attached sequence from the report: sequence.zip

Screenshot 2024-05-17 at 22 04 24
2024-05-17 19:59:04.144 | WARNING  | monitoring.uss_qualifier.suites.suite:_print_failed_check:73 - New failed check:
  details: 'mock_uss indicated flight planning activity PlanningActivityResult.Rejected
    leaving flight plan FlightPlanStatus.NotPlanned rather than the expected (Activity
    PlanningActivityResult.Completed, flight plan FlightPlanStatus.Planned) with no
    notes

    Severity Severity.High upgraded to Critical because `stop_fast` flag set true in
    configuration'
  documentation_url: https://github.com/interuss/monitoring/blob/2f5ca5b4965432a53be5426a4b8bb44024b3d5da/monitoring/uss_qualifier/scenarios/flight_planning/plan_flight_intent.md#successful-planning-check
  name: Successful planning
  participants:
  - mock_uss
  query_report_timestamps:
  - '2024-05-17T19:59:04.115509Z'
  requirements:
  - interuss.automated_testing.flight_planning.ExpectedBehavior
  severity: Critical
  summary: Flight planning activity unexpectedly PlanningActivityResult.Rejected leaving
    flight plan FlightPlanStatus.NotPlanned
  timestamp: '2024-05-17T19:59:04.143692Z'

2024-05-17 19:59:04.144 | WARNING  | monitoring.uss_qualifier.suites.suite:_run_test_scenario:179 - FAILURE for "Nominal planning: conflict with higher priority" scenario

BenjaminPelletier commented 5 months ago

This scenario is failing in the CI as well because of a mismatch between test configuration and mock environment configuration. uss_qualifier is requesting that the USS plan a flight with priority 100, but the mock_uss being instructed to plan is configured to follow the rules of a locality that doesn't have this priority level (likely US Industry Collaboration). The reason this is not causing the CI to fail is that the message_signing configuration has no validation block to return an error code from uss_qualifier (which is what CI success is based on). There are (at least) 3 ways to fix this:

  1. Change the locality of mock_uss to a locality that supports higher-priority planning
  2. In the message_signing configuration, don't provide any flight intent resources to the message_signing suite which contain high-priority flights
  3. Remove the message_signing configuration from the CI as it is not being maintained, and add a note in the configuration file that it is not being maintained

I think option 2 is probably the best compromise, but we can keep 3 in our back pocket if things become too hard to maintain. @punamverma for visibility as well
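To make option 2 concrete, here is a minimal sketch of dropping flight intents whose priority the mock_uss locality cannot handle before they are handed to the suite. The dictionary layout, the `priority` key, the intent names, and `MAX_SUPPORTED_PRIORITY` are all assumptions for illustration; this is not the actual uss_qualifier resource format.

```python
from typing import Any, Dict

# Assumption: the locality configured for mock_uss only supports priority 0.
MAX_SUPPORTED_PRIORITY = 0


def drop_high_priority_intents(
    intents: Dict[str, Dict[str, Any]],
    max_priority: int = MAX_SUPPORTED_PRIORITY,
) -> Dict[str, Dict[str, Any]]:
    """Keep only the intents whose priority does not exceed max_priority.

    Intents without a "priority" key are treated as priority 0.
    """
    return {
        name: intent
        for name, intent in intents.items()
        if intent.get("priority", 0) <= max_priority
    }


# Hypothetical intents: one default-priority flight and one priority-100 flight.
intents = {
    "flight_default": {"priority": 0},
    "flight_high_priority": {"priority": 100},
}

kept = drop_high_priority_intents(intents)
print(sorted(kept))
```

In practice the same effect would more likely come from pointing the configuration at a flight intents resource that simply contains no high-priority flights, rather than filtering at runtime.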

For options 1 or 2, we should add a validation block so that this configuration will cause CI failures when it's broken.
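As a rough illustration of why a validation block matters for CI, the sketch below maps a list of failed checks to a process exit code. The function name, the record layout, and the severity ordering are assumptions made for this example; it is not how uss_qualifier implements validation.

```python
from typing import Dict, List

# Assumed severity ordering, from least to most severe.
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]


def exit_code_for(failed_checks: List[Dict[str, str]], max_allowed: str = "Low") -> int:
    """Return 1 if any failed check is more severe than max_allowed, else 0."""
    threshold = SEVERITY_ORDER.index(max_allowed)
    too_severe = [
        check
        for check in failed_checks
        if SEVERITY_ORDER.index(check["severity"]) > threshold
    ]
    return 1 if too_severe else 0


# The failed check reported above, reduced to the two fields used here.
failed_checks = [{"name": "Successful planning", "severity": "Critical"}]

print(exit_code_for(failed_checks))  # non-zero, so CI would fail
print(exit_code_for([]))             # zero, so CI would pass
```

Without some equivalent of this step, the run exits successfully even when checks fail, which matches the behavior described for the message_signing configuration in CI.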

punamverma commented 5 months ago

@BenjaminPelletier Option 2 looks good to me.

Shastick commented 3 months ago

While no longer providing flight intents that contain flights with different priorities resolved some of the failed checks, one issue remains.

The GetOpResponseDataValidationByUSS scenario is consistently failing in one of these ways:

Here is a screenshot for the first case:

Screenshot 2024-06-28 at 18 48 35

And here if the configuration is moved up so the scenario runs first:

Screenshot 2024-06-28 at 18 49 53

My current understanding is the following:

BenjaminPelletier commented 3 months ago

Our mock_uss should behave correctly, so this is certainly a bug in either mock_uss or the scenario (my money would be on the scenario). A scenario should always succeed regardless of where in the test run it is executed unless there is an explicit note regarding dependencies (e.g., if a scenario documents that it is expecting a clear planning area ensured via PrepareFlightPlanners). To my knowledge, no SSL is used in the CI, so I would be surprised if this is an SSL issue -- what are you seeing that suggests SSL as a problem?