Closed Bravo555 closed 7 months ago
Giving a bit of background here on the test framework...
The test frameworks (both the robotframework-c8y and robotframework-devicelibrary use "smart assertions", where an assertion is automatically retried x amount of times until a total duration (timeout) has been used. System tests are generally very flakey as they rely on interactions with real servers where the performance can vary (e.g. a device can be registered in the platform anywhere between 1s to 10s, depending on how loaded the server is...(not exact numbers but more to illustrate the point). So having the smart assertions helps cut down on this flakiness, to make the overall system tests more reliable.
You can think of the "smart assertions" as "eventually I expect this condition to be true" rather than "the condition should be true now!".
The retry mechanism which coordinates how often the failed assertion is retried is defined for the robotframework-devicelibrary, here: https://github.com/reubenmiller/device-test-core/blob/main/device_test_core/retry.py.
So back to the actual problem you are facing...
If you look at the test output, the mqtt-logger is still finding an MQTT message within in the search space even with an explicit 5 second delay between publishing the last measurement...so there are two explanations for this:
Should Have MQTT Messages
assertion uses the timestamp of the start of the test, rounded down to the nearest second, and since the message you are the assertion is detecting is 0.6 seconds paste the rounded down time, it would suggested that the rounded down time is the main issue.You have a few options to fix this, but the core of the problem comes down to the fact that the tests cases are not isolated enough from each other (as the tests are using a suite setup instead of a test setup).
You could solve this by being more explicit with the dateFrom that is being used in the assertion:
Option 1: Use filter on the mqtt message assertion
Should Have MQTT Messages az/messages/events/# minimum=0 maximum=0 message_contains=temperature
Option 2: Be explicit with the time range used in the mqtt message assertion
Publish measurements to unsubscribed topic
${date_from}= Get Unix Timestamp
Execute Command tedge mqtt pub te/device/main///m/ '{"temperature": 10}'
Sleep 5s reason=If a message is not published in 5s, it will never be published.
Should Have MQTT Messages az/messages/events/# minimum=0 maximum=0 date_from=${date_from}
The test frameworks (both the robotframework-c8y and robotframework-devicelibrary use "smart assertions", where an assertion is automatically retried x amount of times until a total duration (timeout) has been used.
Ah, that makes sense, the logic for the retry was not in Should Have MQTT Messages
keyword itself, but used to configure how Robot Framework executes the keywords. Regardless, the rounding of the timestamp leading to inclusion of previous message is also something i suspected, and your examples show this will be an easy fix. Thanks!
Describe the bug
Running the test suite on my machine always fails.
To Reproduce
custom_sub_topics_tedge-mapper-az.robot
suite directly.Publish measurements to unsubscribed topic
) will fail.Expected behavior The second test should succeed.
Additional context
I've found following differences between report from the CI, and tests on my machine:
CI:
my machine:
For some reason,
mqtt-logger
invocation on my machine is repeated about 16 times, with a delay of 2 second between each retry. Incidentally, I found a 2 second delay in paramiko, the SSH library used for connecting to the docker test device:I wonder if for some reason I happen to be hitting this timeout.
Also found that if the keyword succeeds, then the command is executed only once. Looks like it's retried if the condition fails, but I can't find where the code for this logic is placed.