reportportal / agent-Python-RobotFramework

Robot Framework integration for Report Portal
Apache License 2.0

Skipped steps showing as failed in Report Portal UI #145

Closed Rickdickulous closed 1 year ago

Rickdickulous commented 2 years ago

Describe the bug
Steps that are skipped in Robot Framework test execution (IF/ELSE trees) sometimes show as failed in Report Portal UI.

Please let me know if more context is needed. I can't find any particular patterns to make it happen as it seems random. The same tests pass (in Report Portal) 95% of the time, but now and then at random we see this issue.

Below is a screenshot of the steps that are being run and the log output from Report Portal. All these steps passed in the Robot Framework log but are showing as fail in Report Portal. Let me know if you need part of the log or xml output or anything. I can't share full logs so please be specific and I'll get you what you need. Alternatively I can try to recreate the occurrence on sample tests that I can share. Thank you for your help.

bug_example

Steps to Reproduce
Steps to reproduce the behavior:

  1. Run a Robot Framework test with an IF statement that is not executed, with the Report Portal Robot Framework listener called to send results to Report Portal.
  2. Review results in Report Portal and see the test is marked as failed, because the skipped step is marked as failed in the log view.

Expected behavior
Tests that pass, even with skipped steps, are marked as passed.

Actual behavior
Steps that are skipped are sometimes being marked as failed in Report Portal.

Package versions
reportportal-client 5.7.0
robotframework-reportportal 5.2.1

HardNorth commented 2 years ago

@Rickdickulous Unfortunately, my sample test case gives me the correct result:

*** Settings ***
Documentation  Example of skipping a failure statement with 'IF'

*** Variables ***
${SKIP}    True

*** Test Cases ***
Test If Skip
    Log  One passed step to check
    IF   not ${SKIP}
         Fail  This step should be skipped
    END

Screen Shot 2022-06-20 at 5 44 07 PM

I tried this case several times and the result is the same. Do you have any traceback in your logs?

Rickdickulous commented 2 years ago

No, there is no trace in the logs other than what I showed. We run a lot of IF statements in our executions and, as a percentage, don't see this happen often. I'll set mine up to run a sample test like this until it happens so I can send a full log.

Rickdickulous commented 2 years ago

@HardNorth I have another round of info for you.

I set up a Robot Framework test to run the IF statement 1000 times per run. I ran this around 10 times and saw the bug 5 times.

Observations

  1. I noticed that sometimes the results in Report Portal would first show correctly that the test passed, and then shortly after the result would change to a failure. I was able to capture this with screenshots once. These are screenshots from the same launch. result_changed

  2. I was able to capture logs. GitHub wouldn't let me upload HTML or XML files. I converted them to .txt files, so hopefully you can convert them back successfully. The log should be .html and the output .xml.

log-20220620-131820 - Copy.txt output-20220620-131820 - Copy.txt

I was uploading the log file to Report Portal and thought Report Portal might be doing some post-processing on the log, but even after disabling the log file upload I still got the same results.

Rickdickulous commented 2 years ago

@HardNorth - Any update on this? I'm trying to understand if this is a Report Portal bug or a bug within the python agent. Thanks.

Edit: I've started digging into the agent source code. Any tips on where a bug in the agent might be or specific parts to look into would be appreciated.

HardNorth commented 2 years ago

@Rickdickulous Not yet, sorry, need to finish some stuff first.

Rickdickulous commented 2 years ago

@HardNorth - I did extensive testing on this and can't make any sense of it. I set up a local Report Portal server and some test code to reproduce the issue, and was able to reproduce it with debug logging enabled.

For this run the Report Portal UI shows this as the failure: image

Whereas the log shows everything worked fine: image

Here's the log of this occurrence; you'll need to convert it back to HTML. Everything looks right from the Report Portal service and client side as far as I can tell. Can you please confirm? Any suggestions on where to go or who to ask from here? Thanks again.

attachment_118 - Copy.txt

kpreety commented 2 years ago

I am also observing the same issue.

HardNorth commented 2 years ago

@Rickdickulous @kpreety Could you try using the latest version of the agent (5.3.0)? The only difference between this step and the others is that a log event happened on the "token check" step. If the cause is in logging, then 5.3.0 should help, since there are changes exactly in logging.

Rickdickulous commented 2 years ago

@HardNorth - The version numbers confused me a bit but I think I tested the scenario you intended. Here is a screenshot from my pip list.

image

I am still able to reproduce the bug where a skipped step is marked as failed. image

Here is the associated log file. attachment_164.txt

Please let me know how I can assist.

HardNorth commented 2 years ago

@Rickdickulous The latest version is 5.3.0 as I said: PyPI

HardNorth commented 2 years ago

@Rickdickulous And I'm seeing a batch log request in iteration 958: Screen Shot 2022-07-08 at 3 57 44 PM

Rickdickulous commented 2 years ago

Ah glad I clarified. Here we go, latest versions of both. Tests are running now.

image

I did notice the same thing before, with the bug occurring in the step that had the batch log request. The batch logs go out at regular intervals, so I wasn't sure if that was a lead or not.

I'll report back with test results later when I either see the bug again, or have run the scenario enough times to confidently say it's not occurring with the latest version of the agent.

Rickdickulous commented 2 years ago

@HardNorth - The bug is still present using the versions noted in my previous comment. Anecdotally, it does seem to be happening less frequently. The rate of occurrence with the previous versions was 5/105,000, with this run being 1/100,000.

image

attachment_315 - Copy.txt

kpreety commented 2 years ago

Yeah, the issue still exists with the latest version as well.

HardNorth commented 2 years ago

No idea what might be the cause, sorry.

Rickdickulous commented 2 years ago

@HardNorth - Do you have any indication of the scope of this issue? Is it specific to the Robot Framework agent, the Python Report Portal client, or the Report Portal application as a whole? I can't find anything wrong with the Robot Framework agent, so I suspect it's something higher up in the chain. Is this type of issue seen anywhere outside of the Robot Framework use case?

juhheikk commented 1 year ago

I am also observing the same issue.

HardNorth commented 1 year ago

@juhheikk I still need reliable steps to reproduce.

juhheikk commented 1 year ago

We have over 800 test cases in a single output.xml (with reruns for some of the test cases) executed with pabot. After uploading this .xml file to Report Portal, a single test case comes out with Fail status; looking deeper, it is not a real failure, but the skipped steps are causing the test step (kw) to be reported as a failure. And it does not always happen to the same test; the behaviour is random. image image

I tried to reproduce similar IF/END structures as in the above image, but I only got the following; however, there seems to be some "logic" error indicated by this one as well:

*** Test Cases ***
First
  [Documentation]  first testcase
  Log  Use builtin logger

#Second
#  [Documentation]  second testcase
#  Fail  utterly

Third
  [Documentation]  third testcase
  ${var_x}=  Set Variable  ${FALSE}
  IF  ${TRUE}
    Log  In true IF
    IF  ${var_x}
      Log  IN IF
    ELSE
      Testing
    END
  END

*** Keywords ***
Testing
  [Arguments]  ${no_test}=${FALSE}  ${test}=${TRUE}
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${no_test}
    ${status}=  Run Keyword And Return Status  Fail  utterfly
    Log  error
  END
  IF  ${test}
    ${status}=  Run Keyword And Return Status  Log  test
    Log  testing
  END

Run with pabot and use post_report to send the results:

$ pabot test.robot
$ python3 post_report.py -v RP_ENDPOINT:http://localhost:8080 -v RP_UUID:xx-xx-xx-xx-xx -v RP_PROJECT:local_demo -v RP_LAUNCH:superadmin_TEST_EXAMPLE -v RP_ATTACH_LOG:True -v RP_SKIPPED_ISSUE:False ./output.xml

None of the "Fail utterly" keywords are even executed, but they are shown as "Passed" image

HardNorth commented 1 year ago

@juhheikk It was already said that it happens with a big number of Test Cases, but that's not something I can use to fix the issue. I already made some synthetic tests without any luck.

And regarding the PASSED status for keywords, this might be an issue I can consider, but that's beside the point of this topic. If you want to start a discussion about that, please file a separate issue.

juhheikk commented 1 year ago

Well, I just thought it could be closely related, since it might not matter whether the SKIPPED ones come out as PASSED or FAILED, as there seems to be randomness involved. I can post the same info under another topic for you, no problem. Created another bug report -> https://github.com/reportportal/agent-Python-RobotFramework/issues/166

juhheikk commented 1 year ago

Found a way to reproduce this problem.

robot file:

*** Test Cases ***
First
  [Documentation]  first testcase
  Log  Use builtin logger

Third
  [Documentation]  third testcase
  ${var_x}=  Set Variable  ${FALSE}
  IF  ${TRUE}
    Log  In true IF
    IF  ${var_x}
      Log  IN IF
    ELSE
      Testing
    END
  END

*** Keywords ***
Testing
  [Documentation]  testing 123
  [Arguments]  ${se}=${EMPTY}  ${so}=${EMPTY}  ${sa}=${EMPTY}  ${sb}=${EMPTY}
  ${query_string}=  Set Variable  ${EMPTY}
  ${query_string}=  Set Variable If  "${se}" != "${EMPTY}"  aa  ${query_string}
  ${query_string}=  Set Variable If  "${so}" != "${EMPTY}"  bb  ${query_string}
  ${query_string}=  Set Variable If  "${sa}" != "${EMPTY}"  cc  ${query_string}
  ${query_string}=  Set Variable If  "${sb}" != "${EMPTY}"  dd  ${query_string}
  ${result}=  Get Stuff  ${query_string[:-1]}
  ${exit_status}=  Set Variable  ${FALSE}
  IF  "${se}" != "${EMPTY}"
    FOR  ${ROUND}  IN RANGE  60
      @{names}=  Set Variable  ${result['names']}
      Log  ${names}
      IF  ${names}
        FOR  ${NAME}  IN  @{name}
          ${exit_status}=  Run Keyword And Return Status  Should Contain  ${NAME}[name]  ${se}
          Exit For Loop If  ${exit_status}
        END
      END
      Exit For Loop If  ${exit_status}
      Sleep  0.5s
      ${result}=  Get Stuff  ${query_string[:-1]}
    END
  END
  [Return]  ${result}

Get Stuff
  [Documentation]  testing
  [Arguments]  ${dummy}
  ${name}=  Create Dictionary  name=test
  ${names}=  Create Dictionary  names=${name}
  ${dict}=  Create Dictionary  result=${names}
  [Return]  ${dict}

  1. Run the test once with "robot test.robot"
  2. Report with "python3 post_report.py -v RP_ENDPOINT:http://localhost:8080 -v RP_UUID:xx-xx-xx-xx-xx -v RP_PROJECT:local_demo -v RP_LAUNCH:superadmin_TEST_EXAMPLE -v RP_ATTACH_LOG:True -v RP_SKIPPED_ISSUE:False ./output.xml"
  3. I had to repeat the reporting part approx. 10-20 times by doing a fast "up arrow" + "enter" combination

image

image

HardNorth commented 1 year ago

OK, let me take a look.

Noordsestern commented 1 year ago

We ran into the same issue:

We cannot reproduce it on demand, but the pattern is this, and it happens at least once in each of our runs:

Some Keyword
    ${response}=    POST    ${some_url}    ${some_payload}
    IF    ${response.status_code} >= 500
        Do Something
    END

I saw once that the IF statement was marked correctly as Skipped, but the POST was still marked as In Progress for a few seconds. Once it was marked as PASSED, the IF switched its state to FAILED.

We do use pabot to run Robot tests in parallel. But I would suppose ReportPortal can handle many messages.

HardNorth commented 1 year ago

@juhheikk Did it 30 times, no luck. Screenshot 2023-06-20 at 4 02 26 PM

HardNorth commented 1 year ago

@Noordsestern I would rather say this is a pabot issue then, not ours.

Noordsestern commented 1 year ago

@vignesh-thangaraj : how did we fix this?

Rickdickulous commented 1 year ago

@Noordsestern I would rather say this is a pabot issue then, not ours.

This issue is not related to pabot. I don't use pabot and observe this issue regularly.

HardNorth commented 1 year ago

@Noordsestern Well, right now I don't really know. But your case actually makes sense: we use just a simple list to track the item structure: https://github.com/reportportal/agent-Python-RobotFramework/blob/develop/robotframework_reportportal/listener.py#L51

@Rickdickulous If pabot (or any plugin / your test code) finishes the same item twice, or uses threads to update keywords (I do see usage of threads in its code), it might cause collisions in item tracking. For threads, that's because lists are not actually thread-safe in Python and do not provide any atomic way to update them. The only reason this does not happen often is the GIL: Python switches between threads every 5 milliseconds, so threads are not actually running in parallel. But when one thread stops somewhere in the middle of removing a list item, and another thread starts running and updates the next item, it gets the same id from the list and therefore messes up the statuses. More on that from Google: https://github.com/google/styleguide/blob/master/pyguide.md#218-threading

Well, this agent is not actually written to handle such situations. And again, it is not our fault here; we just work synchronously.
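
To illustrate the collision described above, here is a minimal Python sketch (not the agent's actual code; ItemStack, finish_unsafe and finish_safe are hypothetical names) showing how an unsynchronized read-then-remove on a shared list can hand two threads the same item id, and how a lock would make the update atomic:

import threading

class ItemStack:
    """Stand-in for plain-list item tracking (illustrative only)."""

    def __init__(self):
        self._items = []               # ids of started, not-yet-finished items
        self._lock = threading.Lock()

    def start(self, item_id):
        self._items.append(item_id)

    def finish_unsafe(self):
        # Non-atomic read-then-remove: if another thread interleaves between
        # the two statements, both threads can observe the same id, so the
        # wrong item ends up being finished and given the wrong status.
        current = self._items[-1]
        self._items.remove(current)
        return current

    def finish_safe(self):
        # Guarding the update with a lock makes it atomic with respect to
        # other threads, so each finish maps to exactly one started item.
        with self._lock:
            return self._items.pop()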

vignesh-thangaraj commented 1 year ago

@Noordsestern As a temp fix, our Report Portal service waits until the execution is completed and then triggers Slack notifications to our Slack channels with the execution metadata. During that process, we pull all the failed test cases and their respective stack traces and mark the test cases/steps as passed if Report Portal responds with an empty stack trace. The assumption is that a failure with an empty stack trace does not make sense to treat as a real failure.

That works perfectly in our/this case.
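
For reference, a rough sketch of that kind of post-processing, assuming a generic ReportPortal REST API shape: the endpoint paths, filter parameters, and payloads below (/api/v1/{project}/item, filter.eq.status, /item/{id}/update), the idea of treating ERROR-level logs as the "stack trace", and the relabel_empty_failures helper are all assumptions for illustration, not verified calls.

import requests

RP_URL = "http://localhost:8080"                       # hypothetical endpoint
PROJECT = "local_demo"                                  # hypothetical project name
HEADERS = {"Authorization": "Bearer xx-xx-xx-xx-xx"}    # API token placeholder

def relabel_empty_failures(launch_id: str) -> None:
    # 1. Fetch all FAILED items of the finished launch (assumed filter syntax).
    items = requests.get(
        f"{RP_URL}/api/v1/{PROJECT}/item",
        params={"filter.eq.launchId": launch_id, "filter.eq.status": "FAILED"},
        headers=HEADERS,
    ).json().get("content", [])

    for item in items:
        # 2. Pull the error logs ("stack trace") attached to the item.
        logs = requests.get(
            f"{RP_URL}/api/v1/{PROJECT}/log",
            params={"filter.eq.item": item["id"], "filter.eq.level": "ERROR"},
            headers=HEADERS,
        ).json().get("content", [])

        # 3. An empty stack trace is treated as a false failure and flipped.
        if not logs:
            requests.put(
                f"{RP_URL}/api/v1/{PROJECT}/item/{item['id']}/update",
                json={"status": "PASSED"},
                headers=HEADERS,
            )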

HardNorth commented 1 year ago

Can someone test this with the agent version 5.4.0?

Rickdickulous commented 1 year ago

I tested this fairly thoroughly last week and did not see this issue where skipped Robot Framework steps are reported as failures in Report Portal. Each test included a loop of 1,000 skipped steps. I ran this test 18 times, for a total of 18,000 skipped steps reported, with no erroneous failures in Report Portal. The one difference is that I got a new laptop that I'm running the tests on.

I will update the package version in our test pipeline and report back here if I see any further cases of this bug.

In the meantime I would consider this bug resolved as of version 5.4.0 as I am unable to reproduce it at this time.

HardNorth commented 1 year ago

Looks like it's fixed.