Open jcurtisk opened 2 years ago
This test only verifies that executing the --help
command does not cause the application to crash or hang.
Assigned to @o3de/sig-core per https://github.com/o3de/o3de/blob/development/.github/CODEOWNERS#L35
Same issue occurs with test_CLITool_PythonBindingsExample_Works. There does not appear to have been any movement in investigating why these tests are intermittently failing. Both of these tests should likely be disabled as it doesn't appear they're adding much value.
I just hit this issue again today 😖
[2022-11-25T11:14:48.732Z] ..\..\..\..\..\..\AutomatedTesting\Gem\PythonTests\smoke\test_CLITool_SerializeContextTools_Works.py::TestCLIToolSerializeContextToolsWorks::test_CLITool_SerializeContextTools_Works FAILED [ 75%]
See https://jenkins.build.o3de.org/blue/organizations/jenkins/O3DE/detail/PR-13383/1/pipeline/827
Log: log(2).txt
Any update on this issue @Kadino and/or @jckand-amzn? Can we please move this to Sandbox while it's investigated otherwise please. Thanks!
I think @o3de/sig-core should maintain any tests of SerializeContextTools. I support off-boarding this test on their behalf.
Anyone from @o3de/sig-core (@lumberyard-employee-dm?) able to remove this for now then if possible please?
Well I'll look into why it is failing.
All these test to is launch an application with the --help option, which only uses AZ_Printf
to output some usage test.
There is no reason it should fail
Running the SerializeContextTools.exe -help
command 10000 times through powershell succeeded without failure.
I ran with the help argument in both the debug and profile configuration.
PS C:\code\o3de\build\windows\bin\profile> for ($i=0; $i -lt 10000; ++$i) { ./SerializeContextTools.exe --help }
Based on my investigation, the issue with this test is within the Python and not the actual SerialzieContextTools
Also launching the application in the profile takes at most 230 milliseconds which is well below the 10 seconds timeout in the python script.
The failure that occurred on Jenkins is due to timeout, which points to the machine being overloaded with resource utilization that it took more than 10 seconds to launch the application.
[2022-11-25T11:14:48.732Z] ================================== FAILURES ===================================
[2022-11-25T11:14:48.732Z] _ TestCLIToolSerializeContextToolsWorks.test_CLITool_SerializeContextTools_Works _
[2022-11-25T11:14:48.732Z] D:\workspace\o3de\AutomatedTesting\Gem\PythonTests\smoke\test_CLITool_SerializeContextTools_Works.py:25: in test_CLITool_SerializeContextTools_Works
[2022-11-25T11:14:48.732Z] output = subprocess.run([file_path, "--help"], capture_output=True, timeout=10)
[2022-11-25T11:14:48.732Z] D:\workspace\o3de\python\runtime\python-3.10.5-rev1-windows\python\lib\subprocess.py:503: in run
[2022-11-25T11:14:48.732Z] stdout, stderr = process.communicate(input, timeout=timeout)
[2022-11-25T11:14:48.732Z] D:\workspace\o3de\python\runtime\python-3.10.5-rev1-windows\python\lib\subprocess.py:1152: in communicate
[2022-11-25T11:14:48.732Z] stdout, stderr = self._communicate(input, endtime, timeout)
[2022-11-25T11:14:48.732Z] D:\workspace\o3de\python\runtime\python-3.10.5-rev1-windows\python\lib\subprocess.py:1528: in _communicate
[2022-11-25T11:14:48.732Z] raise TimeoutExpired(self.args, orig_timeout)
[2022-11-25T11:14:48.732Z] E subprocess.TimeoutExpired: Command '['D:/workspace/o3de/build/windows/bin/profile\\SerializeContextTools', '--help']' timed out after 10 seconds
[2022-11-25T11:14:48.732Z] ============================== warnings summary ===============================
If a machine is loaded it can take like 100x longer to start a process. The record i've seen is 8 minutes from starting a thread to seeing the first instruction in it. It could be a deadlock though
It can also take especially long if the tool has to load gem dlls.
For test situations where there is a timeout but the test is expected to finish very quickly and rarely, if ever, time out unless something is wrong, making a short timeout can make the test flaky. So usually you want to grossly overestimate the timeout required (as long as the test finishes instantly when its normally working, ie, never actually wait for that full timeout unless somethings wrong)
Well in this case the test isn't loading any dlls. It is launching a main function that prints out the help usage text
FYI this test failed again recently here -
Log output can be found here - log.txt
Describe the bug Test failure on an AR run with unrelated changes.
AutomatedTesting::SmokeTest.smoke::TEST_RUN
failed on the test casetest_CLITool_SerializeContextTools_Works.py::TestCLIToolSerializeContextToolsWorks::test_CLITool_SerializeContextTools_Works
Failed Jenkins Job Information: [Failed pipeline] https://jenkins.build.o3de.org/job/O3DE/job/PR-11793/1/) Windows/test_cpu_profile failed.Attachments log.txt
Additional context This error has been reported before here