Open stweil opened 3 months ago
Here is a recent example: https://github.com/kitodo/kitodo-production/actions/runs/10005589215/job/27656601360#step:14:27222. See also the list of failing CI runs on the master branch.
Issue #5378 describes a similar bug which was solved by ignoring the issue in the tests. This could be done here, too, but might not be the correct solution if the problem also exists in production. According to the fix #5380, the problem "is triggered when HTML elements are dynamically added and removed from the DOM (which happens often in Primefaces) such that the current reference to a specific HTML element is not valid any more". Will this happen only in tests?
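To make the quoted trigger concrete: a common way to cope with a reference that became invalid is to repeat the whole lookup-and-act step rather than reuse the old reference. The following is only a minimal sketch of that idea using plain Java; the class `StaleRetry` and the method `retryOn` are hypothetical names and are not part of Kitodo.Production or Selenium. In a real Selenium test one would presumably pass `StaleElementReferenceException.class` and locate the element again inside the supplier.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Kitodo.Production or Selenium. */
final class StaleRetry {

    private StaleRetry() {
    }

    /**
     * Runs the action and repeats it when it throws an exception of the given
     * type, up to maxAttempts times in total. Any other runtime exception is
     * rethrown immediately, so unrelated failures are not hidden.
     */
    static <T> T retryOn(Class<? extends RuntimeException> retryable, int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // The action must look the element up again on every call;
                // reusing an old reference would fail in the same way.
                return action.get();
            } catch (RuntimeException e) {
                if (!retryable.isInstance(e)) {
                    throw e;
                }
                last = e;
            }
        }
        // All attempts failed with the retryable exception type.
        throw last;
    }
}
```

Note that a retry like this only helps when the element is replaced by an identical one. It does not help when the browser ended up on a different page, and it says nothing about whether the same situation can occur in production.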
Another error which also seems to occur randomly is this one:
[INFO] Running org.kitodo.selenium.CalendarST
[INFO ] 2024-07-29 13:00:02.840 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:02.840 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:02.868 [main] MetsService - Reading 2/meta.xml
[INFO ] 2024-07-29 13:00:02.893 [main] ProcessService - No metadata file for indexing: 3/meta.xml
[INFO ] 2024-07-29 13:00:02.894 [main] ProcessService - No metadata file for indexing: 3/meta.xml
[INFO ] 2024-07-29 13:00:02.992 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:02.992 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.026 [main] MetsService - Reading 2/meta.xml
[INFO ] 2024-07-29 13:00:03.063 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.064 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.089 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.090 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.244 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.244 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.273 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.273 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.303 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.304 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.334 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.334 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.364 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.365 [main] ProcessService - No metadata file for indexing: 1/meta.xml
[INFO ] 2024-07-29 13:00:03.413 [main] MetsService - Reading 2/meta.xml
[INFO ] 2024-07-29 13:00:03.460 [main] MetsService - Reading 2/meta.xml
[INFO ] 2024-07-29 13:00:03.502 [main] MetsService - Reading 2/meta.xml
Starting ChromeDriver 127.0.6533.72 (9755e24ca85aa18ffa16c743f660a3d914902775-refs/branch-heads/6533@{#1760}) on port 9813
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
Jul 29, 2024 1:00:04 PM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C
[INFO ] 2024-07-29 13:00:04.233 [main] ProcessService - No metadata file for indexing: 4/meta.xml
[INFO ] 2024-07-29 13:00:04.233 [main] ProcessService - No metadata file for indexing: 4/meta.xml
[INFO ] 2024-07-29 13:00:04.260 [main] MetsService - Reading 4/meta.xml
[INFO] [talledLocalContainer] Jul 29, 2024 1:00:06 PM javax.faces.validator.BeanValidator validate
Warning: [talledLocalContainer] WARNING: cannot validate component with empty value: j_id__md_1
Error: Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.142 s <<< FAILURE! - in org.kitodo.selenium.CalendarST
Error: createProcessFromCalendar Time elapsed: 13.637 s <<< FAILURE!
java.lang.AssertionError: Number of issues in the calendar does not match expected:<4> but was:<3>
at org.kitodo.selenium.CalendarST.createProcessFromCalendar(CalendarST.java:78)
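Whether this count mismatch is a timing problem is not established here. If it were, one option would be to poll for the expected state for a bounded time before asserting, instead of reading the value once. The sketch below uses only the Java standard library; `Poll` and `waitUntil` are hypothetical names, and the condition passed in would be whatever the test uses to count the issues in the calendar.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for illustration; not taken from the Kitodo.Production tests. */
final class Poll {

    private Poll() {
    }

    /**
     * Evaluates the condition repeatedly until it returns true or the timeout
     * has elapsed. Returns true if the condition was met in time, false otherwise.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

If the count still differs after the timeout, the failure is more likely a real difference in state than a race, which would at least narrow down the cause.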
@solth, I'd appreciate it if you could add a milestone to this issue, because it's really annoying when more than 50% of the CI tests fail because of this.
Does anyone have any idea how to fix this?
There are multiple reasons for the StaleElementReference that we have encountered with the Selenium tests in Kitodo.Production over time. The DOM being adjusted dynamically by JSF or PrimeFaces is just one of them (albeit the most common one, I think). Another reason can be that the browser is unable to reach a certain page because of an exception or error that occurred in a previous test or during navigation on an earlier page. Sometimes the elements in a list, for example the options in a pulldown menu, are unordered even though the test expects them in a certain order, so the test only succeeds when the list happens to be in the expected order.
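For the unordered-list case, a test can compare the options without depending on their order. This is a minimal sketch in plain Java; `OptionOrder` and `sameOptionsIgnoringOrder` are hypothetical names, and the lists stand in for whatever option labels a test reads from the pulldown menu.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Kitodo.Production tests. */
final class OptionOrder {

    private OptionOrder() {
    }

    /**
     * Returns true if both lists contain the same entries the same number of
     * times, regardless of order. The input lists are not modified.
     */
    static boolean sameOptionsIgnoringOrder(List<String> expected, List<String> actual) {
        List<String> sortedExpected = new ArrayList<>(expected);
        List<String> sortedActual = new ArrayList<>(actual);
        Collections.sort(sortedExpected);
        Collections.sort(sortedActual);
        return sortedExpected.equals(sortedActual);
    }
}
```

Sorting copies instead of converting to sets keeps duplicates significant, so a menu that shows an option twice would still be reported as different.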
In all these cases the browser ends up on a wrong page (sometimes the error page with a stack trace), where certain expected elements were never present to begin with, thus causing the StaleElementReference (in contrast to the identical components being replaced by JSF). Unfortunately, in most cases it is very difficult to find out which scenario causes the failing tests, because the logs and stack trace often only show the result of a previous error, not the original error itself.
Since these failing tests are currently causing more trouble than they are worth, I would probably set them to @disabled until enough resources are available to analyse the cause of the problem in depth. As explained above, I think most of the reasons for the failing tests will only occur in tests, not in production, since they relate to certain "expectations" in the tests that are not met, so I think disabling these tests for now is acceptable as an exception.
Describe the bug The CI GitHub action does not pass reliably but fails sometimes with one or both of these errors:
As long as the reasons for the random failures are unknown, we must assume that such random failures can also occur in production environments.
To Reproduce Steps to reproduce the behavior:
Expected behavior The CI action should not fail randomly.
Release Git master, observed there for at least several weeks now, including in the oldest CI tests which are still online (= less than 90 days old).