Remove new-year race condition from "baseline testing".

Requested Feature: (short description)

The baseline tests make innocent pull requests fail early each year. Kindly fix the race condition.

Related Area: (eg. tasks, compilers, configuration, templates…)

Tests.

Do you want to contribute this yourself as a pull request? (don’t worry about it if you don’t want to/can’t — someone else can take care of it)

[✔] Yes, I have already written code for it. See #3724.
[ ] Yes, I don’t have code ready yet (that’s okay!)
[ ] No (that’s okay too!)

Does this feature affect backwards compatibility? If yes, in what way?

No, it does not. This concerns only tests.

Rationale and full description: (why should it be added to Nikola?)

The baseline test has a race condition. It is believed to trigger when both of the following hold:

The pull request is entered early in the year.
The repo maintainers have not yet adjusted the baseline test to the new year.

You can see that race condition at work in one early-2024 test run, wrongly blaming pull request #3722.

This is not super-critical, as the baseline test is not required of a pull request.

Implementation idea

Integrate a filter that automatically adjusts the year number at that precise place where the diff (wrongly) finds it.
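
A minimal sketch of such a filter, in Python. It assumes the year appears in a copyright notice such as `© 2023` in the generated HTML; the exact place where the diff finds the year is not spelled out here, so the pattern and the names `normalize_year`, `same_modulo_year`, and `YEAR_PLACEHOLDER` are hypothetical.

```python
import re

# Hypothetical placeholder that replaces the year on both sides of the diff.
YEAR_PLACEHOLDER = "YYYY"

# Assumption: the year directly follows a copyright sign ("©", "&copy;" or "(c)").
_COPYRIGHT_YEAR = re.compile(r"(©|&copy;|\(c\))(\s*)\d{4}", re.IGNORECASE)


def normalize_year(text: str) -> str:
    """Replace the four-digit year after a copyright sign with a placeholder."""
    return _COPYRIGHT_YEAR.sub(
        lambda m: m.group(1) + m.group(2) + YEAR_PLACEHOLDER, text
    )


def same_modulo_year(baseline: str, built: str) -> bool:
    """Compare two documents while ignoring the copyright year."""
    return normalize_year(baseline) == normalize_year(built)
```

Restricting the pattern to the copyright notice keeps other dates, such as post timestamps, under test, so a genuine regression there would still show up in the diff.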

Prospect: More stabilization. (Not necessarily part of this feature request.)

Nikola is a Python project. The test could be ported from shell to Python. subprocess.run is our friend.

It is notoriously difficult to get shell script error handling right, and the baseline test is no exception. For example, when the script is run from a working directory other than the intended one, the call to scripts/getpyver.py fails, but the script continues as if nothing were wrong. The filtering could be done as part of porting the script to Python.
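
To illustrate the error-handling point, here is a minimal sketch of how a Python port could run its steps. `run_step` and `run_python` are hypothetical helper names, not part of the existing scripts. `subprocess.run(..., check=True)` raises `subprocess.CalledProcessError` on a non-zero exit status, so a failing step stops the test instead of being ignored, and the explicit `cwd` removes the dependence on the caller's working directory.

```python
import subprocess
import sys


def run_step(args, cwd):
    """Run one command in `cwd`; raise CalledProcessError if it exits non-zero."""
    result = subprocess.run(
        [str(a) for a in args],
        cwd=cwd,
        check=True,           # a failing step aborts instead of being ignored
        capture_output=True,  # keep stdout/stderr for diagnostics
        text=True,
    )
    return result.stdout


def run_python(args, cwd):
    """Run a Python script or option list with the current interpreter."""
    return run_step([sys.executable, *args], cwd)
```

A port might then call, for instance, `run_python(["scripts/getpyver.py"], cwd=repo_root)`, where `repo_root` is a placeholder for the checkout directory and any arguments the script needs are omitted. If the script is missing or fails, the exception surfaces immediately.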

There are many cases where the invariant tests will fail due to reasons outside of our control. A new year is one of them, but there are also new versions of dependencies (we don’t pin versions) — and we can’t catch that as easily. The CI build is scheduled to run every Saturday, so we would be aware of a failure within a week at most. I’ve triggered a new build of the baseline site and restarted the tests of your pull requests.

Implementation-wise, doing it in bash scripts is simple — we just run wget, nikola, and diff. We used to run the baseline testing as an integration test within pytest, but there were issues with how the site is built in that context. Could the script be improved?

I don’t think a patching system is necessary. It’s easier to just re-build the demo site. If you want to try and implement this, feel free — but if this severely complicates the baseline setup, we might decide to reject this.

getnikola / nikola

Remove new-year race condition from "baseline testing". #3723