gusbrs / zref-clever

Clever LaTeX cross-references based on zref
LaTeX Project Public License v1.3c

CI tests run time with IoT docker images #20

Closed gusbrs closed 6 months ago

gusbrs commented 9 months ago

Following discussion at https://github.com/gusbrs/zref-clever/pull/15#issuecomment-1803661808

Since zref-clever moved to using the IoT TeX Live docker image for the CI tests GitHub workflow (https://github.com/gusbrs/zref-clever/commit/e1d2acfbc6c1b1321446cf3c5d813a27c889a6d2), my impression has been that workflow runs take considerably longer than before. I recalled runs taking about 10-14 min, which is already considerable, but with the docker images they were easily passing the 20 min mark.

It is true that workflow run times tend to vary a lot, but at @muzimuzhi's suggestion I triggered one run with the IoT docker image and one without it (using zauguin/install-texlive, as I did before). Granted, this is a single data point for comparison, but it is better than nothing and better than my "impression".

Run with IoT docker image (https://github.com/gusbrs/zref-clever/actions/runs/6811932906):
Initialize containers + Update TeX Live: 1m27s + 30s
Run tests: 13m52s
Total time: 15m56s

Run with zauguin/install-texlive (https://github.com/gusbrs/zref-clever/actions/runs/6812269840):
Install TeX Live: 4m24s
Run tests: 8m48s
Total time: 13m21s

So the overall difference seems to be smaller than my previous impression suggested. However, there really does seem to be a difference in performance. It took more than twice as long to install TeX Live from scratch (4m24s) as to initialize the docker containers and update TeX Live (1m57s). Despite that, the zauguin/install-texlive run was still faster overall, because the core "Run tests" task, which is exactly the same in both cases and is the most time-consuming one, runs much faster in this case.

True, a difference of about 2.5 min is hardly enough to revert to the previous approach and lose the convenience offered by the IoT docker image. But, as mentioned, this is a single data point. I'll be keeping an eye on these run times, and if they remain considerable, I may reconsider the use of the docker image.

In the meantime, if anyone has any ideas of why this performance difference occurs, and if there are ways to improve things, it'd be much appreciated.

Edit: These two workflow runs, from when the use of the IoT docker image was introduced, are from the same day and also comparable:

With IoT docker image (https://github.com/gusbrs/zref-clever/actions/runs/4212257093): 14m39s
With zauguin/install-texlive (https://github.com/gusbrs/zref-clever/actions/runs/4212132743): 8m27s
(But the logs are gone and we can no longer compare the parts.)

muzimuzhi commented 9 months ago

Run with zauguin/install-texlive (https://github.com/gusbrs/zref-clever/actions/runs/6812269840):
Install TeX Live: 4m24s
Run tests: 8m48s
Total time: 13m21s

zauguin/install-texlive has built-in cache support, hence in subsequent runs the "Install TeX Live" step should take much less time. See for example workflow run #6811059931 in latex3/latex2e.

A separate workflow can be created to update TeX Live cache regularly.

Perhaps off-topic, zref-clever could also split its 79+1 tests into multiple jobs run in parallel. Currently there're only non-ideal l3build options --first and --last.

BTW, the teatimeguest/setup-texlive-action action supports installing historic TeX Live versions on all three OSes. (Will the tests still pass in TeX Live 2022?)

gusbrs commented 9 months ago

@muzimuzhi Thank you for your thoughts!

zauguin/install-texlive has built-in cache support, hence in subsequent runs the "Install TeX Live" step should take much less time. See for example workflow run #6811059931 in latex3/latex2e.

A separate workflow can be created to update TeX Live cache regularly.

These would presumably further reduce the run time for the zauguin/install-texlive alternative, and thus increase the difference from the IoT docker image, right?

Btw, I did previously use (some) caching for that, and it was dropped in that PR of yours. Do you remember the reason why you dropped it?

Perhaps off-topic, zref-clever could also split its 79+1 tests into multiple jobs run in parallel. Currently there're only non-ideal l3build options --first and --last.

I also don't see how l3build could handle that. Furthermore, I actually do have some qualms about splitting the checks from the docs, as I currently do, because (I think) I end up doing a costly operation twice (downloading the image, initializing the container, installing TL, etc.). It spares me some time, but I'm unsure if this is an undue burden on someone else's server (I'm also not sure whose server that is). Do you happen to know if the current state of things implies that the download and container initialization are being done in duplicate (checks / doc)? And on whose server this is done? (In particular, is the IoT unnecessarily burdened by it?)

BTW, the teatimeguest/setup-texlive-action action supports installing historic TeX Live versions on all three OSes. (Will the tests still pass in TeX Live 2022?)

Here I'm not convinced this is something to worry about. At least given how packages are distributed in the ecosystem. Any regular user who has access to a current version of zref-clever has it from their TeX distribution (TeX Live, MiKTeX, etc.), so they also have access to a current TeX Live version in the same place and by the same means.

I had a similar discussion some time ago with daleif, related to how to decide on the required kernel version when we want to use some new feature.

All in all, I fail to see the need to support and test for an "old TeX distribution with a current zref-clever", because this would only ever happen to someone who installs zref-clever manually. Such users are the exception, and presumably know what they are doing. Other than that, the versions of the TeX distribution and of the package will naturally match, because that's how the community handles the distribution of things.

gusbrs commented 9 months ago

Mhm, I tested zauguin/install-texlive again, now re-enabling caching (thanks for that latex3/latex2e workflow link, btw) and got the run time down to 6m25s (https://github.com/gusbrs/zref-clever/actions/runs/6816911621). That's quite the difference now... I'd say this is a keeper, unless someone has better insight into why it takes longer with the docker image and how to improve it. (I'll keep the issue open for some time to see if anyone chimes in.)

muzimuzhi commented 9 months ago

Btw, I did previously use (some) caching for that, and it was dropped in that PR of yours. Do you remember the reason why you dropped it?

Starting with v2, zauguin/install-texlive auto restores and caches the texlive installation, see https://github.com/zauguin/install-texlive/commit/0c87efffa145ce68b6c7baf65f639dab0c4bd3ba. That's why I dropped the caching steps in https://github.com/gusbrs/zref-clever/commit/4173c238a797e03f471764b29324fc53055364af.

But GitHub only retains caches that have been accessed within the last 7 days, hence to make sure there is always a cache to restore, you may need another workflow running on a regular basis, triggered by on.schedule.
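A minimal sketch of such a keep-alive workflow, under stated assumptions: the file name, the workflow and job names, the cron expression and the package list are all placeholders, not taken from this repository. It also assumes that the cache of zauguin/install-texlive depends on the package list, so the list here would need to match the one in the main workflow.

```yaml
# Hypothetical file: .github/workflows/texlive-cache.yaml
name: Refresh TeX Live cache

on:
  schedule:
    # Days 1, 4, 7, ... of each month at 04:00 UTC, so the gap between
    # runs stays well below the 7-day cache eviction window.
    - cron: '0 4 */3 * *'
  # Also allow triggering the refresh by hand from the Actions tab.
  workflow_dispatch:

jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      - name: Install TeX Live (restores the cache, or rebuilds it on a miss)
        uses: zauguin/install-texlive@v3
        with:
          # Placeholder list; assumed to need the same packages as the
          # main workflow so that both share one cache entry.
          packages: >
            l3build latex-bin
```

Scheduled workflows run on the default branch, and caches created there can be restored by runs on other branches, which is why a single job like this may be enough.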

As you have already applied, latex3/latex2e uses a different strategy since its large test suite is split into 33 jobs. Note the root, first job "Update TeX Live" (or "Install TeX Live" in https://github.com/gusbrs/zref-clever/actions/runs/6816911621) may still take the time to do a clean installation if no cache is hit. Therefore 6m25s is the best case, not the average case. You can emulate the worst case by manually deleting the caches listed in https://github.com/gusbrs/zref-clever/actions/caches and then triggering a rerun.

Do you happen to know if the current state of things implies that the download and container initialization are being done in duplicate (checks / doc)? And on whose server this is done? (In particular, is the IoT unnecessarily burdened by it?)

My guess is yes, since "Regression tests" and "Documentation" jobs are run in parallel.

Seems it's on GitLab's server, see https://gitlab.com/islandoftex/images/texlive/container_registry. The GitLab Container Registry is free for everyone (all tiers, all offerings).

BTW, the teatimeguest/setup-texlive-action action supports installing historic TeX Live versions on all three OSes. (Will the tests still pass in TeX Live 2022?)

Here I'm not convinced this is something to worry about.

I meant that teatimeguest/setup-texlive-action makes it easier to test whether the regression tests run slower in TeX Live 2023 than in TeX Live 2022. But with your latest data, it seems unnecessary.

gusbrs commented 9 months ago

Starting with v2, zauguin/install-texlive auto restores and caches the texlive installation, see zauguin/install-texlive@0c87eff. That's why I dropped the caching steps in 4173c23.

[...]

As you have already applied, latex3/latex2e uses a different strategy since its large test suite is split into 33 jobs.

That's what I get for approaching the problem in trial-and-error fashion, instead of actually studying how things work. ;-) Let me save face and blame the "limited resources"...

Thanks for pointing that out. I might revert that commit then. Though handling the cache explicitly like that does have one benefit, which is that the costly installation step is done only once for both checks and doc. Anyway, I'll test further.

But GitHub only retains caches that have been accessed within the last 7 days, hence to make sure there is always a cache to restore, you may need another workflow running on a regular basis, triggered by on.schedule.

[...] Note the root, first job "Update TeX Live" (or "Install TeX Live" in https://github.com/gusbrs/zref-clever/actions/runs/6816911621) may still take the time to do a clean installation if no cache is hit. Therefore 6m25s is the best case, not the average case. You can emulate the worst case by manually deleting the caches listed in https://github.com/gusbrs/zref-clever/actions/caches and then triggering a rerun.

Yes, I understand, of course, that that time depended on the cache already being there. But the "worst case" is still faster. And if I'm doing a batch of work in the repo, I get the benefit of the cache on top of that. This seems good enough; I'm not sure running a scheduled job just to keep the cache alive is warranted here (for my needs, that is).

I do think the use of the docker image is conceptually superior; it is arguably a more thorough and proper solution. The main practical benefit is not having to manually curate a list of packages to install, which is a pain, since the way to do it is through repeated (failed) attempts to run the workflow on GitHub. But, once that's settled, it works too. And the current long CI run times have been bugging me.

Actually, another thing I'm considering is dropping the GitHub workflow altogether. When I added it, I did so because it seemed cool, and I wanted to learn about it. But, in hindsight, I've been getting little benefit from it. I tend to run the tests locally anyway. For releases, I must do so to prepare the package with l3build. The only difference here is that I get TeX Live versions of zref-vario and zref-check on GitHub and get my development versions locally. But that's really a further advantage of running locally, since occasionally there's a coordinated change between two of them, and the test fails on GitHub because of that. The other benefit of the workflow is to support a (very rare) PR. All in all, I'm not so sure if it's worth it.

My guess is yes, since "Regression tests" and "Documentation" jobs are run in parallel.

Seems it's on GitLab's server, see https://gitlab.com/islandoftex/images/texlive/container_registry. The GitLab Container Registry is free for everyone (all tiers, all offerings).

That's about what I suspected. Thanks for the explanation.

GitLab's servers then, makes sense. Better than burdening IoT's ones, but still.

I meant that teatimeguest/setup-texlive-action makes it easier to test whether the regression tests run slower in TeX Live 2023 than in TeX Live 2022. But with your latest data, it seems unnecessary.

Then I'm afraid I had missed your point in this, and still do. Care to elaborate on this a little further?

muzimuzhi commented 9 months ago

Perhaps off-topic, zref-clever could also split its 79+1 tests into multiple jobs run in parallel. Currently there're only non-ideal l3build options --first and --last.

I also don't see how l3build could handle that.

Sorry I missed this one in my last comment.

Currently all 79+1 tests are run in a single job named "Regression tests", but they don't have to be. For the 79 tests in ./testfiles, one can use l3build check -c build to test them all, or can use

l3build check --last <name-of-nth-test>
l3build check --first <name-of-nth-test>

to check the tests in two steps, at the cost of running the nth test twice. Then the workflow can be set to start two parallel jobs for regression tests.

jobs:
  check:
    strategy:
      matrix:
        include:
          - id: 1
            command: |
              l3build check -c build --last zc-class-scrreprt01
          - id: 2
            command: |
              l3build check -c build -q --first zc-class-scrreprt01
              l3build check -c build-4runs -q

    name: Regression tests (${{ matrix.id }})
    runs-on: ubuntu-latest

    steps:
      - name: Install TeX Live
        uses: zauguin/install-texlive@v3
        with:
          packages: ${{ env.ZC_PACKAGE_LIST }}

      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Run tests
        run: ${{ matrix.command }}

      - name: Archive failed test output
        # ...

This provides another way to speed up l3build check. See also https://github.com/latex3/latex2e/issues/1073 and https://github.com/latex3/l3build/issues/314.

muzimuzhi commented 9 months ago

Actually, another thing I'm considering is dropping the GitHub workflow altogether. [...] The other benefit of the workflow is to support a (very rare) PR. All in all, I'm not so sure if it's worth it.

Another benefit: by setting a scheduled workflow run, it's easier to catch compatibility problems and uncover the need to update test files (.tlg) in time. It's most helpful when the base package is not in active development anymore, or its implementation and/or test suite is sensitive to changes in dependencies and/or packages used together. (Of course this can be done locally too.)

For example, CTeX-org/ctex-kit is relatively stable and its test suite compares the output of \showbox and/or \showoutput a lot, hence it is sensitive to kernel changes. Thus I set its CI to run weekly, and since then it has helped to catch failed checks in time and keep each adjustment small and clear.

I meant that teatimeguest/setup-texlive-action makes it easier to test whether the regression tests run slower in TeX Live 2023 than in TeX Live 2022. But with your latest data, it seems unnecessary.

Then I'm afraid I had missed your point in this, and still do. Care to elaborate on this a little further?

It's already outdated/useless.

In https://github.com/gusbrs/zref-clever/pull/15#issuecomment-1803729498 I speculated that

The workflow runs in the current repo became slower ~7 months ago, which seems to be the time when TeX Live 2023 was released.

To check against it, one way is to set the workflow to install TeX Live 2022 then...
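For reference, a rough sketch of what such a comparison could look like, under stated assumptions: the `version` and `packages` inputs of teatimeguest/setup-texlive-action and the `@v3` tag are written from memory and should be checked against that action's README, and the package list and the plain `l3build check -q` call are placeholders rather than this repository's actual settings.

```yaml
jobs:
  check:
    strategy:
      matrix:
        # Run the same test suite against two TeX Live releases.
        texlive: [2022, 2023]

    name: Regression tests (TeX Live ${{ matrix.texlive }})
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Install TeX Live ${{ matrix.texlive }}
        # Assumed input names; verify against the action's documentation.
        uses: teatimeguest/setup-texlive-action@v3
        with:
          version: ${{ matrix.texlive }}
          # Placeholder package list.
          packages: >-
            l3build latex-bin

      - name: Run tests
        run: l3build check -q
```

Comparing the "Run tests" step duration between the two matrix jobs would then show whether the slowdown tracks the TeX Live release.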

gusbrs commented 9 months ago

Sorry I missed this one in my last comment.

Currently all 79+1 tests are run in a single job named "Regression tests", but they don't have to be. For the 79 tests in ./testfiles, one can use l3build check -c build to test them all, or can use

l3build check --last <name-of-nth-test>
l3build check --first <name-of-nth-test>

Interesting, thank you very much once again. This technique would also apply to the IoT docker images. Nice, but I think it is well beyond my "GitHub-actions-fu", so I'd be wary of adding this complexity to the task. Still, I'm happy to have this alternative as a reference here.

Also, I further reduced the run times by dropping testing on dev formats. Now we're at ~4min cached / ~8min no cache. Much better, and acceptable.

Btw, TIL that my regression tests were taking as much time as those of the LaTeX kernel... Talk about self-taught rookies' insecurities and over-testing. ;-)

gusbrs commented 9 months ago

Another benefit: by setting a scheduled workflow run, it's easier to catch compatibility problems and uncover the need to update test files (.tlg) in time. It's most helpful when the base package is not in active development anymore, or its implementation and/or test suite is sensitive to changes in dependencies and/or packages used together. (Of course this can be done locally too.)

For example, CTeX-org/ctex-kit is relatively stable and its test suite compares the output of \showbox and/or \showoutput a lot, hence it is sensitive to kernel changes. Thus I set its CI to run weekly, and since then it has helped to catch failed checks in time and keep each adjustment small and clear.

That's a good point. I had been doing this kind of check to catch upstream changes locally. But, as long as I do keep the CI workflow, you have me convinced on running things on schedule too.

In #15 (comment) I speculated that

The workflow runs in the current repo became slower ~7 months ago, which seems to be the time when TeX Live 2023 was released.

To check against it, one way is to set the workflow to install TeX Live 2022 then...

Ah! I finally understood what you meant. I had totally lost this train of thought. Though I think this would be a little too much for this check.

All in all, and again, thank you very very much for your comments! Insightful and useful as usual.

gusbrs commented 6 months ago

I think this is a fair time to close this one. I'll be glad to reopen it, though, if any other thoughts or ideas come up.