containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
22.37k stars 2.31k forks source link

pkg/machine/e2e: Remove unnecessary copy of machine image. #23068

Closed ashley-cui closed 1 week ago

ashley-cui commented 1 week ago

Stop copying the pre-pulled uncompressed machine disk into the individual test dir. The machine pull code already makes a copy of the disk into the test's HOMEDIR/.local/share/containers/podman/machine, and works off that copy.

Before the change: TESTDIR/<image> is copied to TESTDIR/podman_test/<image> by the test, and then podman machine copies the image to TESTDIR/podman_test/.local/share/containers/podman/machine/provider/<image>

After the change: TESTDIR/<image> is copied to TESTDIR/podman_test/.local/share/containers/podman/machine/provider/<image> by podman machine

The image that is actually run is at TESTDIR/podman_test/.local/share/containers/podman/machine/provider/<image> in both instances.

Does this PR introduce a user-facing change?

None
openshift-ci[bot] commented 1 week ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashley-cui

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/containers/podman/blob/main/OWNERS)~~ [ashley-cui] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
ashley-cui commented 1 week ago

Initial local run of one test file, but lets see the performance in the full CI run.

Before change: make localmachine FOCUS_FILE=basic_test.go 53.08s user 41.46s system 32% cpu 4:51.64 total

After change: make localmachine FOCUS_FILE=basic_test.go 36.01s user 19.68s system 28% cpu 3:17.05 total

Luap99 commented 1 week ago

Me wonders why bud ... tests are triggered on this PR. I would have assumed my new skip logic to skip bud here, i.e. on https://github.com/containers/podman/pull/23064 I see both machine and bud skipped so I assumed it works.

This PR touches machine so running machine test is correct but why would it trigger bud here?!

Luap99 commented 1 week ago

Comparing to https://cirrus-ci.com/build/5300635074035712 it doesn't look like it made a difference for linux. However macos was almost 60 mins there and 50 mins in your machine reset PR (https://github.com/containers/podman/pull/22941), here it is 34 mins so that looks like a very good speed up. I have not looked at macos timings much so I don't know the run to run variance but it looks like a good start.

edsantiago commented 1 week ago

This PR touches machine so running machine test is correct but why would it trigger bud here?!

Base is very very old, does not include your changes

Luap99 commented 1 week ago

This PR touches machine so running machine test is correct but why would it trigger bud here?!

Base is very very old, does not include your changes

Well then I have a bug report for you :) Your logformatter base commit is showing the wrong commit then, i.e. https://api.cirrus-ci.com/v1/artifact/task/5807831486562304/html/sys-remote-rawhide-root-host-sqlite.log.html lists 9ffac3317855 as base. Unless I am misunderstanding what "base" means. For me it would be last commit on the branch (PR) that is already in main.

ashley-cui commented 1 week ago

Rebased against main, and going to look at the test run from that. If the speed gains are lasting then I'll mark as ready for review :)

Luap99 commented 1 week ago

And now the skips are working, as this only touches machine test code no sys/int/bud tests triggered.

ashley-cui commented 1 week ago

Mac test is extremely flakey, but I can't think of a reason why this change would do that, going to dig a little .

edsantiago commented 1 week ago

Machine flake is #22551

ashley-cui commented 1 week ago

HyperV: 10 min speedup Linux: 1-5 min speedup Mac: 20-30 min speedup WSL: 1-5 min speedup

Looks like it sped up the Mac tests a lot, which makes sense, since those machines were still slightly io bound. Ready for review!

Luap99 commented 1 week ago

HyperV: 10 min speedup Linux: 1-5 min speedup Mac: 20-30 min speedup WSL: 1-5 min speedup

Looks like it sped up the Mac tests a lot, which makes sense, since those machines were still slightly io bound. Ready for review!

At the least for the other PRs I checked I do not see that much speed up on linux/wsl. There is a lot of variance in the run two run timings so it is hard to tell of course. Of course this is a good to start but we have to get macos/hyperv to around 20 min to achieve our 30 min overall CI time goal.

Anyhow LGTM of course

rhatdan commented 1 week ago

/lgtm