freebsd / kyua

Testing framework for infrastructure software
BSD 3-Clause "New" or "Revised" License
147 stars 42 forks source link

Add FreeBSD Jail execution environment support #224

Closed ihoro closed 5 days ago

ihoro commented 7 months ago

The main work and review was conducted here: https://reviews.freebsd.org/D42350. The respective mailing list discussion: https://lists.freebsd.org/archives/freebsd-hackers/2024-February/003032.html.

ihoro commented 7 months ago

I've prepared the simplest conditional build to fix Kyua build for non-FreeBSD: https://github.com/ihoro/kyua/commit/8b50c22f196a93ad1a69b1709cb77b5fdc31534e. Let me know if we would like to have a separate freebsd/ src dir for FreeBSD specifics.

It could be the subsequent PR after this one is merged.

I've tested it with FreeBSD 14 RELEASE and Fedora Server 39.

ihoro commented 7 months ago

There is the second candidate for the next PR: https://github.com/ihoro/kyua/commit/4dffd2650f9bc59d77af32275d8ab73573443322.

It's a bigger patch, but it collects FreeBSD specifics into a separate freebsd/ src dir.

ihoro commented 7 months ago

And there is another commit in the queue to become a PR: https://github.com/ihoro/kyua/commit/0d319defa0bf19460a1baa1be06edd0275833f12. It gets Kyua tests back to the pre-patch state: Test cases: 1472 total, 22 skipped, 0 expected failures, 0 broken, 2 failed

ihoro commented 5 months ago

The following topics are closed by current state of the patch:

ihoro commented 4 months ago

@ngie-eign, thank you very much for your time. Going to work on the outcome of the review.

ihoro commented 4 months ago

As much as I'm not a fan of the term "container" because it focuses on the Linux/Solaris technological implementations, I believe that it would be wise to use "container" in lieu of "jail". "jail"s are a concept that are very FreeBSD specific and don't exist outside of FreeBSD/FreeBSD derivatives. I realize this project is hosted in the freebsd GitHub org, but it's a good idea to keep things as generic as possible to avoid having to rearchitect things later if someone contributes an actual Linux or Solaris container implementation (for instance).

Sounds interesting to consider. I have concerns that two levels of abstraction could add extra misunderstanding from end user perspective and extra interface limitations for the possible future execution environments added. The execenv is already meant to be the first abstraction added. It's abstracted from the concepts and details of a specific execution environment, i.e. it does not imply a containerization only. And currently it's designed as a switch, it could be anything (just thinking out loud): execenv=host, execenv=jail, execenv=bhyve, execenv=docker, execenv=kubernetes, execenv=python-virtualenv, execenv=aws, etc. And the key abstraction is that parameters are not generalized, due to it can be hard to do and it may introduce limitations only. For example, jail has a single metadata property -- execenv.jail, what is a string of parameters passed to the jail(8) program. Docker could have multiple properties like execenv.docker.imageurl and execenv.docker.runparams. AWS could have more execenv.aws.* properties to cover authorization and details of the exact service to be used.

There was no intention to provide an automatic containerization feature which uses the local capabilities like jail for FreeBSD or cgroups et al. for Linux. It simply provides a new concept, execution environment, and implements the first non-default (aka host) environment -- FreeBSD jail based one. It's merely an outcome that it can be used to add extra isolation of some FreeBSD tests to improve parallelism of the test suite. A test author is expected to understand the background if jail execution environment is picked for a test.

This is a short elaboration of the current vision. What do you think? Probably I have not caught accurately the idea with execenv=container and execenv.container.params.

ihoro commented 4 months ago

Anyway, now I think that execenv.jail could be re-worked into execenv.jail.params to keep the "namespace" for possible future improvements of jail execenv with extra execenv.jail.* metadata.

ihoro commented 4 months ago

@ngie-eign, I'm leaving the "Resolve conversation" button for you. I guess it must be convenient for you.

ihoro commented 4 months ago

@ngie-eign, I think it's ready for the next round of your review.

ihoro commented 3 months ago

The latest push introduces the following changes:

ihoro commented 3 months ago

I've moved freebsd/* to os/freebsd/. There is no os namespace added, it can be done in future if it's really needed. The main task is to ease work with the history -- by putting the files at the desired location from the very beginning.

markjdb commented 3 months ago

I agree with Igor's view about the container/jail naming. There are lots of interface differences between jails, cgroups, zones, etc.., and it's premature to generalize from a single backend implementation. If in the future someone shows up with an equivalent backend for a different OS, it may have constraints which mean that a generic "container" engine will need backwards-incompatible changes to support both jails and the new container type. Meanwhile, this feature provides major benefits to FreeBSD src developers today - with very little effort we can use this to speed up network tests substantially.

As a test, I integrated this PR into FreeBSD and tried running the pf tests with and without execenv=jail. In bhyve with parallelism=4, it takes ~960s without this patch (most tests are serialized because of jail name clashes), and ~360s with the patch. All I had to do was add two lines to a makefile. I believe kp@ has seen similar improvements.

@ngie-eign is there anything I can do to help speed up review here? I spent a fair bit of time looking at early iterations of the patch and would quite like to see this feature progress into FreeBSD. Should we perhaps import it directly into FreeBSD and refine it there, then reconcile the two copies of kyua later?

markjdb commented 3 months ago

@ihoro I believe there is some kind of memory leak in the patch. So far I narrowed it down to something in the FreeBSD sbin tests. If I run them with -v parallelism=4, something causes kyua to consume a huge amount of memory, triggering the OOM killer. In particular, nothing there is using execenv=jail. Please let me know if I can help narrow it down further.

ihoro commented 3 months ago

@ihoro I believe there is some kind of memory leak in the patch. So far I narrowed it down to something in the FreeBSD sbin tests. If I run them with -v parallelism=4, something causes kyua to consume a huge amount of memory, triggering the OOM killer. In particular, nothing there is using execenv=jail. Please let me know if I can help narrow it down further.

Thank you for the extra attention and additional testing. I guess there is no kyua report --verbose to share respectively.

I've tried main on aarch64@qemu with kldload pf and have not re-produced it yet with the following result: Test cases: 637 total, 46 skipped, 0 expected failures, 1 broken, 1 failed Working on the amd64@qemu trial.

Any detail of the case's environment would be helpful. Let me guess that probably bricoler has been used here then probably I will have higher chances to reproduce the same if you provide me with the invocation details. It could be an interesting testing of many things at the same time.

markjdb commented 3 months ago

Any detail of the case's environment would be helpful. Let me guess that probably bricoler has been used here then probably I will have higher chances to reproduce the same if you provide me with the invocation details. It could be an interesting testing of many things at the same time.

Please try cloning bricoler, then run make. Then:

$ ./bricoler run freebsd-src-regression-suite-run --param freebsd-src:url=https://github.com/markjdb/freebsd --param freebsd-src:branch=main-kyua-parallel --param freebsd-src-regression-suite-run:hypervisor=bhyve --param freebsd-src-regression-suite-run:tests=sbin

Note you'll need sudo to use bhyve for now (I just configure bhyve and bhyvectl in sudoers for now on my build system). If you omit the hypervisor parameter, it'll use QEMU instead, which also works, but it's slower.

While tests are running, you can ssh into the VM with bricoler run freebsd-src-regression-suite-run ssh. If I do that, I see this in top:

 2045 root        122    0    14G  5226M CPU2     2   0:15 100.00% kyua -v parallelism=4 

There are some sharp corners here, please feel free to report issues or make suggestions. If you can't make progress, email me and we can debug further. Maybe I made a mistake somewhere.

markjdb commented 3 months ago

@ihoro I believe there is some kind of memory leak in the patch.

Hmm, there is something strange going on. It seems to be related to incremental buildworlds that I use by default in my test runner. If I do a clean build, I can't reproduce any problems, but I still don't understand the root cause.

I will keep digging, but please ignore this for now. Sorry for the noise.

markjdb commented 3 months ago

@ihoro I believe there is some kind of memory leak in the patch. I will keep digging, but please ignore this for now. Sorry for the noise.

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

I would still really like to see this land soon. On my test systems, with -v parallelization=4, the full test suite takes 3-4 hours to run, which isn't terrible but is rather long. I expect that to drop significantly with this patch and a small amount of work.

ihoro commented 3 months ago

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

Thanks for the additional analysis and the test runs.

Just thinking out loud: probably, we could add an extra logic to bricoler to capture and report such cases, with collecting of kyua core dump and related files if necessary (like kyua and kyua.debug binaries). It could provide us with a clue to build hypothesis of the lurking defect. Kyua is an important tool for the FreeBSD Test Suite, hence it should not be a bad thing to have some logic hard-coded specifically for the kyua test runs.

In addition to that, we could limit the memory of the kyua process with 200MB or something. Just to make it crash sooner without waste of resources and time. Just a quick idea, currently I do not have intuition of normal memory usage spikes kyua may have, I saw something steady like 11MB for the sbin tests.

ihoro commented 3 months ago

I would still really like to see this land soon. On my test systems, with -v parallelization=4, the full test suite takes 3-4 hours to run, which isn't terrible but is rather long. I expect that to drop significantly with this patch and a small amount of work.

From the organizational point of view, my current vision is that only the following topics need closure before the merge:

  1. The comments raised by @ngie-eign . I have a feeling that I covered most of them in a single iteration, and around 10% of them may require another iteration of discussion/change.
  2. The execenv.container idea discussion is not resolved yet. My intuition is that @ngie-eign could accept our vision, i.e. it would not require additional work.

And I propose to move the following topics to the subsequent PRs:

  1. syntax(N) version bump question. I have a proposal to mark such tests as failed/whatever. Having kyua process failure during Kyuafile parsing stage looks less practical, when the whole test suite is rendered unusable. There are no troubles expected from the FreeBSD project perspective, @kprovost and I had a discussion of "MFC dilemma" and it looks like the MFC of the latest Kyua version should cover it, i.e. we won't have to balance between old and new Kyua.
  2. New tests of the Kyua itself to cover the changes.
  3. Kyua version bump. To allow downstreams to sync respectively, e.g. homebrew formula's maintainer could provide the respective upgrade if there is the next official version GA. But I have a feeling of some debt collected so far like Linux support repairing need, etc. This is definitely a separate topic.
markjdb commented 3 months ago

I'm pretty sure there is a latent bug in kyua somewhere, related to -v parallelization. I've been running the FreeBSD src test suite in a loop with parallel tests, collecting bug fixes (mostly test bugs/hidden assumptions), and I've seen a couple of OOM kills. Again, sorry for the noise, it seems unrelated to this patch.

Thanks for the additional analysis and the test runs.

Just thinking out loud: probably, we could add an extra logic to bricoler to capture and report such cases, with collecting of kyua core dump and related files if necessary (like kyua and kyua.debug binaries). It could provide us with a clue to build hypothesis of the lurking defect. Kyua is an important tool for the FreeBSD Test Suite, hence it should not be a bad thing to have some logic hard-coded specifically for the kyua test runs.

Absolutely. In fact, I occasionally see kyua crash with one of several assertions on arm64, also not understood yet. :(

sbin/nvmecontrol/basic:io_passthru  ->  passed  [2.780s]                                                                                                                                                                                                                                                                      
sbin/mdconfig/mdconfig_test:attach_vnode_non_explicit_type  ->  passed  [3.087s]                                                                                                                                                                                                                                              
*** /usr/home/markj/bricoler/freebsd-src/src/contrib/kyua/utils/optional.ipp:175: Precondition check failed: _data != NULL                                                                                                                                                                                                    
*** Fatal signal 6 received

I plan to extend the freebsd-src-regression-suite job to automatically fetch kyua.core and open it in a debugger. Similarly, the VM console watcher will soon be able to automatically detect kernel panics and generate a report of that.

In addition to that, we could limit the memory of the kyua process with 200MB or something. Just to make it crash sooner without waste of resources and time. Just a quick idea, currently I do not have intuition of normal memory usage spikes kyua may have, I saw something steady like 11MB for the sbin tests.

From watching it in top(1), I think there is some specific race condition which triggers a quick blowup of kyua's memory consumption. That is, memory usage is usually below ~12MB, and at some point it quickly starts growing.

ihoro commented 1 month ago

@ngie-eign do you have a vision how we could proceed with the Pull Request? Currently it's aligned with the code committed to the freebsd-src/main. It looks less productive to continue with the related improvements due to this is not the only diff now between freebsd-src:contrib/kyua and kyua.

ngie-eign commented 5 days ago

Apologies for being so incommunicado. I'll merge this change now to sync freebsd:main and this repo. It would be good to have followup issues for any items pointed out that haven't been resolved in this PR. Thanks!

ihoro commented 4 days ago

Thanks for your time and volunteer effort. The subsequent PRs should be much smaller and easier to discuss.