Security-Onion-Solutions / securityonion

Security Onion is a free and open platform for threat hunting, enterprise security monitoring, and log management. It includes our own interfaces for alerting, dashboards, hunting, PCAP, detections, and case management. It also includes other tools such as osquery, CyberChef, Elasticsearch, Logstash, Kibana, Suricata, and Zeek.
https://securityonion.net

FIX: Reinstall on Ubuntu 18.04 fails on docker install #6467

Closed petiepooo closed 2 years ago

petiepooo commented 2 years ago

Running 2.3.90 so-setup a second time on Ubuntu 18.04 fails while trying to downgrade the Docker version. sosetup.log shows repeated entries of the form:

Reading state information...
python3-docker is already the newest version (2.5.1-1).
docker-ce-cli is already the newest version (5:20.10.5~3-0~ubuntu-bionic).
docker-ce-rootless-extras is already the newest version (5:20.10.5~3-0~ubuntu-bionic).
Suggested packages:
  aufs-tools cgroupfs-mount | cgroup-lite
The following packages will be DOWNGRADED:
  docker-ce
0 upgraded, 0 newly installed, 1 downgraded, 0 to remove and 0 not upgraded.
(100) Command failed with exit code 100; will retry in 10 seconds ...

To work around the issue, I uninstalled docker via the following command (between attempts by so-setup to install it):

apt-get remove docker-ce docker-ce-cli docker-ce-rootless-extras containerd.io

A fix might be to add apt-get's "--allow-downgrades" option when installing the docker packages at https://github.com/Security-Onion-Solutions/securityonion/blob/8990a09d921ae5e87237c0535965d03ded325dbe/setup/so-functions#L1212
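The suggested change could look roughly like the sketch below. It is not the actual so-functions code: the helper name `docker_install_cmd`, the package list, and the version string are assumptions taken from the log excerpt above. The helper only prints the command, so it can be inspected before being run as root.

```shell
#!/bin/sh
# Sketch only: the package names and version are assumptions based on the
# log excerpt in this issue, not copied from so-functions.

DOCKER_VERSION='5:20.10.5~3-0~ubuntu-bionic'

# Print an apt-get command that installs every named package at the given
# version and permits a downgrade if a newer version is already installed.
docker_install_cmd() {
  version="$1"; shift
  cmd="apt-get -y install --allow-downgrades"
  for pkg in "$@"; do
    cmd="$cmd ${pkg}=${version}"
  done
  printf '%s\n' "$cmd"
}

docker_install_cmd "$DOCKER_VERSION" docker-ce docker-ce-cli docker-ce-rootless-extras
```

With `-y` alone, apt-get refuses to downgrade a package non-interactively, which appears to be the failure in the log above. Adding `--allow-downgrades` lets the requested version win without a prompt.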

This is related to #4995, where the issue was due to apt-mark holding the package. You may also want to include the "--allow-downgrades" option when installing the salt packages, since their version is pinned as well and the same failure could occur there if a similar version mismatch arises.

This is the dance one must perform when pinning packages to a specific version. I get why you do it, but still do not think it is wise.

I understand this may be difficult to duplicate on the dev branch, as it depends on the version available in the repo vs. the version pinned by salt. It is also handled very differently on CentOS and thus the official ISO. You may be able to catch this issue by manually testing an install+reinstall on Ubuntu using an older released version of SO or the current version right before you release an update.

TOoSmOotH commented 2 years ago

There are a few reasons why we pin docker and salt. The primary one is to let users update their system packages without worrying about blowing up their grid. Docker, if updated on an auto-update schedule, would cause all of the containers to restart and cause an outage. The other reason is that both salt and docker have had updates that introduced bugs that caused issues with the grid. With the versions locked, we can test extensively before we roll out an update. The main issue, where packages are upgraded and then downgraded, needs to be fixed on Ubuntu. This has already been addressed on CentOS.

petiepooo commented 2 years ago

Hold: yes. Pin: no; I still don't like it. But I also do my own testing post-upgrade and know how to roll back if an upgrade causes issues. Security Onion is not unique in being dependent on the stability of other packages. That's just my opinion, though, and I know it's counter to the goal of making SO easy for non-admin types to run. There are a few other places where "ease" of administration gets in the way of seasoned admins (cough-firewall)... but this is not the place for that discussion.

To return to the topic, I still see a failure during reinstall on Ubuntu 18.04, even when bypassing the docker downgrade issue. I think it's just seeing an "error" due to the influxdb python module patches already being applied to salt-common. I haven't tested it, but I believe running 'apt-get install --reinstall salt-common' would restore those to unpatched versions so setup won't error out when run a second time.
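One way to check this theory before reinstalling is to ask dpkg which packaged files no longer match their recorded checksums. The sketch below is illustrative: the helper name `modified_files` is hypothetical, and it assumes dpkg's default `--verify` output format, in which the third character of the first field is `5` when the MD5 check fails.

```shell
#!/bin/sh
# Read `dpkg --verify` output on stdin and print the paths whose checksum
# differs from the one recorded by the package.
# Assumption: paths contain no spaces, so the last field is the full path.
modified_files() {
  awk 'substr($1, 3, 1) == "5" { print $NF }'
}

# Usage on a real system (requires dpkg and an installed salt-common):
#   dpkg --verify salt-common | modified_files
```

If the patched influxdb module files show up in that list, reinstalling salt-common should restore them to their packaged state.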

Please add reinstallation on Ubuntu Server as a test step to your release procedures. Issues like this make it obvious that test coverage of this scenario is lacking.

TOoSmOotH commented 2 years ago

As I mentioned in my previous comment, re-install is a known issue with Ubuntu and we are looking into fixing it, which is why this issue is still open. The earliest we would address it is in the .100 release. We used to not pin the docker versions, and then this happened: https://github.com/docker/for-linux/issues/810, which pretty much broke all the things when docker auto-updated. Our solution was to lock the version and test before we updated to the latest.

petiepooo commented 2 years ago

I think we're talking past each other here, Mike.

I remember that issue and the pain it caused. I agree that holding the update until it was resolved by Docker would have been easier. Auto-update should absolutely not be enabled for critical dependencies like that, and an apt-mark hold accomplishes exactly that. I still disagree with pinning a specific version so that updates to docker or salt are not possible at all without an SO2 version update. As I mentioned in my previous comment, Held: yes. Pinned: no. But that's just my opinion as a seasoned sysadmin. Why? I may want a salt update immediately (in dev first, before prod) but not have time to update the rest of SO2. I have enough custom configs in place that updating SO2 is not as simple as running soup and walking away; it takes some effort and post-update patching for each deployed system.
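To make the hold-versus-pin distinction concrete, here is a rough sketch of the workflow a hold allows: the package is skipped by routine upgrades, but an admin can still move it to a chosen version deliberately. The helper names `run` and `upgrade_held_package`, the package name, and the `NEW_VERSION` placeholder are all hypothetical, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Deliberately upgrade a held package: release the hold, install the
# requested version, then restore the hold so routine upgrades skip it.
upgrade_held_package() {
  pkg="$1"
  version="$2"
  run apt-mark unhold "$pkg" &&
    run apt-get -y install "${pkg}=${version}" &&
    run apt-mark hold "$pkg"
}

# NEW_VERSION is a placeholder, not a real package version.
upgrade_held_package salt-minion NEW_VERSION
```

This only covers the apt side. If a salt state also enforces a specific version, the next highstate may try to revert the package, which is the pinning behaviour under discussion.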

I consider this a show-stopper. The ability to run so-setup a second time is a hard requirement for me. That it used to work, but as of 2.3.90 no longer does (on Ubuntu), is a pretty serious regression for my use case. I know this is open source and free, and I want to be clear that I do appreciate all the hard work you guys put in, but these things make me lean more toward rolling a custom solution, even if it's based on your images but managed by docker-compose rather than salt. While I like the tools present in SO2, the transition from SO16.04 to SO2 has been really tough for me.

But alas, this issue's comment stream still isn't the place to hash that out.

So back to the topic, again... I've looked through https://docs.securityonion.net/en/2.3/release-notes.html and don't see this listed as a known issue for 2.3.90. In fact, the last release where major issues were spelled out as known issues was 2.3.50. Will that practice return, or should we be monitoring your Projects and Issues pages on GitHub before each upgrade for show-stoppers like this?

dougburks commented 2 years ago

@petiepooo We appreciate constructive criticism. However, as I'm sure you're aware, this recent log4j issue is our top priority right now. We will take a look at this reinstall issue as time allows, but we have neither the time nor the energy right now to continue debating.

In the meantime, here are some options for you:

petiepooo commented 2 years ago

Part of this issue may be that salt is trying to patch the influxdb python module during every install and call to highstate. By my reckoning, that should be done once during install (and subsequent upgrades if needed), and should fail gracefully if it's already been patched, but should not cause errors on every call to highstate.
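A common shell idiom for this kind of graceful handling is to test whether the patch reverses cleanly before applying it. The sketch below assumes GNU patch and a unified diff with `a/` and `b/` path prefixes. The helper name `apply_patch_once` is hypothetical, and this is not how the salt state is actually implemented.

```shell
#!/bin/sh
# Apply a patch only if it has not been applied yet.
# $1: directory to patch, $2: path to a unified diff (-p1 style paths).
apply_patch_once() {
  dir="$1"
  patch_file="$2"
  # If the patch reverses cleanly in a dry run, it is already applied.
  if patch -d "$dir" -R -p1 -s -f --dry-run < "$patch_file" >/dev/null 2>&1; then
    echo "already patched"
    return 0
  fi
  # Otherwise apply it forward; a non-zero exit here is a real failure.
  patch -d "$dir" -N -p1 -s < "$patch_file" && echo "patched"
}
```

Calling the helper twice on the same tree reports "patched" the first time and "already patched" the second, so repeated runs do not surface an error.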

petiepooo commented 2 years ago

For reference, if someone else encounters this before the next minor release, running the following sequence of commands fixes this:

apt-get install --reinstall salt-common=3003+ds-1
salt-call state.highstate
systemctl restart salt-minion.service
systemctl restart salt-master.service

Running apt-get with --reinstall restores the three patched files to their pre-patched state, then salt highstate attempts patching again. Once patched, salt needs to be restarted to pull the patched module in so the retention policy check doesn't fail during highstate.

It can be prevented on reinstall by editing so-functions to install salt-common along with salt-master and adding the --reinstall flag to the command so it overwrites the patched files during setup.

dougburks commented 2 years ago

Re-running Setup on Ubuntu 18.04 works properly in 2.3.100 now.