CCI-MOC / esi

Elastic Secure Infrastructure project

Serial console security issues #533

Closed larsks closed 1 month ago

larsks commented 2 months ago

When you enable serial console support on an ESI machine, Ironic begins listening on a tcp socket for incoming connections:

$ openstack baremetal node console show mynode -fjson
{
  "console_enabled": true,
  "console_info": {
    "type": "socat",
    "url": "tcp://129.10.5.143:8xxx"
  }
}

There are two significant problems with this implementation:

  1. These connections do not require any form of authentication. They give remote users direct access to the system console, which often grants privileged access to the system. This would allow someone to perform dictionary attacks against the root account, for example, which would otherwise not be possible.

  2. These connections are not encrypted. Someone on the same wifi network, with access to a transit router, or with some other means of recording packets on the network will be able to reconstruct any data sent to the system console, including passwords.
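To illustrate the first problem, here is a minimal sketch of what any remote client can do: attach to the exposed port and read console output with no handshake at all (the host, port, and banner below are illustrative, not taken from a real node):

```python
# Sketch: nothing stops an arbitrary TCP client from attaching to the
# exposed socat console port; no credentials are exchanged before
# console I/O begins. Host, port, and banner are illustrative.
import socket

def attach_console(host: str, port: int) -> bytes:
    """Open a raw TCP connection and return whatever the console sends."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"\n")      # nudge the console for a login prompt
        return sock.recv(4096)   # arrives in cleartext (problem 2)
```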

These are both upstream (Ironic) issues, but they impact our ESI service offering and could lead to compromises in our environment.

tzumainn commented 2 months ago

I may need help with this one!

After some research, it looks like both methods Ironic uses to expose the serial-over-LAN console - socat and shellinabox - are inherently "trusted": if the console is running, it's assumed that there's no need for authentication because, well, you started it.

So perhaps one solution would be to change the exposed IP from the public IP to the internal one, and set up some sort of proxy that supports authentication - maybe a self-generated password? Or an OpenStack token?
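As a rough sketch of the proxy idea (the token scheme here - self-generated password or OpenStack token - is an assumption for discussion, not an existing ESI or Ironic API), the check the proxy would run before bridging a client to the internal console socket could look like:

```python
# Hypothetical check performed by an authenticating proxy before any
# console bytes flow between the client and the internal socket.
import hmac

def authorize(presented: str, expected: str) -> bool:
    """Constant-time token comparison to avoid timing side channels."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```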

@naved001 did you guys support serial console access in HIL/BMI? If so, what was the solution there?

naved001 commented 2 months ago

@tzumainn serial console was supported in HIL. We built a separate daemon to manage serial console access. The workflow was that you'd "enable" console access from HIL, which would give you a token, and that token was then used to interact with the serial console daemon. I think it was kind of like how Nova provides VM consoles.

Since this was a web service we used SSL for encryption.
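The enable-then-token workflow described above could be sketched like this (function and field names are hypothetical; HIL's actual API may have differed):

```python
# Sketch of the HIL-style flow: "enable" issues a bearer token, and that
# token is what the separate console daemon accepts. Names hypothetical.
import secrets
import time

def enable_console(node_id: str, ttl_seconds: int = 3600) -> dict:
    """Return a random token scoped to one node's console, with an expiry."""
    return {
        "node": node_id,
        "token": secrets.token_urlsafe(32),
        "expires_at": time.time() + ttl_seconds,
    }
```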

tzumainn commented 2 months ago

Hm - could we simply import that functionality into the ESI environment then?

naved001 commented 2 months ago

@tzumainn The way we did it in HIL or what nova does?

tzumainn commented 2 months ago

Oh, the way you guys did it in HIL, since that sounds like a more complete solution!

naved001 commented 2 months ago

@tzumainn There are a couple of problems actually:

  1. HIL's console support was read-only, so one couldn't write to it at all; we'd have to start by changing that. Additionally, the obmd repo was last updated 5 years ago.

  2. It feels disjointed. We'd need to register the IPMI information for each node in both Ironic and the obmd service, and we'd obviously need to add a bunch of logic to Ironic - maybe this could be a driver in Ironic; I don't know what that entails.

Maybe you should take a look at the documentation to be sure that this is the right solution; and then we can think about more details and propose what we want to do.

tzumainn commented 2 months ago

Oh, it would definitely be disjointed. I just don't see a better solution right now (although I am happy if someone comes up with an alternative).

I think the easiest would be some sort of proxy with authentication. Barring that, the only choice I see would be a complete alternative to what Ironic does - and at that point, I'd rather not integrate that into Ironic at all, since it's inflexible and pretty limiting.

larsks commented 2 months ago

I'm curious what the Ironic folks think about this issue. Is IRC still the right place for getting in touch?

The quickest course of action would be to drop socat in favor of shellinabox, which already supports SSL encryption. This solves problem 2 and is available now, but doesn't address the question of authentication. We should probably just go ahead and make the necessary configuration changes.

I think the most expedient solution to the authentication issue would be the authentication proxy you've suggested, sitting in front of shellinabox. The proxy should authenticate against Keystone, just like any other service, and should support OIDC authentication for browser-based access.

This means we lose command-line access to the console, but I think that's an okay tradeoff.

Figuring out how to do that in the context of Ironic's console support (e.g., https://github.com/openstack/ironic/blob/7d1bc77861a42dd36c458dd7cdf5db46357a1dec/ironic/drivers/modules/ipmitool.py#L1559) might be a challenge.

A more complete solution would probably involve hosting the serial console behind a websocket proxy with Keystone integration, and then putting a web UI in front of that for browser access. That would also permit the use of command-line clients (like https://github.com/vi/websocat).
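For the command-line path, a client would presumably carry its auth token in the connection URL, similar to nova's serial console proxy (the scheme below mirrors nova-serialproxy's token-in-query convention and is only an assumption for ESI; host and port are illustrative):

```python
# Sketch: build a nova-serialproxy-style WebSocket URL carrying a token,
# which a CLI client such as websocat could then connect to. The
# token-in-query convention is borrowed from nova, not an ESI API.
from urllib.parse import urlencode

def console_ws_url(host: str, port: int, token: str) -> str:
    """Compose the WebSocket endpoint a console client would dial."""
    return f"ws://{host}:{port}/?{urlencode({'token': token})}"
```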

tzumainn commented 2 months ago

I'm actually of the opinion that we may want to disassociate this from the main Ironic service. It's similar to the feeling I got waaaay back when we were considering integrating Keylime with Ironic. We eventually ditched that once the use case was fully explained (tenant users would want to bring their own Keylime), but if we did want centralized attestation, would we really want it buried beneath layers of the Ironic architecture? That feels limiting and not very agile: if an updated Keylime with a new API is released, Ironic can't handle the changes until code makes it upstream, and production deployments probably can't adopt those changes until they make it into a release. There's also the spaghetti code required to support multiple versions of Keylime, and if there are multiple attestation service options, the code becomes even more convoluted.

What would you think of creating some sort of ESI "plugin" model - a simple node endpoint in ESI-Leap that takes a node, an action, and a dictionary of parameters? Operators would install plugins, and users could call them with a generic CLI command - something like openstack esi node command <action> <parameters>. We could then create a console plugin that does whatever we want, and update it as we see fit whenever we want.
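A minimal sketch of that plugin dispatch might look like the following (all names here are hypothetical; ESI-Leap has no such endpoint today):

```python
# Hypothetical dispatch for the proposed "openstack esi node command"
# endpoint: operators register plugins by action name, and one generic
# entry point routes (node, action, parameters) to the installed plugin.
from typing import Any, Callable, Dict

PLUGINS: Dict[str, Callable[[str, Dict[str, Any]], Any]] = {}

def register(action: str) -> Callable:
    """Decorator an operator-installed plugin uses to claim an action."""
    def wrap(fn: Callable) -> Callable:
        PLUGINS[action] = fn
        return fn
    return wrap

def node_command(node: str, action: str, parameters: Dict[str, Any]) -> Any:
    """The generic endpoint: look up the action and hand off the call."""
    if action not in PLUGINS:
        raise KeyError(f"no plugin registered for action {action!r}")
    return PLUGINS[action](node, parameters)
```

Under this model a console plugin would be just one registered action among others, updatable on whatever schedule we like, independent of Ironic releases.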

This solution would also take away my objections to making the cluster orchestration code server-side (instead of client-side). My issue there was that the cluster code felt bound to specific use cases; I wanted it to be more of an example than an implementation we were committed to supporting for arbitrary future use cases. If operators can write their own narrow cluster orchestration plugins, that makes things easier for everyone.

larsks commented 2 months ago

What would you think of creating some sort of ESI "plugin" model...

I think a plugin model seems like a reasonable idea. I have some thoughts on the topic, but should probably save that discussion for another issue.

@naved001 what do you want to do in the short term in our existing ESI service? Disable socat, or some sort of vpn solution, or something else? Maybe we should move that discussion over to https://github.com/CCI-MOC/ops-issues.

joachimweyl commented 2 months ago

@tzumainn please add the intern to this issue and break it down so it is easy for the intern to follow. The intern arrives May 20th.

tzumainn commented 1 month ago

Closing this enormous issue in favor of three smaller issues