CCI-MOC / hil

Hardware Isolation Layer, formerly Hardware as a Service
Apache License 2.0

Need more hops to ensure EFI boot order is changed #853

Open henn opened 7 years ago

henn commented 7 years ago

In conversation with @vathpela from Red Hat, it sounds like when EFI is enabled the OS is able to modify the boot order.

This means that the hard reset primitive from #835 won't be enough to wipe an attacker's OS from the system: the malicious OS could wait for the next-boot setting to change, then change it back to what the attacker wants (for example, a disk holding the malicious OS).

We probably need to do something like:

  1. Set the boot dev
  2. Do the hard reset
  3. Set the boot dev again. (Maybe sleep a second or two, since IPMI can have latencies?)

There's still a window between each set_bootdev and the hard reset, but this may be the best we can do for now.
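As a rough illustration of the three steps above, here is a minimal sketch. `RecordingBMC` and `reset_with_bootdev` are hypothetical names invented for this example, not HIL's actual driver interface; the stand-in BMC only records calls so the ordering can be checked without hardware.

```python
import time


class RecordingBMC:
    """Hypothetical stand-in for a BMC driver; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def set_bootdev(self, dev):
        self.calls.append(("set_bootdev", dev))

    def hard_reset(self):
        self.calls.append(("hard_reset",))


def reset_with_bootdev(bmc, dev, settle_seconds=2.0, sleep=time.sleep):
    """Set the boot device, hard reset, wait briefly, then set the device again.

    The second set_bootdev overwrites any change the old OS made between the
    first set_bootdev and the reset. The delay is a guess to allow for IPMI
    latency; it narrows the race window but does not close it.
    """
    bmc.set_bootdev(dev)    # step 1
    bmc.hard_reset()        # step 2
    sleep(settle_seconds)   # give the reset time to take effect
    bmc.set_bootdev(dev)    # step 3


# Example run against the recording stand-in (no real hardware involved).
bmc = RecordingBMC()
reset_with_bootdev(bmc, "pxe", settle_seconds=0)
```

Passing the boot device as an argument means the same value is used for both set_bootdev calls.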

zenhack commented 7 years ago

Maybe instead of (2), do a hard shutdown, then add (4): turn the machine on. This avoids some tricky race conditions. We could add a call to view the power status of the machine as well.
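A sketch of this variant, assuming a power-status call exists: set the boot device, power off, poll until the node reports off, set the boot device again, then power on. `SimulatedBMC`, `power_status`, and `secure_reboot` are hypothetical names for illustration only; the simulated BMC takes a couple of polls to report "off" to mimic IPMI latency.

```python
import time


class SimulatedBMC:
    """Hypothetical BMC model; reports 'off' only after a few status polls."""

    def __init__(self, polls_until_off=2):
        self.power = "on"
        self.bootdev = None
        self.log = []
        self._polls_until_off = polls_until_off
        self._pending_off = False

    def set_bootdev(self, dev):
        self.bootdev = dev
        self.log.append("set_bootdev:" + dev)

    def power_off(self):
        self._pending_off = True
        self.log.append("power_off")

    def power_on(self):
        self.power = "on"
        self._pending_off = False
        self.log.append("power_on")

    def power_status(self):
        if self._pending_off:
            if self._polls_until_off == 0:
                self.power = "off"
                self._pending_off = False
            else:
                self._polls_until_off -= 1
        return self.power


def secure_reboot(bmc, dev, timeout=30.0, poll_interval=0.5,
                  sleep=time.sleep, clock=time.monotonic):
    """Power the node off, confirm it is off, then set the boot device and power on."""
    bmc.set_bootdev(dev)                    # (1) set the boot dev
    bmc.power_off()                         # (2) hard shutdown instead of reset
    deadline = clock() + timeout
    while bmc.power_status() != "off":      # wait until the old OS cannot run
        if clock() >= deadline:
            raise TimeoutError("node did not power off in time")
        sleep(poll_interval)
    bmc.set_bootdev(dev)                    # (3) set the boot dev again
    bmc.power_on()                          # (4) turn the machine on


node = SimulatedBMC()
secure_reboot(node, "pxe", sleep=lambda s: None)
```

Because the final set_bootdev happens while the node is confirmed off, the old OS has no chance to change the setting back before the next boot.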

SahilTikale commented 7 years ago

@pjd agreed that this would solve the problem.

ianballou commented 7 years ago

I'll pick this one up since I worked on #835.

ianballou commented 7 years ago

Would it make sense to implement this inside power_cycle()? If the user selects the force option, it could go through steps 1-4 above. Otherwise, there could be a new command to do this more secure reboot.

ianballou commented 7 years ago

I also noticed that we set the boot device to PXE in power_cycle. Is there a specific reason for that? If someone runs set_bootdev and then power_cycle, they might be disappointed to find that the node boots into PXE rather than what they chose.

zenhack commented 7 years ago

That could probably be removed now that we have set_bootdev. I think originally the notion was you'd have the default boot device set to disk, and rebooting via the API would let you netboot & install, after which you'd boot from the disk into your new OS. But it probably makes more sense to just call set_bootdev explicitly.

ianballou commented 7 years ago

@zenhack that makes more sense now in context.

ianballou commented 7 years ago

Adding to my previous comment, if we want to set the boot device and reboot, we might need a combined call that both sets the boot device and restarts the node. Otherwise, we won't know what to pass to set_bootdev when we run it both times.

zenhack commented 7 years ago

I think the right way to solve this issue is the following:

  1. Add an API call to check the current power status of the node, so we can tell when it's finished powering off
  2. Implement the node maintenance pool stuff.
  3. Have the maintenance daemon implement the cleanup logic (it will need the API call defined in (1) to do this correctly).

So HIL itself wouldn't do this, just provide the calls necessary to implement it externally.

ianballou commented 7 years ago

@zenhack is the maintenance daemon the one pointed to by the URL in the HIL config (from #848)? Also, I plan on working on #848 before finishing this up.

zenhack commented 7 years ago

Yeah, that's what I'm talking about.