CCI-MOC / hil

Hardware Isolation Layer, formerly Hardware as a Service
Apache License 2.0

WIP: Save to flash on brocade switch #964

Closed (naved001 closed 6 years ago)

naved001 commented 6 years ago
xuhang57 commented 6 years ago

I can't offer a good review yet; I'm still catching up on how switches are used in HIL generally.

naved001 commented 6 years ago

@zenhack hey Ian, let me know if you are okay with this approach before I spend more time ironing this out. A more careful review can be done when you have more time on hand. Thanks!

zenhack commented 6 years ago

I'm okay with using the console interface to get the config if there's no way to do that via the HTTP API. I'll do a proper review sometime Monday; sorry it's taken me a while to get to it.

zenhack commented 6 years ago

That is a bit much. I don't like the idea of not checking it though.

My worry with just launching a thread is that we have no backpressure then; theoretically requests could just keep piling up.

We could have one thread processing a queue:

https://docs.python.org/2/library/queue.html

...which we set a reasonable bound on, so we do have some backpressure, though there will be no delay in the common case.

But I also don't like the complexity of that; it seems like it would be hard to test. Additionally, I'm a bit bothered by the fact that saving to flash is currently best-effort; we should think through the implications of a failure + subsequent reset, and make sure we design this in a way that doesn't open us up to vulnerabilities.
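
For reference, the single-worker bounded queue mentioned above could look roughly like the sketch below. It is only an illustration: `FlashSaver`, `save_fn`, `request_save`, and the default bound of 8 are made-up names and values, not anything in HIL. It uses the Python 3 module name `queue`; in Python 2 (the docs linked above) the module is named `Queue`.

```python
import queue
import threading


class FlashSaver(object):
    """Hypothetical helper: one worker thread draining a bounded queue."""

    def __init__(self, save_fn, maxsize=8):
        # save_fn is an assumed callable(switch) that writes the running
        # config to flash; maxsize is the backpressure bound.
        self._save_fn = save_fn
        self._queue = queue.Queue(maxsize=maxsize)
        self.errors = []  # failed saves are recorded, not silently dropped
        self._worker = threading.Thread(target=self._run)
        self._worker.daemon = True
        self._worker.start()

    def request_save(self, switch, timeout=None):
        """Enqueue a save request.

        Returns immediately while the queue has room (the common case).
        Blocks when the queue is full, and raises queue.Full if `timeout`
        seconds pass without a free slot.
        """
        self._queue.put(switch, block=True, timeout=timeout)

    def _run(self):
        while True:
            switch = self._queue.get()
            try:
                self._save_fn(switch)
            except Exception as exc:
                self.errors.append((switch, exc))
            finally:
                self._queue.task_done()

    def wait(self):
        """Block until every queued save has been processed."""
        self._queue.join()
```

Note that this only bounds the backlog. A failed save still just lands in `errors`, so the question of what to do after a failure followed by a reset remains open.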

naved001 commented 6 years ago

One fix that would slightly alleviate this problem is to call save_config when we disconnect from the switch, rather than every time we do a revert_port or modify_port. That way we only save once, after performing a whole batch of operations. I think we should do this in the other switches too. I'll think about your other concerns as well.
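
A rough sketch of the save-on-disconnect idea, assuming a driver object that exposes `modify_port`, `revert_port`, `save_config`, and `disconnect`. The `SwitchSession` wrapper, the `FakeSwitch` stand-in, and the argument values in the usage lines are hypothetical and only there to make the example runnable; the real driver signatures may differ.

```python
class SwitchSession(object):
    """Hypothetical wrapper: save the config once, at disconnect."""

    def __init__(self, switch):
        self._switch = switch
        self._dirty = False  # True once any port change has been applied

    def modify_port(self, *args):
        self._switch.modify_port(*args)
        self._dirty = True

    def revert_port(self, *args):
        self._switch.revert_port(*args)
        self._dirty = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            # Save only if something changed during this session.
            if self._dirty:
                self._switch.save_config()
        finally:
            # Always disconnect, even if the save raised.
            self._switch.disconnect()
        return False  # never swallow exceptions from the body


class FakeSwitch(object):
    """Stand-in driver that records calls instead of talking to hardware."""

    def __init__(self):
        self.calls = []

    def modify_port(self, *args):
        self.calls.append('modify_port')

    def revert_port(self, *args):
        self.calls.append('revert_port')

    def save_config(self):
        self.calls.append('save_config')

    def disconnect(self):
        self.calls.append('disconnect')


fake = FakeSwitch()
with SwitchSession(fake) as session:
    session.modify_port('gi1/0/1', 'vlan/native', 'net-a')  # illustrative args
    session.revert_port('gi1/0/2')
print(fake.calls)
```

The two port operations result in a single `save_config` call, issued just before `disconnect`. A session that makes no changes skips the save entirely.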

naved001 commented 6 years ago

Ran this by @okrieg today.

He feels that we shouldn't be saving state to the switch at all if we can't guarantee that the save always succeeds. We don't know how to handle the scenario where the save fails.

So what he suggested is that, when HIL is deployed on a switch, the startup-config should have all VLANs disabled on HIL-controlled switchports (I remember @zenhack suggested this somewhere), so that in case of a power failure we fail safe. In addition to that, we provide an admin-only API that restores the switch to the state HIL thinks it should be in (that has been discussed before too).
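
To make the restore idea concrete, here is a minimal sketch of how such an API could work out what to push to the switch. Everything in it is an assumption for illustration: `plan_restore`, the port names, and the representation of state as a mapping from port name to a set of VLAN ids are not HIL's actual data model.

```python
def plan_restore(desired, actual):
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map a port name to the set of VLAN ids enabled on it:
    `desired` is what HIL's database says, `actual` is what the switch
    currently reports. Ports missing from either side count as empty.
    """
    actions = []
    for port in sorted(set(desired) | set(actual)):
        want = desired.get(port, set())
        have = actual.get(port, set())
        # Remove stale VLANs first, then add the missing ones.
        for vlan in sorted(have - want):
            actions.append(('remove', port, vlan))
        for vlan in sorted(want - have):
            actions.append(('add', port, vlan))
    return actions


# After a power failure the fail-safe startup-config leaves every
# HIL-controlled port with no VLANs, so the plan contains only additions.
desired = {'gi1/0/1': {100}, 'gi1/0/2': {200, 201}}
after_reboot = {'gi1/0/1': set(), 'gi1/0/2': set()}
for action in plan_restore(desired, after_reboot):
    print(action)
```

Because the startup-config starts from "everything disabled", a restore after a reboot can only ever grant access that HIL's database already records, which is the fail-safe property being asked for.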

I think these changes would obviate the need for switches having to save at all. Let's mull over it a little before we decide what to move forward with.

@xuhang57 please add if I missed something during the meeting.

zenhack commented 6 years ago

Yeah, that definitely strikes me as more sane than doing a best effort thing.

naved001 commented 6 years ago

I am gonna close this then.