halmartin / meraki-builder

Scripts and tools to assemble firmware images for various Meraki MS switches and MX routers
GNU General Public License v3.0
69 stars 16 forks source link

Centralized layer-2 switch configuration #2

Open halmartin opened 4 years ago

halmartin commented 4 years ago

Currently, the switch boots to a default configuration of VLAN 1 untagged on every port.

While this is useful for the initial configuration, it has the potential to expose connected devices to incorrect VLANs and is generally not a configuration that people are likely to want from a managed switch in their home network. The more predictable behaviour would be that the initial boot of the switch has this configuration and subsequent boots use the user-defined configuration (basically how every other vendor of managed switches behaves).

Centralizing the layer 2 configuration of the switch and applying the user-defined configuration on boot would allow users to configure the switch to their individual needs starting from the second boot (or first, if you generate the JFFS2 filesystem yourself before flashing).

While maybe not very "unix-like" I was thinking of a single configuration file that configures each port, so that backing up and restoring the configuration is simple.

JSON may be a good choice as it could also make development of a web UI simpler as instead of having to provide an API on the switch to configure ports, the web UI could manipulate the configuration JSON in the browser, and then simply POST the modified configuration file back to the switch to save the changes. It would also simplify applying the configuration as no complicated shell (e.g. Cisco/HPE switch commands) needs to be implemented.

By abstracting the configuration away from Click syntax, users don't need to understand Click to write a configuration, and it allows us to migrate from Click later (e.g. if mainline support for vcoreiii is realized)

My initial concept for the configuration file would be to have something like the following in /etc/switch.json:

{
    "ports": {
        "1": {
            "pvid": 1,
            "vlans": [1, 10],
            "tagged": true,
            "stp": true,
            "lacp": false,
            "poe": {
                "enabled": true,
                "mode": "802.3af"
            }
        },
        "2": {
            "pvid": 1,
            "vlans": [10],
            "tagged": true,
            "stp": true,
            "lacp": false,
            "poe": {
                "enabled": true,
                "mode": "802.3af"
            }
        }
    }
}
mterron commented 3 years ago

I'd suggest the config file always have all the ports, that makes it easier to visualise what's going on. And similar to my comment in #3 an array looks like the natura way to model a switch.

Maybe something like this:

{
    "device": "MS220-8",
    "date": "2020-12-05T13:24:05Z",
    "ports": [
        {
            "port": 1,
            "link": {
                "established": true,
                "speed": 1000
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            },
            "poe": {
                "standard": "802.3at",
                "enabled": true,
                "power": 10
            }
        },
        {
            "port": 2,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            },
            "poe": {
                "standard": "802.3at",
                "enabled": true,
                "power": 0
            }
        },
        {
            "port": 3,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 4,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 5,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 6,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 7,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 8,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 9,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        },
        {
            "port": 10,
            "link": {
                "established": false,
                "speed": 0
            },
            "lacp": false,
            "enabled": true,
            "vlan": {
                "pvid": 1,
                "allowed": "1-4094"
            }
        }
    ],
    "lacp": [
        { "name": "lag_1",
            "mode": "active",
            "members": [1,2,5,7],
            "timeout": "slow"
        },
        { "name": "lag_2",
            "mode": "passive",
            "members": [9,10],
            "timeout": "fast"
        }
    ]
}
hall commented 2 years ago

I like the idea of storing everything in a single JSON file. I agree with @mterron, that it should always have all of the ports (perhaps generated by https://github.com/halmartin/meraki-builder/blob/master/buildroot/board/meraki/ms220/overlay/bin/switch_status on first boot).

However, after having played around with both a UI and daemon, I think the array of ports only makes things more complicated and ambiguous. For example, what does it mean if multiple array elements have the same port number? Using the port number as the object key "prevents" this because it's invalid JSON. It's also a bit easier to do something like (psuedocode) ports["5"] vs ports.find(.port == "5")[0] (moreso on the limited toolset available to the device). Happy to be convinced otherwise though.

halmartin commented 2 years ago

@hall since nothing currently depends on this, let's change the proposed schema if you feel it would be more elegant to use the port number as the object key.

mterron commented 2 years ago

100% agree. I just proposed the minimum possible modification and was NOT proposing that to be THE SCHEMA, sorry if it wasn't clear. Also I'd strongly suggest to separate status from configuration (I think I've said this in the past somewhere). By that I mean I'd like to be able to query the config and the status of ports as distinct entities.

hall commented 2 years ago

@mterron, what sort of querying do you have in mind? I don't think mixing them will be an issue when it comes to the UI but I could see an advantage to being clear about config vs status while directly editing the file. Something like the following?

{
  "config": {
    "ports": {
      "1": {
        "poe": {
          "enabled": true
        }
      }
    }
  },
  "status": {
    "device": "MS220-8",
    "ports": {
      "1": {
        "poe": {
          "power": 4.6
        }
      }
    }
  }
}

A bit more verbose but clearer about anything under the status field being read-only. Of course, depending on how you define "status", the config field is also your status (e.g., "poe is enabled") unless the daemon adds those keys under the status field after making that change (which is probably what we want :thinking: then we could do away with the caching approach and just diff config vs status to see what needs to change).

mterron commented 2 years ago

Not even that, I'll try to elaborate a bit.

Conceptually, configuration is something the operator/user can change. So the workflow for it is pull -> make changes -> push -> [apply/commit]. In API language, it's an endpoint you can GET and POST (PUT?) .

Status is a read only ephemeral property of the system that the operator can't alter programmatically. If there's link in a port, how much power a port is drawing, # of errors, etc. The operator can only change the status it by physically interacting with the switch (plugging a new cable, etc). It's a read-only GET endpoint.

For interaction I'm thinking of curl http://switch/config and curl -XPOST -d @config.json http://switch/config and curl http://switch/status. You probably want to query the status endpoint periodically or user SSE to push changes to the connected clients to reflect the switch status (ports change from green to red to black, blinking leds, etc). You will not be doing this to the config endpoint.

So in summary, I'm thinking of 2 distinct endpoints

config:

{
  "device": "MS220-8",
  "date": "2020-12-05T13:24:05Z",
  "ports": {
    "1": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      },
      "poe": {
        "standard": "802.3at",
        "enabled": true,
      }
    },
    "2": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      },
      "poe": {
        "standard": "802.3at",
        "enabled": true,
      }
    },
    "3": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      },
      "poe": {
        "standard": "802.3at",
        "enabled": true,
      }
    },
    "4": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "5": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "6": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "7": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "8": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "9": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    },
    "10": {
      "lacp": false,
      "enabled": true,
      "vlan": {
        "pvid": 1,
        "allowed": "1-4094"
      }
    }
  },
  "lacp": [
    { "name": "lag_1",
      "mode": "active",
      "members": [1,2,5,7],
      "timeout": "slow"
    },
    { "name": "lag_2",
      "mode": "passive",
      "members": [9,10],
      "timeout": "fast"
    }
  ]
}

status

{
  "switch": {
    "firmware_update_available": true,
    "fw_link": "https://freeraki.com/firmware/ms220-8/fw_1.1.bin",
    "temp": { 
      "system": 33,
      "psu": 36
    }
  },
  "ports": {
      "1": {
        "link": {
          "established": true,
          "speed": 1000
        },
        "poe": {
          "watts": 6
        }
      },
      "2": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "3": {
        "link": {
          "established": true,
          "speed": 1000
        },
        "poe": {
          "watts": 10
        }
      },
      "4": {
        "link": {
          "established": true,
          "speed": 1000
        },
        "poe": {
          "watts": 0
        }
      },
      "5": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "6": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "7": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "8": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "9": {
        "link": {
          "established": false,
          "speed": 0
        },
      },
      "10": {
        "link": {
          "established": false,
          "speed": 0
        },
      }
    }
}

Hope that makes more sense.

EDIT: Add json examples and use the proposed port syntax

hall commented 2 years ago

Ah, OK -- you're thinking one level higher than I am. Specifically, the data on disk (what this issue is about) doesn't need to match the API request/response body (what probably belongs in issue #3).

However, I'm not sure that I agree on your statement about only querying that example /status endpoint to get the status; most all of the data in your example /config endpoint is status information (e.g., "is the port enabled?", "what's my pvid?"). I suppose you could say "all config (once applied) is status but not all status is config" :shrug: Let me play around a bit with your approach in mind.

For what it's worth, the meraki API, which I'm not familiar with, takes your suggested approach of having a separate GET endpoint for config and status.

mterron commented 2 years ago

Hi Hall,

I'm not sure I've ever seen anyone call configuration "status". I just can't wrap my head around this: "I suppose you could say "all config (once applied) is status". What does that mean? Config is never "status", it is the operator's defined settings for the system.

Port enabled is not status, PVIDs is not status, it is all configuration. Someone had to go and set it that way. I think the main distinction is ephemeral vs persistent maybe?

Also wrt to the disk representation, why would you persist the system temperature, or current port link speed, or port power consumption or a whole number of other ephemeral pieces of information that have no value/use except at the exact moment they are captured?

hall commented 2 years ago

Yeah, what you're saying makes sense. I'm thinking about it from the point of the UI or daemon. For example, if the daemon starts and the config file doesn't exist, it has to query the switch to determine whether a port is enabled. To me, that's asking the switch for a status but maybe it's more correct to think about it as reconstituting the configuration.

And for sure! I agree that we don't want to store the status. I think I was flipping through this and the UI management issue and conflated the two but that's certainly the right approach. It does bring to mind another question though: do we need 2-way sync for the config? For example, if someone bypasses the json file to update the config directly, should the daemon then update the json file?

mterron commented 2 years ago

From a client/ui perspective I expect the /config endpoint to tell me what the configuration currently is. How it came to be, if it comes from a file, or somewhere else it's an implementation detail that I don't think matters. I'd probably completely bypass the config file and always query the config directly (via the click thing? or whatever).

From the daemon perspective, I don't think spending a lot of time on this makes much sense as you are not going to deploy this in an adversarial environment.

Daemon lifecycle

daemon startup -> read config file and apply it -> start listening on '/config' & '/status'

At this point the file and the running config are one and the same. Some time later an admin might change the config without going through the standard tools so when a client (cli/web/whatever) request the config from the daemon, the daemon should query the system's running configuration (not the configuration file!) and send that to the client (or maybe sent both? Like this is the running config and this is the stored config). If the client post a config update, the daemon can apply it (maybe save and apply can be separate steps? That's what network equipment usually do). If you restart the switch, the config file is applied again.

hall commented 2 years ago

Yep, we're largely on the same page. The status portion is pretty straightforward and I think we have a decent idea about the config parts as well (would be nice if we could hook into the click stuff but I'm having a hard time even finding documentation on it). Let me flesh some more of this out, try to get a more complete end-to-end example running and we can make adjustments from there.

hall commented 2 years ago

lacp needs more info, correct? Best I can find, enabling link aggregation is done with

echo "AGGR 5, MEMBERS '1,3'" > /click/switch_port_table/add_link_aggr

If I'm understanding this correctly, that means we need some sort of aggregation ID and member list.

Easiest would probably be just to change the current port.<n>.lacp from a boolean to an int (of the AGGR ID).

{
  "ports": {
    "1": {
      "lacp": 5
    },
   "2": {
      "lacp": null
    },
    "3": {
      "lacp": 5
    }
}

but we could also do (if the aggregation ID isn't important)

{
  "ports": {},
  "lacp": [
    [1, 3]
  ],
}

Both of which mean ports 1 and 3 should be aggregated together and port 2 should have aggregation disabled (the default, if the key is missing or set to null). I've never used link aggregation so I could be entirely wrong here.

mterron commented 2 years ago

I've been playing around with my own implementation of a GUI today and found that having the ports in an array makes for nicer manipulation (using forEach) that adapts to any number of ports.

Take a look at: https://github.com/mterron/meraki-builder-ui

hall commented 2 years ago

No screenshot?! :stuck_out_tongue_closed_eyes: I like the idea of yours not having to be transpiled (though the preact framework gives us some niceties like updating status/config without reloading the page through data binding).

It's sh that makes things more difficult. There's jshn and jq but they're not nearly as nice as using javascript to manipulate JSON. Easier for javascript to do objects (since Object.keys(ports).forEach() will give the same loop. I've done that in the latest version of https://github.com/hall/freeraki-ui) than it is for sh to do arrays. Even in javascript though, the object notation is easier to access: ports["5"] vs ports.find(p => p.port == 5)[0] (or is it ports.find(p => p.port == 5).at(-1)?).

Either approach is definitely possible (neither is so difficult as to demand one over the other) but one is certainly less ambiguous. For example, what if multiple array elements have the same port number? To me, a port is identified by it's number, so it makes sense as an object keyed by that number, instead of an array where one of the keys must be unique within all items of the array.

In the end, I don't care enough to argue. If we want to say it's an array where the duplicate port number with the highest index takes precedence (which would mean that tools should avoid sorting the array as they could change which port meets that definition), that's what I'll codify in the docs (#20) so we can be consistent across tooling.

mterron commented 2 years ago

I'm not a JavaScript person by any means. It's the first time I've ever done anything like that, was just a challenge to see what I could put on a screen. Building the svg was heaps of fun though.

Can't comment about preact? I'm not familiar with that.

On Fri, 14 Jan 2022, 17:01 Bryton Hall, @.***> wrote:

No screenshot?! 😝 I like the idea of yours not having to be transpiled (though the preact framework gives us some niceties like updating status/config without reloading the page through data binding).

It's sh that makes things more difficult. There's jshn and jq but they're not nearly as nice as using javascript to manipulate JSON. Easier for javascript to do objects (since Object.keys(ports).forEach() will give the same loop. I've done that in the latest version of https://github.com/hall/freeraki-ui) than it is for sh to do arrays. Even in javascript though, the object notation is easier to access: ports["5"] vs ports.find(p => p.port == 5)[0] (or is it ports.find(p => p.port == 5).at(-1)?).

Either approach is definitely possible (neither is so difficult as to demand one over the other) but one is certainly less ambiguous. For example, what if multiple array elements have the same port number? To me, a port is identified by it's number, so it makes sense as an object keyed by that number, instead of an array where one of the keys must be unique within all items of the array.

In the end, I don't care enough to argue. If we want to say it's an array where the duplicate port number with the highest index takes precedence (which would mean that tools should avoid sorting the array as they could change which port meets that definition), that's what I'll codify in the docs (#20 https://github.com/halmartin/meraki-builder/pull/20) so we can be consistent across tooling.

— Reply to this email directly, view it on GitHub https://github.com/halmartin/meraki-builder/issues/2#issuecomment-1012730117, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABB2XWYQDLWP5JXTUC4VYHDUV6N25ANCNFSM4QPROT4A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

hall commented 2 years ago

No worries, I'm not an expert myself. It is fun trying to stitch things together though!

Have a look at the sh stuff as well, if you get a chance and you're more comfortable there, and let me know if you still feel strongly about using an array (either way, given the small scope of this project, it wouldn't be a huge deal to change down the line). And, in your defense, the upstream meraki API, yet again, takes your approach and uses an array (that just means I think you're both wrong! But I'm also willing to accept that I might be wrong :innocent: ): https://developer.cisco.com/meraki/api/#!get-device-switch-ports

rkboni commented 7 months ago

Having played with the UI currently, I realized that @mterron's comment in https://github.com/halmartin/meraki-builder/issues/2#issuecomment-1009531424 is dead-on -- the merging between config & status means that e.g. when you attempt to upload the current config your ports end up all locked to the current speed, which really fails the principle of least surprise; it also means that as-implemented at least in 0.4.0, the status values get lost by being over-written by the config values, so the status doesn't actually update as you plug-in/unplug devices or as e.g. the PoE power state changes (c.f. https://github.com/hall/postmerkos-ui/issues/1).

Looking at what one of my switches do, it's probably easy enough to simply split columns for status and configuration in the table and only use the "config" columns when generating the config to upload. E.g.:

Screen Shot 2024-02-05 at 11 43 20 PM