nre-learning / nrelabs-curriculum

Learn next-generation skills for network engineers, all in your browser.
https://nrelabs.io
Apache License 2.0

gNMIc Lesson #339

Open Mierdin opened 4 years ago

Mierdin commented 4 years ago

Very cool new project called gNMIc, which offers a CLI for gNMI.

https://gnmic.kmrd.dev/

An NRE Labs lesson on this seems very feasible, and IMO @hellt should have right of first refusal for this.

@hellt any ideas for a simple topology that would be effective in helping to illustrate the capabilities and help folks get up to speed on the tool? Also, ideas on general topic areas that might go into an effective lesson outline?

hellt commented 4 years ago

Thanks @Mierdin, that's a nice venue to try out gnmic, indeed. At a bare minimum a single network element would do; to explore the multi-target capabilities we should have 2 nodes. For a multi-vendor setting we might want to introduce multiple vendors.

As to the topology, the nodes can be completely isolated on the dataplane, as the networking aspects are not relevant for the gNMI protocol operations. The only common connectivity which is needed is the management network.

Mierdin commented 4 years ago

Agreed. We have both Junos and Cumulus currently, but I don't believe Cumulus supports gNMI out of the box unless we load up some kind of server there ourselves.

We are also of course always on the lookout for new images and I'd be happy to help with that if that's something you're interested in contributing.

hellt commented 4 years ago

Then it's fine to just have two vMXes connected to each other with a single interface to allow some pings to flow between them.

How do I contribute this lesson? I am quite new to NRE Labs, so I will gladly take any reference points.

Mierdin commented 4 years ago

Currently the only Junos flavor is vQFX, but I have been working on cRPD support and I'm hoping to have that available within the next few weeks. For this, I think cRPD would be a much better bet for what we're trying to do here.

Since you're new, I'd definitely start here. Some of that might be a bit boring, since you clearly know how to "github" but there's also some stuff specific to NRE Labs you might find useful.

I think a good first step is to build an endpoint image that has gnmic installed. If you want to take a crack at this for your first PR, feel free. Since it's written in Go, the best bet is likely to do a multi-stage build of some kind so we can compile from source first, and then bring the binaries over to a simpler image. In case you haven't done this before, I do this to build antidote itself (which powers NRE Labs) if you are interested in a working example.

From there, we just need to run sshd so users can connect to a working terminal. You can take a look at our utility image for inspiration, but I would recommend just borrowing all the auth and ssh config stuff only, since we probably don't want/need all of the Python stuff from that image.

Let's see if we can tackle that first, and then hopefully once that's done, I'll be done with the work I described here and we can figure out content.

hellt commented 4 years ago

Thanks, I've read through most of the getting started guides, and I wonder if it's really necessary to create an endpoint image for gnmic.

What if I leverage a general-purpose utility image and also teach learners how to use the gnmic installer to download the latest and/or a specific version of it? I think that is useful as well, since that part is needed should they decide to install gnmic outside of the NRE Labs environment.

Mierdin commented 4 years ago

As a policy, the platform doesn't allow connections outside of the lesson environment, which is why the documentation is oriented around everything being self-contained.

That said, if you would prefer the short route to doing this, you should be able to construct a simple Dockerfile that uses antidotelabs/utility as the base, and take whatever steps are needed to install gnmic, and you should be good to go.

hellt commented 4 years ago

ok, that fact escaped me. Please find the endpoint image PR here https://github.com/nre-learning/nrelabs-curriculum/pull/342

hellt commented 4 years ago

@Mierdin I saw a message that you will have a proper vacation soon. If there are any steps that I can take preemptively to create the gnmic lesson before you go, you can count on me.

Mierdin commented 4 years ago

Thanks for mentioning that - we won't be doing a new full release before I go, so don't worry too much about trying to cram this in the next few weeks. I've been working on new infra to be able to support the new images we'll need for lessons like this, but it's too much to try to get done right before I go away for a while, so I'd rather play it safe and get as close to the finish line as I can before I go, but wait until I get back to actually cross it :smile:

That said, I'd like to make sure folks like you are able to move forward in my absence. The preview service is currently running on the "old" cluster that's currently powering the main nrelabs.io site. In order to let you use it to preview your content in a PR, I'd need to get it running on the new cluster. I'm also spending today hunting down some pointers on gNMI with cRPD (it's a pretty new feature) to ensure it's a good target for this lesson. If we run into issues, I don't think adding a vMX image would be a problem, just would be a little extra work, so I'd like to try cRPD first and see if we can get away with that.

Regardless of all that, you are welcome to, at any time, open a PR for the new lesson content. You can use the antidote CLI tool to generate a skeleton lesson, or if you wish, you can use this lesson meta file I put together for some basic testing:

---
name: Telemetry At Your Fingertips with gNMIc
slug: gnmic-telemetry
category: tools
diagram: ""
video: ""
tier: prod
description: In this lesson, we'll explore the use of a tool called gNMIc to make sense of gNMI-based operations at the command-line.
shortDescription: gnmic
tags:
- telemetry

endpoints:

- name: junos1
  image: crpd
  additionalPorts: [51051]
  presentations:
  - name: cli
    port: 22
    type: ssh

- name: gnmic
  image: gnmic
  presentations:
  - name: cli
    port: 22
    type: ssh

stages:
- description: First Steps
  guideType: markdown
  stageVideo: ""

authors:
- name: Roman Dodin
  link: https://github.com/hellt

If you go that route, you'll want to ensure you still run the antidote validate <curriculum directory> command to make sure everything's valid. You're welcome to start this PR any time, but no guarantees on if the preview service will be meaningful to you until I swing it over and ensure it works on the new cluster. Provided I am able to get to it (hoping so), I will give you a heads up once that work is done. Until then you can keep pushing content to your PR the way you think it should work, and we can address any problems once previews are functioning on the new cluster.
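
As a rough illustration of the kind of consistency check worth running on a hand-written lesson meta file, here is a minimal Python sketch. It is not what `antidote validate` does internally (that logic is not shown in this thread); the `find_problems` helper and the rules it applies are assumptions made for illustration, and the lesson is written as a plain dict that mirrors part of the meta file above rather than parsed from YAML.

```python
# Hypothetical pre-flight check; it does not replace `antidote validate`.
# LESSON mirrors a subset of the meta file above as a plain dict.
LESSON = {
    "name": "Telemetry At Your Fingertips with gNMIc",
    "slug": "gnmic-telemetry",
    "endpoints": [
        {
            "name": "junos1",
            "image": "crpd",
            "additionalPorts": [51051],
            "presentations": [{"name": "cli", "port": 22, "type": "ssh"}],
        },
        {
            "name": "gnmic",
            "image": "gnmic",
            "presentations": [{"name": "cli", "port": 22, "type": "ssh"}],
        },
    ],
    "stages": [{"description": "First Steps", "guideType": "markdown"}],
}


def find_problems(lesson):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # These "required" keys are an assumption, chosen from the fields shown above.
    for key in ("name", "slug", "endpoints", "stages"):
        if not lesson.get(key):
            problems.append(f"missing or empty field: {key}")
    seen = set()
    for endpoint in lesson.get("endpoints", []):
        name = endpoint.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"duplicate endpoint name: {name}")
        seen.add(name)
        if not endpoint.get("image"):
            problems.append(f"endpoint {name} has no image")
        for pres in endpoint.get("presentations", []):
            port = pres.get("port")
            if not isinstance(port, int) or not 1 <= port <= 65535:
                problems.append(
                    f"endpoint {name} has an invalid presentation port: {port!r}"
                )
    return problems


if __name__ == "__main__":
    print(find_problems(LESSON) or "no problems found")
```

A check like this only catches structural slips early; the `antidote validate <curriculum directory>` command remains the authority on what the platform accepts.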

Mierdin commented 4 years ago

FYI, as I posted in https://discuss.nrelabs.io/t/new-kata-cluster-is-live-seeking-feedback/287/3, the preview service is now running on the new cluster, and I validated this with a quick temporary test using https://github.com/nre-learning/nrelabs-curriculum/pull/346

That said, I've only just started tinkering around with gNMI on the cRPD image (as mentioned it's really new) so not sure how much work is left to do there. I've confirmed the image version supports it, so I'm fairly confident it's a configuration issue (which can be provided as part of the lesson using the regular stage configuration methods). If I am able to get more info I'll post here.

hellt commented 4 years ago

Thanks! I guess if I create a PR with the lesson skeleton you pasted above I would be able to get it running on a testing cluster with some connectivity between the gnmic and crpd?

Mierdin commented 4 years ago

Yes, though you should use #346 as reference instead. There are a few other things that need to be done beyond the lesson metadata file, but that is the bulk of it.

hellt commented 4 years ago

Fair enough. I will dig in and see what I can do

Mierdin commented 3 years ago

@hellt Just wanted to drop a quick update. I've been working on enabling builds for endpoint images within the CI pipeline, and believe I am ready for someone else to test it. This makes it so that you don't have to contribute an image first, and then the content separately, which is a silly constraint I've wanted to solve for a while, and finally got around to it. You should just need to open a PR with both the image and lesson changes needed, and the preview system will take it from there.

If you still have the time/interest, I think a gNMIc lesson is a great candidate for this. I also hunted down the configuration needed for cRPD to support gNMI. You'll want to modify the additionalPorts field to use port 50051, and then cRPD will need the following stanzas added:

set system services extension-service request-response grpc clear-text port 50051
set system services extension-service request-response grpc skip-authentication
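
Because the gRPC port appears in two places (the `additionalPorts` field of the lesson file and the first `set` command), one way to keep them in sync is to derive both from a single value. The Python sketch below illustrates that idea only; the helper names `crpd_gnmi_config` and `endpoint_definition` are hypothetical, and the endpoint fields are copied from the meta file earlier in this thread.

```python
GNMI_PORT = 50051  # the port named in the comment above


def crpd_gnmi_config(port):
    """Return the two Junos set commands from the comment above for a given port."""
    return [
        "set system services extension-service request-response grpc "
        f"clear-text port {port}",
        "set system services extension-service request-response grpc "
        "skip-authentication",
    ]


def endpoint_definition(port):
    """Return the junos1 endpoint entry with additionalPorts matching the gRPC port."""
    return {
        "name": "junos1",
        "image": "crpd",
        "additionalPorts": [port],
        "presentations": [{"name": "cli", "port": 22, "type": "ssh"}],
    }


if __name__ == "__main__":
    for line in crpd_gnmi_config(GNMI_PORT):
        print(line)
    print(endpoint_definition(GNMI_PORT)["additionalPorts"])
```

Changing `GNMI_PORT` then updates the device configuration and the lesson endpoint together, which avoids the 51051 versus 50051 mismatch between the original meta file and these stanzas.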

The ability to auto-build these images is really recently added, so there will probably be wrinkles to iron out but I'm happy to help you through it if you're willing to be the guinea pig :) I haven't even announced it formally or documented it properly yet, but wanted to see if you'd be willing to put it through its paces first.

hellt commented 3 years ago

Hey @Mierdin, yes, I think it will be possible to make a nice lesson out of it. I would like to take a pause until the first weeks of April, since by that date I might have another containerized open NOS to introduce to that lesson.

I think that pluralism in NOS selection will make it even more educational.

Mierdin commented 3 years ago

@hellt No worries, sounds great! Totally on board with adding a new containerized NOS. Let me know if there are any base images/disks that need to be kept private, like we've done with cRPD; we'll add those to the private GCP storage bucket that our build pipeline has access to.

hellt commented 3 years ago

Hi @Mierdin, it's been taking us longer than I'd expected, but it's finally all coming together.

In continuation of our multivendor gNMI lesson, where do I start to make the SR Linux containerized NOS a citizen of NRE Labs?

Mierdin commented 3 years ago

Woot! This makes me happy. And more good news is that since SR Linux is openly available, this makes the process that much easier. The contribution process is much the same as I mentioned further up. You'll want to start here: https://docs.nrelabs.io/creating-contributing/getting-started but in summary, here are the steps:

  1. Clone this repo and use the antidote tool to bootstrap a new lesson - this is just a skeleton so you'll need to add configs/content/etc but it's a good starting point so you can start seeing your previews in the PR you'll open. Feel free to stay minimal for now - this can always get re-done, and I think it would be more useful to make sure the SR Linux image works well in NRE Labs first before spending a lot of time on lesson content, etc. So, a simple lesson which has a single SSH presentation to an SR Linux endpoint with a single stage, and a mostly blank lesson guide should be fine.
  2. Add an image to the images/ directory for the new SR Linux image. This will involve creating a new Dockerfile with some sensible configurations.
  3. Commit your changes in a branch and open a PR. This will kick off some GH actions workflows that build your new image and start a temporary instance of NRE Labs you can use to preview what you have thus far.

Once you're able to do this, I should be able to guide you further. And if you have any questions at all, don't hesitate to ask.

hellt commented 3 years ago

Awesome. Sounds familiar. I somehow forgot how streamlined this is wrt images. I think the only tricky part might be that the srl container expects e1-X interfaces to be attached to it (in contrast to eth1+). But they can potentially be renamed before the entrypoint kicks in, unless this can be done in NRE Labs.

Mierdin commented 3 years ago

We use Multus, which by default uses the netX naming scheme. Looking at the latest Multus docs, it appears this became configurable at some point, which is good news, but a) I'm not sure if we're running a version that lets us do this and b) there would have to be platform modifications to expose this option and also to facilitate a Multus upgrade if needed (they tend to break things between even minor versions). There is a networkInterfaces field in the image metadata file that I have intended to use for this purpose (currently unused), so I'm generally on board with the change if this is needed; it would just take some time.

On the other hand, the image flavor untrusted runs a container in a Kata VM, which should give you free rein to rename interfaces as needed. So if it is possible for you to inject a script before the entrypoint (which other endpoints already do anyway, including crpd), that might at least be a quicker way to go.

Let me know what you think. My suggestion is for you to look into how hard it would be to make the image compatible with the existing paradigm of eth0, net0, net1, net2, etc., while I look into the scope of changes needed to make this more flexible.
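
For the script-before-entrypoint option, the renaming logic could look roughly like the Python sketch below. It only generates the `ip link` commands rather than running them. The mapping of net0 to e1-1, net1 to e1-2 and so on is an assumption (the thread only says the container expects e1-X names), `rename_commands` is a hypothetical helper, and eth0 is left untouched as the management interface.

```python
def rename_commands(data_interfaces, first_port=1):
    """Build iproute2 commands that rename net0, net1, ... to e1-1, e1-2, ...

    The interface is taken down before the rename and brought back up
    afterwards, since renaming a link that is up can be refused.
    """
    commands = []
    for offset, old_name in enumerate(data_interfaces):
        new_name = f"e1-{first_port + offset}"  # assumed numbering, starting at e1-1
        commands.append(f"ip link set dev {old_name} down")
        commands.append(f"ip link set dev {old_name} name {new_name}")
        commands.append(f"ip link set dev {new_name} up")
    return commands


if __name__ == "__main__":
    # eth0 is deliberately not listed: it stays as the first interface.
    for command in rename_commands(["net0", "net1"]):
        print(command)
```

Generating the commands as strings keeps the sketch safe to run anywhere; an actual entrypoint wrapper would execute them with root privileges inside the container before starting the NOS.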

Mierdin commented 3 years ago

@hellt Good news is that we're running a version of Multus that allows me to specify the interface name - just did a quick pod test on our cluster and it works great. Working on a patch to antidote-core now to finally make use of the networkInterfaces field in the image definition to expose this.

Quick question - is it still okay that eth0 is the first interface?

hellt commented 3 years ago

Oh, great news! I had almost started drafting a bash script in my head to rename the interfaces as part of the entrypoint ಠ_ಠ

Yes, eth0 is totally cool with us. It will get renamed eventually, but it will happen automatically :)

Mierdin commented 3 years ago

Okay, the antidote-core PR is merged and I loaded that code into the preview system so it should be ready to use there. Let me know if you run into any issues.