lf-edge / eden

Eden is where EVE and Adam get tried and tested:
https://projecteve.dev
Apache License 2.0

Thoughts on EVE developer's workflow and how eden fits into that #22

Open rvs opened 4 years ago

rvs commented 4 years ago

Hey @sadov @giggsoff @deitch @kalyan-nidumolu @eriknordmark I've started playing with Eden quite a bit this week and here's what emerged as my ideal developer workflow. Note that portions of it don't quite work with Eden yet, but if we all agree this is where we need to go -- we can then file individual issues to make it so.

For now, let's just use this issue to make sure we're all on the same page as to what that workflow should be.

The rest here is predicated on the following observations:

  1. EVE developers are focusing on hacking functionality that gets packaged via pkg/XXX
  2. ideally, in my edit/compile/debug cycle I'd be able to simply update that one package in a running image of EVE someplace and continue with my testing -- I'm slowly working on making it a reality, but for now let's assume that EVE developers will always have to update a full image if they want to test their latest code in one of the packages
  3. regardless of how quickly we solve #2 in EVE, an average EVE developer doesn't really want to think about anything but the rootfs they've just built -- all that business about installers, live images, etc. -- that's deployment time. Most of my workflow is:
     3.1. edit packages
     3.2. compile rootfs
     3.3. debug it in a live version of EVE

If we all agree with the above, then it follows that the ideal workflow is having a semi-permanent running instance of EVE that we will keep "upgrading" (until it breaks or gets wedged -- and then we restart it to state 0 and keep "upgrading" again).

Interestingly enough, this is exactly the kind of workflow most of us who have access to ZEDEDA's commercial controller have been using for quite some time now. What is different now is that:

  1. we are trying to make the same workflow available to all the EVE developers
  2. we want to have much, much finer-grained control over the structure and the content of the config that EVE receives (speaking of which -- I would really love to see something like http://jsoneditoronline.org/ integrated directly into Adam, but it is fine for now)

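So putting it all together, here's my favorite workflow.

Before I begin my edit/compile/debug cycle:

  • have a default .eden.yml checked into the EVE workspace -- that would be a configuration file describing the default way of deploying EVE and Adam. You can either edit it, or maybe we can have an option of a ~/.eden/config.yml that would always serve as an override. This file will allow me to tweak all the same values that today are in make's Config.mk
  • have a ~/.eden/state.yml that keeps track of pairs of EVE and Adam running on this machine (I think it will be fair to assume for now that most of the time it'll be a single pair)
  • run eden run, which spawns a default version of EVE+Adam (let's say we always pick the latest releases -- it doesn't matter even for EVE since it'll be updated almost immediately)

Now I have a pair of EVE+Adam running on my machine, I have all their whereabouts recorded in ~/.eden/state.yml, and I do my daily routine, which will consist of: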

  1. running the eden eve-update <rootfs.img> command, which will force a config that will update EVE to rootfs.img
  2. editing eve-config.json and running eden reconf eve-config.json so that Adam picks up the new config
  3. running eden test XXX, where XXX is a test specification similar to go test -run XXX

Once I'm done, I expect to run eden shutdown (and yes, I was tempted to suggest eden expel ;-)) so that the whole thing shuts down.

Does it make sense to the rest of you guys?

deitch commented 4 years ago

Hi Roman, quite a bit to unpack here, so it might take me a few runs.

First, I think we have three distinct cycles of development here, not one:

working on pkg stage 1

You are 100% correct, they just want quick cycles of working through write/build/debug until ready.

But for the majority of them, this cycle should not take place on a running EVE device; instead, it should be runnable "locally" wherever that is. So if I am working on pkg/foo, then I should be able to build and run pkg/foo with nothing else at all, not even a running eve. Only when I am ready, will I move to the next level.

Think of this like unit tests vs integration: I run unit tests automatically every time I make changes, often right there on my dev machine; I run integration tests when all of my unit stuff passes and I am ready for the next level, usually in some CI system.

The work for these people is not in eden at all, but in eve entirely. We need to make it much easier for them. eden may be the right place to document it, though.

working on pkg stage 2

Most of the time, this is someone who passed the previous level and is ready to run it on a "real" device (it could be virtual or physical); other times, it is someone working on a pkg that doesn't make sense except on a device (e.g. power-management or some special process that only works on an RPi4).

For these people, a semi-permanent eve+adam running, and controlled by eden, as you described above, makes perfect sense.

To make this cycle easier, we should consider very seriously adding the ability to upgrade individual pkg in a running eve without having to rebuild and upgrade rootfs; having to rebuild rootfs for every pkg change slows down the cycle. I agree that this should never be allowed in a real deployment, and we would need to engineer to prevent it, but if we want to be developer-friendly, we should make it easier for them.

working on eve

This is one of:

In truth, this is not that different from what you described: we need to make it easy for someone to do:

  1. work on my pkg (their work)
  2. run my tests (their work)
  3. run just my package change (stage 2)
  4. rebuild and deploy - this is like live-deploy or whatever the various CMSes like Hugo and such do, when you change a page, and it detects and rebuilds it and redeploys it for you.

So I could see a process like:

This rolls together your eden eve-update and make live etc. steps into one. The goal is to be able to control everything from within eden. Sure, I could do each step separately, but the goal of eden is to make that unnecessary, unless I truly want to.

For some of the other points:

have a default .eden.yml checked into the EVE workspace - that would be a configuration file describing default way of deploying EVE and Adam

I hope you mean "into the eden workspace"? If eden can deploy eve and adam, and eve has config for eden in its workspace, you end up with loops and circular logic, and it becomes difficult. Let the eve repo stand on its own, let the adam repo stand on its own, and eden can know about both.

What would be in this eden.yml? What are the options?

~/.eden/config.yml

So local eden.yml is my local one, and ~/.eden/config.yml is my defaults? So if I run eden <cmd>, and that command is configurable, it first looks in ~/.eden/config.yml. If it finds nothing there, it uses built-in defaults. I then can override both of those with some <somedir>/eden.yml.
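
To make that lookup order concrete, here is a minimal Python sketch of the layering as I read it: built-in defaults, then ~/.eden/config.yml, then a local eden.yml on top. The option names (`eve_tag`, `adam_port`, `qemu_memory`) and the `resolve_config` helper are hypothetical, not anything eden defines, and the YAML parsing itself is left out.

```python
from collections import ChainMap

# Hypothetical built-in defaults; the real option names are not settled in this thread.
BUILTIN_DEFAULTS = {"eve_tag": "latest", "adam_port": 3333, "qemu_memory": "4096"}


def resolve_config(user_defaults, local_overrides):
    """Merge settings so that the local eden.yml beats ~/.eden/config.yml,
    which in turn beats the built-in defaults.

    Both arguments are plain dicts, i.e. the already-parsed file contents.
    """
    # ChainMap searches its maps left to right, so the first map wins.
    return dict(ChainMap(local_overrides, user_defaults, BUILTIN_DEFAULTS))
```

With this shape, a missing file is just an empty dict, so "finds nothing there" falls through to the next layer without special cases.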

I am not sure having the local eden.yml actually adds all that much. I don't want it in the EVE repo, since I would be creating circular logic.

Maybe if I knew what the options were and some examples, it might help make this more concrete.

have a ~/.eden/state.yml

I think that is fine. I am not completely sure why we need to keep track of running pairs. I think your logic is that adam and eve are processes or even real devices, and so my eden command could exit and they would still be running; this lets me pick it up from rerunning the command or in another terminal, etc.

pairs of EVE and Adam

They don't actually need to be pairs; adam happily will serve multiple devices.

run eden shutdown

For both eden run and eden stop (or shutdown; I like stop, but do not care enough), we should have named devices, the way k3d/kind do named clusters, and we can have a default naming scheme, so if you run eden run --name abc it creates a pair abc. If you run eden run it creates a pair named eden-01 and then eden-02 etc.
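
A rough Python sketch of that naming scheme, assuming eden already knows the set of names in use (for example from its state file). The helper names `next_default_name` and `pick_name` are made up for illustration.

```python
import re

# Matches names produced by the default scheme: eden-01, eden-02, ...
_DEFAULT_NAME = re.compile(r"^eden-(\d+)$")


def next_default_name(existing):
    """Return the next eden-NN name, one past the highest number in use."""
    used = [int(m.group(1)) for m in map(_DEFAULT_NAME.match, existing) if m]
    return f"eden-{max(used, default=0) + 1:02d}"


def pick_name(existing, requested=None):
    """Honor an explicit --name, otherwise fall back to the default scheme."""
    if requested is None:
        return next_default_name(existing)
    if requested in existing:
        raise ValueError(f"name already in use: {requested}")
    return requested
```

This version always counts up from the highest number seen rather than reusing gaps left by stopped instances; either choice would work, this one just keeps names from being recycled within a session.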

The above leads to an interesting semantic question. We really don't have to have multiple pairs, just multiple eve instances. So we might want to change the semantic to:

We should think a bit about semantics for multiple adam devices on the CLI. I was thinking something like:

If I want to attach a specific eve to a specific adam, I can do:

kalyan-nidumolu commented 4 years ago

Hi Roman,

The flow looks pretty good. I like your idea of a JSON editor :-)

1) Eden always picks image from local workspace? Or is it capable of downloading them?

2) Can they point to other data stores?

3) I assume I always start Eden in my EVE workspace? Or can give it a location to find the images?

4) "run eden run which spawns a default version of EVE+Adam (lets say we always pick latest releases -- it doesn't matter even for EVE since it'll be updated almost immediately)" --> Why not always start off with the local image as default?

Thanks, Kalyan

On Thu, Apr 16, 2020 at 11:49 PM Roman V Shaposhnik <notifications@github.com> wrote:

Before I begin my edit/compile/debug cycle:

  • have a default .eden.yml checked into the EVE workspace -- that would be a configuration file describing the default way of deploying EVE and Adam. You can either edit it, or maybe we can have an option of a ~/.eden/config.yml that would always serve as an override. This file will allow me to tweak all the same values that today are in make's Config.mk
  • have a ~/.eden/state.yml that keeps track of pairs of EVE and Adam running on this machine (I think it will be fair to assume for now that most of the time it'll be a single pair)
  • run eden run, which spawns a default version of EVE+Adam (let's say we always pick the latest releases -- it doesn't matter even for EVE since it'll be updated almost immediately)


deitch commented 4 years ago

Oh yes, jsoneditor. I am not sure how I feel about it. Is it better than having it on the filesystem, and having someone edit it, and adam watch the file?

sadov commented 4 years ago

So colleagues, thank you very much for your comments -- we are not very familiar with your (and your partners') current development workflows, so any information about them and about your vision for future development is very helpful for us. Now I will try to describe my current understanding and vision.

First, the simpler questions from @kalyan-nidumolu. Yes -- we are now developing a new version of the eden runtime system which is oriented toward using prebuilt images from Docker Hub. As a wish, it might be nice to think about dividing the existing images into subcomponents to reduce the size of the downloads.

About optimizing our test suite for the various needs of developers: as I understand it, this needs more precise machinery for manipulating the different components of the harness, and the current flat hierarchy of commands seems too rigid. We actually have a lot of components: eve, the controller (adam in our case), the local image server, and the applications running on eve. For all of them we have more or less the same basic set of actions: start, stop, status, plus some additional actions like manipulating different kinds of configs, deployment strategies, lists of components, and maybe some rebuilding/reconfiguring of components. For such a system a component-based hierarchy, as described by @deitch, seems more reasonable -- with components at the top level.

But if we are talking about more complex setups along the lines of @deitch's examples, we may need a different implementation of the test harness, with the possibility of fully modeling the whole complex, including the network topology between different instances of components. A good example of such a system is http://mininet.org, a modeling tool/framework developed at Stanford.

About optimizing usability for developers: I absolutely agree with @deitch -- it’s important to leave them the opportunity to use the tools they’re used to. But I agree with @rvs too -- for lowering the entry barrier during a quick start on a new topic, such web-oriented environments may be very useful, especially for educating new developers (mostly in third-party app development and hardware porting). It would be reasonable to think about this direction for future development. A good through-the-web system covering the full development/building/testing/modeling cycle may be in demand on the market.

BTW -- in some cases our tests finished unsuccessfully just because of unpredictable reboots of EVE on QEMU. Maybe it would be reasonable to think about pre/post hook functions (or maybe chains of such functions) around the standard tests, which could be enabled/disabled purely by configuration, without rebuilding. In our case such a hook could monitor for these reboots and rerun the test, reporting the reboot reason. Such a feature may be helpful in other use cases too. What do you think about it, colleagues?
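
To illustrate the hook idea, here is a minimal Python sketch of a test runner with pre/post hook chains and a reboot-detecting hook pair. Everything in it is hypothetical: `run_with_hooks`, `reboot_detector` and the `read_boot_count` callable (assumed to return how many times EVE has booted so far) are illustrative names, not part of eden.

```python
def run_with_hooks(test, pre_hooks=(), post_hooks=(), max_attempts=2):
    """Run test() wrapped in chains of pre and post hooks.

    Each post hook receives the test result and returns either None (accept
    the result) or a string explaining why the attempt should be retried.
    Returns (result, reasons), where reasons lists every retry explanation.
    """
    reasons = []
    result = None
    for _ in range(max_attempts):
        for hook in pre_hooks:
            hook()
        result = test()
        retry = [r for r in (hook(result) for hook in post_hooks) if r]
        reasons.extend(retry)
        if not retry:
            break
    return result, reasons


def reboot_detector(read_boot_count):
    """Build a (pre, post) hook pair that flags a reboot during the test."""
    seen = {}

    def pre():
        # Remember the boot counter before the test starts.
        seen["before"] = read_boot_count()

    def post(_result):
        after = read_boot_count()
        if after != seen["before"]:
            return f"EVE rebooted during test ({seen['before']} -> {after} boots)"
        return None

    return pre, post
```

Since the hooks are plain callables passed in as lists, which ones are active could be decided from a config file at startup, which is what would make them switchable without rebuilding.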

rvs commented 4 years ago

Hey @deitch a few comments on your comments ;-)

working on pkg stage 1

Agreed that this is very much "a thing", but let's also agree to keep this part of the workflow out of scope for what we're discussing on this particular issue, ok? I just don't want to overload the effort of getting to a point where we can have a functional eden CLI-driven workflow.

working on pkg stage 2

Agreed on both of your points, but for now -- until upgrade of individual packages is possible -- we're still sticking with rootfs update as a sort of "workaround" for enabling this part of the workflow. Sounds good?

working on eve

As previously stated, I'm lumping "pkg stage 2" workflow into this for now.

I run eden deploy and it rebuilds the image

It is cool if it does that, I suppose, but I'm actually fine rebuilding it as an independent step and just point eden deploy at a rootfs. So priority wise for @sadov and @giggsoff -- I think it is fine if at first eden expects a live image and then we can gradually teach it the rest of the workflow.

I hope you mean "into the eden workspace"?

Yeah -- sorry for the typo (or was it a thynko? ;-))

What would be in this eden.yml? What are the options?

All the options that are currently here + qemu config (either embedded or as a pointer to a file) https://github.com/lf-edge/eden/blob/master/Makefile#L105

They don't actually need to be pairs; adam happily will serve multiple devices.

That's actually a good point. So maybe for now @sadov @giggsoff we can assume 1xM model with a single Adam and (potentially) multiple EVEs that it controls.

For both eden run and eden stop (or shutdown; I like stop, but do not care enough), we should have named devices, the way k3d/kind do named clusters, and we can have a default naming scheme, so if you run eden run --name abc it creates a pair abc. If you run eden run it creates a pair named eden-01 and then eden-02 etc.

I really like this idea -- @giggsoff @sadov -- can you guys incorporate this into what you're building?

Oh yes, jsoneditor. I am not sure how I feel about it. Is it better than having it on the filesystem, and having someone edit it, and adam watch the file?

For as long as Adam is running on your laptop -- I think I agree with you @deitch: the usefulness of jsoneditor is not that high. However, I had this crazy idea of building adam as an EVE package (yup -- no mistake -- you heard me right). Then when EVE starts, it actually connects to a local instance of Adam (running on the very same device EVE is running on), and it connects over localhost. Then you can actually navigate to Adam's URL and control the Adam/EVE pair as a fully integrated unit.

rvs commented 4 years ago

Hey @kalyan-nidumolu -- let me answer a few questions for you:

The flow looks pretty good. I like your idea of a JSON editor :-)

Yeah -- especially the diffing part is pretty cool (diff a previous config with a new config)
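
Even without a web editor, that kind of diff is cheap to get on the command line. Here is a small Python sketch using only the standard library, assuming both configs are already loaded as dicts; `diff_configs` is a hypothetical helper, not an eden or Adam function.

```python
import difflib
import json


def diff_configs(old, new):
    """Return a unified diff between two EVE configs given as dicts."""
    def render(cfg):
        # Sorted keys and fixed indentation keep the diff free of ordering noise.
        return json.dumps(cfg, indent=2, sort_keys=True).splitlines()

    return "\n".join(
        difflib.unified_diff(render(old), render(new),
                             fromfile="previous", tofile="new", lineterm="")
    )
```

Normalizing both sides through `json.dumps(..., sort_keys=True)` means only real value changes show up, not differences in key order or whitespace between two hand-edited files.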

1) Eden always picks image from local workspace? Or is it capable of downloading them?

Great question: for the MVP I think it always picks rootfs from local filesystem. However I also asked @giggsoff and @sadov to look into being able to fetch lfedge/eve artifacts, unpack them and get the binaries that way.

I think it is actually inevitable, since the first phase of EVE that connects to Adam will always be the one from a previous stable release (it will be immediately upgraded to the rootfs I'm working on, but it gets deployed as lfedge/eve:latest right before that). So making Adam be able to fetch that lfedge/eve:latest is pretty much a requirement -- since I'm really not interested in building it locally.

Btw @giggsoff @sadov -- feel free to suggest if we need to restructure that lfedge/eve:latest package somehow to make it more convenient for what you're building.

2) Can they point to other data stores?

Absolutely -- that's just part of the config that you'll be giving to EVE.

3) I assume I always start Eden in my EVE workspace? Or can give it a location to find the images?

The idea is that you can always point at a random rootfs image anywhere.

4) "run eden run which spawns a default version of EVE+Adam (lets say we always pick latest releases -- it doesn't matter even for EVE since it'll be updated almost immediately)" --> Why not always start off with the local image as default?

We could, but honestly, the amount of time it takes to upgrade from lfedge/eve:latest to your own rootfs is pretty much the same amount of time it would take you to convert your rootfs into a live image.

But to be clear -- this lfedge/eve:latest is just the default -- if you want to point eden run at your own live image -- that's fine too.

rvs commented 4 years ago

Hey @sadov @giggsoff -- hopefully this discussion helps solidify your thinking about this. Now, to answer a few of your questions/points:

About optimizing our test suite for the various needs of developers: as I understand it, this needs more precise machinery for manipulating the different components of the harness, and the current flat hierarchy of commands seems too rigid. We actually have a lot of components: eve, the controller (adam in our case), the local image server, and the applications running on eve. For all of them we have more or less the same basic set of actions: start, stop, status, plus some additional actions like manipulating different kinds of configs, deployment strategies, lists of components, and maybe some rebuilding/reconfiguring of components. For such a system a component-based hierarchy, as described by @deitch, seems more reasonable -- with components at the top level.

Agreed. The very basic workflow is really just a few top-level commands of eden. If you think it would be helpful to document your understanding of what they should do (like literally, in a sort of manpage-like format) -- feel free to do that.

But if we are talking about more complex setups along the lines of @deitch's examples, we may need a different implementation of the test harness, with the possibility of fully modeling the whole complex, including the network topology between different instances of components. A good example of such a system is http://mininet.org, a modeling tool/framework developed at Stanford.

Agreed, but let's hold off on that for now.

BTW -- in some cases our tests finished unsuccessfully just because of unpredictable reboots of EVE on QEMU. Maybe it would be reasonable to think about pre/post hook functions (or maybe chains of such functions) around the standard tests, which could be enabled/disabled purely by configuration, without rebuilding. In our case such a hook could monitor for these reboots and rerun the test, reporting the reboot reason. Such a feature may be helpful in other use cases too. What do you think about it, colleagues?

Agreed. We actually need to have eden status or some such that would report the current state of the system (or even a more fine-grained eden [eve|adam] [status|list]).
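
As a rough illustration of that component-first shape, here is a Python argparse sketch of eden [eve|adam] [status|list]. It only shows the command hierarchy; the actions do nothing, and nothing here reflects how eden's actual CLI is built.

```python
import argparse


def build_parser():
    """Component-first CLI: eden <component> <action>."""
    parser = argparse.ArgumentParser(prog="eden")
    # The first positional selects the component, the second the action on it.
    components = parser.add_subparsers(dest="component", required=True)
    for name in ("eve", "adam"):
        component = components.add_parser(name)
        component.add_argument("action", choices=("status", "list"))
    return parser
```

Calling `build_parser().parse_args(["eve", "status"])` yields a namespace with `component` and `action` set, and adding a new component (say, the local image server) would be one more entry in the loop.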

Also, we need to capture the exit status of qemu (now, it may even be the case that qemu coredumps or something -- and there's not much we can capture -- but let's hope those cases are rare enough). For that, the most useful approach I've seen is to start qemu paused (-S) and with -no-shutdown, then resume the execution from Adam and collect whatever reason for exit is reported. This is sort of what we're doing in EVE itself:
https://github.com/lf-edge/eve/blob/master/pkg/pillar/hypervisor/kvm.go#L290
https://github.com/lf-edge/eve/blob/master/pkg/pillar/hypervisor/kvm.go#L502
https://github.com/lf-edge/eve/blob/master/pkg/pillar/hypervisor/kvm.go#L516
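
A minimal Python sketch of the pieces involved, assuming QEMU is driven over its QMP monitor on a unix socket. The helper names, the x86_64 binary, the socket path handling and the raw image format are my assumptions for illustration; the sketch only builds the command line and the QMP messages and does not start QEMU or open the socket.

```python
import json


def qemu_argv(image, qmp_socket):
    """Build a QEMU command line that starts paused and survives guest exit."""
    return [
        "qemu-system-x86_64",   # assumption: x86_64 guest
        "-S",                   # do not start the CPUs until told to via the monitor
        "-no-shutdown",         # keep QEMU around after the guest shuts down
        "-qmp", f"unix:{qmp_socket},server,nowait",
        "-drive", f"file={image},format=raw",  # assumption: raw disk image
    ]


def qmp_resume_messages():
    """QMP messages to send, in order, to finish the handshake and resume."""
    return [json.dumps({"execute": "qmp_capabilities"}),
            json.dumps({"execute": "cont"})]


def shutdown_reason(line):
    """Return the reason from a QMP SHUTDOWN event line, or None otherwise."""
    event = json.loads(line)
    if event.get("event") != "SHUTDOWN":
        return None
    # Older QEMU versions may omit the reason field from the event data.
    return event.get("data", {}).get("reason", "unknown")
```

Because QEMU stays alive under -no-shutdown, whoever holds the QMP socket can still read the SHUTDOWN event and record the reason before quitting QEMU explicitly.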

deitch commented 4 years ago

I go offline for 25 hours, and I miss whole chunks of conversation! :-)

I won't respond to everything, just a few key points:

agree that this is very much "a thing", but let's also agree to keep this part of the workflow out of scope for what we're discussing on this particular issue, ok?

Sure. That "stage 1" part really belongs in lf-edge/eve and not lf-edge/eden anyways, but it should be an explicit goal that we are after.

Agreed on both of your points, but for now -- until upgrade of individual packages is possible -- we're still sticking with rootfs update as a sort of "workaround" for enabling this part of the workflow. Sounds good?

Sure. But I will open a tracking issue on eden - that likely will be around for quite some time - so we have a handle that it is a target. Doing it now.

However, I had this crazy idea of building adam as an EVE package (yup -- no mistake -- you've heard me right). Then when EVE starts it actually connects to a local instance of Adam (running on the very same device EVE is running) and it connects over localhost

Oh, this is super interesting. Let's get this into a separate issue in the eve or adam repo? Actually two issues.

deitch commented 4 years ago

Dang, I hit "Comment" before I was done. More coming.

deitch commented 4 years ago

Nah, on second thought, I will hold my fire for now. We got lots to absorb right here.

sadov commented 4 years ago

Fixed at PR50