canonical / multipass

Multipass orchestrates virtual Ubuntu instances
https://multipass.run
GNU General Public License v3.0

RFC: `multipass config/get/set` structure and user experience #756

Open · Saviq opened 5 years ago

Saviq commented 5 years ago

#307 has some details on the specifics of the "custom repository" feature, but I wanted to collect the user experience and the imaginable extent of the configuration structure here.

The CLI

Three new commands would be introduced, all operating on YAML-formatted data, with . (periods) separating depth levels. All operations need to be atomic and return validation errors.

multipass config [--no-defaults] [--expand] [<subtree>]
# opens an editor in interactive mode
# or prints the YAML of the subtree to standard out in non-interactive mode
# if `--no-defaults` given, only explicitly configured values will be shown
# some parts of the tree may not be shown by default (think remote and instance configuration), passing `--expand` would mean that the whole tree is shown
# on errors returns to the editor with annotation about validation or configuration issues

multipass get <key>
# prints the value of the requested configuration key

multipass set [<key>=<value> …]
# sets the values of the given configuration keys
# or accepts a YAML of a subset of the configuration tree on standard input
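
For illustration, a hypothetical session with these commands, using keys from the tree proposed below (the exact output format is an assumption):

$ multipass get client.default-remote
local
$ multipass set client.launch-defaults.cpus=2
$ multipass config client.launch-defaults
image: default
cpus: 2
disk: 5GB
memory: 2GB
cloud-init: {}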

The structure

To avoid unnecessary nesting, I propose that the top-level keys be only client and the remote names, with each remote's configuration nested under its name.

client:
  default-remote: local       # name of one of the remotes configured below
  primary-name: primary       # name of the primary instance

  launch-defaults:
    image: default            # see `multipass find` for available images or use file:// or http:// URLs
    cpus: 1                   # recommended: at most one fewer than your host's core count
    disk: 5GB                 # this is the maximum the instance can use, so go big
    memory: 2GB               # maximum the instance can use, shared between host and instances
    cloud-init: {}            # see http://cloudinit.readthedocs.io/en/latest/topics/examples.html
    mounts:
      /local/path: remote/path
      /other/path:
        target: other/remote/path
        uid_maps:
          "*": default        # "*" for "all", "default" for the default user's UID inside the instance,…
        gid_maps:
          "*": default        # …or the numeric ID

  images:
    example:                  # this is the image name to use for `multipass launch`
      aliases: [ex]           # an optional list of alternative names
      image: default          # defaults to this image's key above
      cpus: 2                 # overrides the default launch options from above

    lts:                      # because the `lts` alias exists on the remote…
      disk: 25GB              # …this only overrides the disk size when doing `multipass launch lts`
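
  # To make the precedence concrete, a hypothetical resolution (illustrative comment,
  # assuming the merge order described above): `multipass launch ex` resolves the `ex`
  # alias to `client.images.example`; its `cpus: 2` overrides the launch default of 1,
  # while disk (5GB) and memory (2GB) come from `client.launch-defaults`.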

local:                        # configured on first start
  address: unix:/run/multipass_socket  # platform default

  # below is remote configuration
  listen-address: unix:/run/multipass_socket
  driver: qemu                # one of qemu, hyperkit, hyper-v, libvirt - remote platform default
  network: 10.0.6.0/24        # needs extending for bridging and IPv6
  proxy:
    http: proxy://address     # will be used in the instances unless overridden with `--cloud-init`
    https: proxy://address

  default-stream: release     # when using this remote, this will be the default stream used to find images
  streams:                    # base URLs to image remotes
    release: https://cloud-images.ubuntu.com/releases          # may need to be expanded when v3 streams exist
    daily: https://cloud-images.ubuntu.com/daily
    minimal: https://cloud-images.ubuntu.com/minimal/releases

  images:                     # this has the same format as `client.images` above, with lower precedence
    core:
      aliases: [core16]
      image: http://cdimage.ubuntu.com/ubuntu-core/16/stable/current/ubuntu-core-16-amd64.img.xz
    core18:
      aliases: [core18]
      image: http://cdimage.ubuntu.com/ubuntu-core/18/stable/current/ubuntu-core-18-amd64.img.xz
    snapcraft:core16:
      description: "Snapcraft builder for Core 16"
      aliases: [snapcraft:core]
      image: http://cloud-images.ubuntu.com/minimal/releases/xenial/release/ubuntu-16.04-minimal-cloudimg-amd64-disk1.img
      kernel: http://cloud-images.ubuntu.com/releases/xenial/release/unpacked/ubuntu-16.04-server-cloudimg-amd64-vmlinuz-generic
      initrd: http://cloud-images.ubuntu.com/releases/xenial/release/unpacked/ubuntu-16.04-server-cloudimg-amd64-initrd-generic
    snapcraft:core18:
      description: "Snapcraft builder for Core 18"
      image: http://cloud-images.ubuntu.com/minimal/releases/bionic/release/ubuntu-18.04-minimal-cloudimg-amd64-disk1.img
      kernel: http://cloud-images.ubuntu.com/releases/bionic/release/unpacked/ubuntu-18.04-server-cloudimg-amd64-vmlinuz-generic
      initrd: http://cloud-images.ubuntu.com/releases/bionic/release/unpacked/ubuntu-18.04-server-cloudimg-amd64-initrd-generic

  # below are configured instance definitions
  instance-name:             # this is a configured instance on the `local` remote
    name: ~                  # this allows renaming the instance
    cpus: 4                  # this changes the number of CPUs configured for the instance
    memory: 2GB              # this changes the amount of memory available to the instance
    gpus: 1                  # give the instance a single random available GPU

other-remote:
  address: 1.2.3.4:1234      # connecting to a remote might require providing a passphrase
  listen-address: 0.0.0.0:1234
  driver: hyperkit           # platform default, changing may not be possible

  instance-name:             # this is a configured instance on the `other-remote` remote
    cpus: 4                  # this changes the number of CPUs configured for the instance
    memory: 2GB              # this changes the amount of memory available to the instance
    gpus:                    # give the instance the listed GPUs
    - 0:1:1                  # GPU identifier, format platform-dependent
    - 0:1:2
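
Keys on another remote would be addressed the same way. For illustration (hypothetical output, values from the tree above):

$ multipass get other-remote.driver
hyperkit
$ multipass set other-remote.instance-name.cpus=8
# the client forwards the change to 1.2.3.4:1234, which validates and applies it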

That's the extent of configuration that I can come up with right now, please have a look through for errors, missing bits or plain stupidity. ;)

Doubts:

ricab commented 5 years ago

I think this is quite a big deal and I am still trying to wrap my head around it, so sorry if the comments and questions below sound confused or contradictory.

Overall, it is nice to have a long view of where to go, but perhaps it would be more manageable to split things into elementary steps (identifying dependencies/precedences early and then considering each one in detail).

Perhaps we don't need to resolve the end-goal in detail for now. In particular, the exact final schema is less of an issue to begin with IMO. Personally my concerns go more toward how the feature would work (whatever the exact config contents). Here are some questions and things to think about:

  1. general symbolic representation/structure
    • [x] this seems to be settled as some form of property tree
  2. persisted format
    • [x] yaml, also looks consensual
    • [ ] how many files really? full copies or each entity with its own file and its own config? (a sketch of both layouts follows this list)
  3. format in memory - do we...
    • [x] disperse information over attributes of program entities (e.g. daemon.backend = libvirt, foo.bar.doesBlah = true)?
    • [ ] or keep information in a centralized DB that all entities refer back to?
      • [ ] keep yaml objects in mem and keep referring back to them?
      • [ ] transform it to some other intermediate format?
        • [ ] custom? (like current DaemonConfig)
        • [ ] general? (e.g. boost ptrees)?
          • [ ] how does this affect type-checking and validation?
  4. Instances are currently persistified by the daemon, so mixing in instances with the config means the daemons are not only consumers but also producers
    • [ ] do we change that premise?
    • [ ] I suppose the clients would still be responsible for persistifying their config? Do they all write to the same file? So we need file locking?
  5. How do we drive "online" daemon updates (while running)?
    • [ ] when using the client to change the config, I suppose we could just notify the appropriate "remotes". But what if the backend file is edited directly? do daemons poll in some sort of loop? careful with impacts on thread-safety and atomicity...
  6. at what point in the chain of consuming a yaml do we approach validation?
    • [x] ideally ASAP upon reading (I suppose) but we may not have all necessary info to judge at that point. Some things may depend on state or other information external to configuration and reading entity (e.g. that port is already taken; that gpu config is invalid in this system...);
    • [x] I suppose the client would have to wait for confirmation from the daemon(s)?
  7. the configuration needs to be conveyed over the network, but still written to locally
    • [ ] again, how is everything kept in sync? which is the authoritative version?
    • [ ] how do daemons inform clients of changed configs? do they initiate communications? inform them on next contact? have to keep track of what clients know what?
  8. So we want multiple remotes... We know we can also have multiple clients...
    • [ ] so we have a distributed n-to-n relationship? and we need to implement a (small-scale) file-backed DB that needs to be replicated and atomically read and written by every participant? Might we be better off using some existing NoSQL solution rather than implementing it all ourselves?
  9. concerning the config cli, it is different from both lxc, git... Is it modeled after something else? Otherwise wouldn't it be better to stick to something existing?
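
A sketch of the two layouts from point 2, purely illustrative (all paths are assumptions):

# each entity with its own file and its own config:
~/.config/multipass/client.yaml       # the client.* subtree only
/var/lib/multipass/daemon.yaml        # the local remote's subtree only

# full copies:
~/.config/multipass/config.yaml       # the entire tree, client's copy
/var/lib/multipass/config.yaml        # the entire tree, daemon's copy (the two need syncing)
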
Saviq commented 5 years ago

Perhaps we don't need to resolve the end-goal in detail for now. In particular, the exact final schema is less of an issue to begin with IMO. Personally my concerns go more toward how the feature would work (whatever the exact config contents). Here are some questions and things to think about:

Sure, this was a brain-dump for determining the user experience of all this - for that we need to look ahead so we don't step in the wrong direction.

I wanted this issue to only deal with how the data is presented to the user in a CLI environment; I don't put any requirements on how it is handled internally by either the client or the daemon. I have omitted your comments that deal with that.

  1. general symbolic representation/structure
    • [x] this seems to be settled as some form of property tree

Yes, that is my proposal and the direction a lot of our projects are going.

  4. Instances are currently persistified by the daemon, so mixing in instances with the config means the daemons are not only consumers but also producers

Not sure what you mean here. Adding an instance entry in the YAML above would be an error (instance not found), removing it would be a no-op on the instance configuration.

  6. at what point in the chain of consuming a yaml do we approach validation?

Each entity should be responsible for validating their own piece of the configuration (the client validates the `client.` subtree; each remote their own).
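
For illustration, that split might surface like this (hypothetical error messages):

$ multipass set client.default-remote=nonexistent
client: no such remote "nonexistent"        # rejected by the client, which owns `client.*`
$ multipass set local.driver=bogus
local: unsupported driver "bogus"           # rejected by the `local` remote, which owns its subtree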

  7. the configuration needs to be conveyed over the network, but still written to locally

The YAML is just a presentation format for the user. To show a remote's configuration, the client has to ask the remote to provide its current config. No remote config is persisted by the client; it is just sent over the wire.
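
A sketch of that flow (the transport is an assumption; only the presentation is proposed here):

$ multipass config other-remote
# 1. the client connects to other-remote at its configured address
# 2. the remote serializes and returns its current configuration subtree
# 3. the client renders the subtree as YAML; nothing is persisted client-side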

  9. concerning the config cli, it is different from both lxc, git... Is it modeled after something else? Otherwise wouldn't it be better to stick to something existing?

It is definitely inspired by LXD, but early on we decided to not have nested commands (lxc config device edit…) to keep our CLI clean. What kind of changes would you suggest?

ricab commented 5 years ago

Thanks for the reply. OK, perhaps I missed the scope here. I thought this was more or less a go for implementation and I was grasping for the architecture part. Anyway, you already clarified some things.

Yes, that is my proposal and the direction a lot of our projects are going.

Sure, sounds like the way to go.

Not sure what you mean here [...]

Oh, I was also under the impression that this config would end up replacing the current image and instance DBs. That would include state info not necessarily driven by the client, which is why I mentioned the daemon as producer. If this is only strict config, that makes it simpler (which often implies better :wink:)

Each entity should be responsible for validating their own piece of the configuration [...]

OK, I was also thinking about validation within each entity. Immediate validation means the effects of a `multipass set` need to be synchronous. It can't be just writing info somewhere that objects later read as needed, so this impacts my point 3 above (the info would have to be propagated).

[...] No remote config is persisted by the client [...]

OK, so we're separating client/remote configs, whatever the backend. I wasn't sure you wanted that.

It is definitely inspired by LXD [...] What kind of changes would you suggest?

git config is quite similar but nesting (only) the commands (not the contents). So getting, setting, unsetting, all are achieved with git config (+ appropriate option). And there is no '=' between key and value. If we followed that, we could do

* `multipass config <key>` to get a yaml tree
* `multipass config <key> <val>` to set
* `multipass config --add <key> <val>` to add
* `multipass config --get <key>` to get, etc.

This would make it clear we were talking about configuration (plain get/set could be less clear). It would allow grouping config documentation under a single command and it would probably be more familiar.
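
For reference, the git invocations being alluded to look like this (standard `git config` usage):

git config user.name                  # get
git config user.name "Jane Doe"       # set
git config --add remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*"   # add to a multi-valued key
git config --unset user.name          # unset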

Saviq commented 5 years ago

git config is quite similar but nesting (only) the commands (not the contents). So getting, setting, unsetting, all are achieved with git config (+ appropriate option). And there is no '=' between key and value. If we followed that, we could do

* `multipass config <key>` to get a yaml tree
* `multipass config <key> <val>` to set
* `multipass config --add <key> <val>` to add
* `multipass config --get <key>` to get, etc.

The `--add` / `--get` seem to me like nested commands even if they pretend not to be… What happens if you `multipass config --add <key> <val> --get <key> --add <key> <val>`? Suddenly options actually become positional arguments…

This would make it clear we were talking about configuration (plain get/set could be less clear). It would allow grouping config documentation under a single command and it would probably be more familiar.

Maybe we can still do with config alone…

ricab commented 5 years ago

The `--add` / `--get` seem to me like nested commands even if they pretend not to be

They are, that is why I said nesting (only) the commands (not the contents). I thought it could be worth considering.

What happens if you `multipass config --add --get --add`

They would be mutually exclusive (see git help config).

Maybe we can still do with config alone…

Hmm, I prefer the original then. To be fair, I have meanwhile noticed that get/set would be consistent with snap.

ricab commented 5 years ago

@Saviq: how do you feel about replacing client.primary-name with client.petenv-name? I would favor it, given that "petenv" is already spread throughout code ids, while "primary" is isolated as the current value of the petenv_name constant.

Saviq commented 5 years ago

@ricab it's for presentation to the user; we'd have to explain to the user what it means, and why we call it primary there but petenv here.

ricab commented 5 years ago

Right, good point.

Saviq commented 4 years ago

We've been iterating on how an "integrate with $app" configuration could work, and I'd like to document what we're thinking:

client:
  apps:
    windows-terminal:
      profiles: primary  # or "none", "all"

    windows-terminal-private:  # Windows Terminal with a custom settings path
      type: windows-terminal
      settings-file: C:\some\path\settings.json
      profiles: all

    iterm2:
      profiles: primary

    lxc:
      remotes: all
      prefix: mp-

    lxc-beta:  # a parallel install of LXD from the `beta` channel
      remotes: all
      command: snap run lxd_beta.lxc

  gui:
    terminal-app: windows-terminal-private  # pointer at one of client.apps above

Each app could have a distinct set of properties that can be extended over time. The `type` would default to the key in the `client.apps` map, and would route the settings below to the right integration points.
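
A minimal sketch of that defaulting (`iterm2-beta` is a hypothetical second install):

client:
  apps:
    iterm2:              # no `type` given, so it defaults to the key, i.e. `iterm2`
      profiles: primary
    iterm2-beta:
      type: iterm2       # an explicit type routes these settings to the iTerm2 integration
      profiles: all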

ricab commented 4 years ago

From further discussion on IRC, it would be nice to support a general terminal, where the user could specify the executable/command, and a linux-terminal, where the user could specify a terminal that was known to the system via a desktop file.

Saviq commented 4 years ago

From further discussion on IRC, it would be nice to support a general terminal, where the user could specify the executable/command, and a linux-terminal, where the user could specify a terminal that was known to the system via a desktop file.

Sure, the above is not meant to be exhaustive :)

ricab commented 4 years ago

Sure, the above is not meant to be exhaustive :)

I know, just wanted to record it for the future :slightly_smiling_face:

Saviq commented 2 years ago

So here's a case that I don't think we've fully considered:

$ multipass launch --name driver
Launched: driver

$ multipass get --keys
client.gui.autostart
client.gui.hotkey
client.primary-name
local.bridged-network
local.driver
local.driver.cpus
local.driver.disk
local.driver.memory

I think we need to plan for `<remote>.instances.<instance-name>.*` to be the disambiguated key (e.g. for `multipass config`), with `local.<instance-name>.*` being a shorthand where not ambiguous (e.g. `multipass get`/`set`).
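
Concretely, a hypothetical session following that plan (how the ambiguous shorthand is handled remains open):

$ multipass get local.instances.driver.cpus   # the disambiguated key always works
1
$ multipass get local.driver                  # still the daemon's driver setting
qemu
$ multipass get local.driver.cpus             # ambiguous: clashes with the `local.driver` setting,
                                              # so it would have to be rejected or disambiguated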