coreos / fleet

fleet ties together systemd and etcd into a distributed init system

systemd oneshots #562

Open bcwaldon opened 10 years ago

bcwaldon commented 10 years ago

We need real support for systemd oneshot units. Not sure if this is different from the concept of a "batch job".

jonboulle commented 10 years ago

To add a little more colour to this:

Currently, when a oneshot job with targetState=launched (i.e. started by a user) completes, no action is taken by fleet; the targetState of the Job remains launched. If the machine on which it ran happens to go down, fleet will reschedule the job to another machine and start it again (since the targetState is still launched).

We could react to the event from systemd indicating that the job has completed running, but it's unclear what the appropriate behaviour would be. Arguably changing targetState to loaded would provide the desired semantics (the job would get rescheduled but not automatically started again), but we are loath to have the fleet daemon manipulating targetStates, since that is really something which should only be set by users.

And to be clear, we absolutely need to reschedule oneshot jobs when their host machine goes down, whether they have completed or not, because they may be triggered by other units (e.g. timer units); it's perfectly sensible in fact for a oneshot to never have targetState=launched at all.
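
Roughly, the current reconciliation behaviour amounts to this (an illustrative Go sketch, not fleet's actual code; the type and helper functions are hypothetical stand-ins for the engine/agent):

type Unit struct {
    Name        string
    TargetState string // "inactive", "loaded" or "launched"; only users set this
}

// Hypothetical stand-ins for the engine/agent operations.
func pickMachine(u Unit) string { return "some-machine" }
func loadOnto(m string, u Unit) {}
func startOn(m string, u Unit)  {}

// A completed oneshot still has TargetState "launched", so if its machine
// goes down the engine reschedules it and the agent starts it again.
func reconcile(u Unit, runningSomewhere bool) {
    if u.TargetState == "launched" && !runningSomewhere {
        m := pickMachine(u)
        loadOnto(m, u) // deliver the unit file to the chosen machine
        startOn(m, u)  // re-runs the oneshot even if it already completed once
    }
}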

adamalex commented 10 years ago

I have a similar use case, where I am using a timer to trigger another service that launches a daily backup. I see the same behavior when using oneshot, and without oneshot the service stays "running" even though the task completes after just a few seconds. Either way it seems the service can't be triggered again by the timer unless it is manually stopped. Is there a combination of settings that currently works to allow a recurring utility task? Ideally, once per day the task would be triggered by the timer and run on any available cluster machine.

jonboulle commented 10 years ago

@adamalex I'm a bit confused by this part

Either way it seems the service can't be triggered again by the timer unless it is manually stopped.

Do you have a test case I can use to reproduce?

adamalex commented 10 years ago

I did some more testing and found that the behavior you quoted above happens only when I fleetctl start a unit outside of the timer. In that case it works just as you described, remaining in the launched state, and an additional fleetctl start has no effect unless a fleetctl stop is issued first. From rereading your messages above, this seems to be known current behavior.

The good news that came out of my testing: when I only use timers to launch cron-like jobs (avoiding fleetctl start on the services themselves), it all works great!

I am using these units for testing: https://gist.github.com/adamalex/b5fb7f6b42caba4c3413
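
For anyone reading without the gist handy, the pair looks roughly like this (a sketch reconstructed from the status output below, so the names and ExecStart match; the timer schedule and the [X-Fleet] section are assumptions):

# echooneshot.service
[Unit]
Description=echooneshot

[Service]
Type=oneshot
ExecStart=/usr/bin/echo hello world

# echooneshot.timer
[Unit]
Description=echooneshot

[Timer]
OnCalendar=*-*-* *:*:00

[X-Fleet]
MachineOf=echooneshot.service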

The test timers fire the test services every minute at ::00 and appear as below between runs. They run every minute on schedule. Also, when a machine is removed from the cluster any units on that machine automatically migrate to another machine as expected. This fully solves the use case I was going after, which I now understand may differ from the focus of the originally-reported issue here.

core@core-01 ~/share/services $ fleetctl list-units
UNIT                 STATE     LOAD    ACTIVE    SUB      DESC         MACHINE
echooneshot.service  loaded    loaded  inactive  dead     echooneshot  546f1fab.../172.17.8.101
echooneshot.timer    launched  loaded  active    waiting  echooneshot  546f1fab.../172.17.8.101
echosimple.service   loaded    loaded  inactive  dead     echosimple   546f1fab.../172.17.8.101
echosimple.timer     launched  loaded  active    waiting  echosimple   546f1fab.../172.17.8.101

core@core-01 ~/share/services $ fleetctl status *
● echooneshot.service - echooneshot
   Loaded: loaded (/run/fleet/units/echooneshot.service; linked-runtime)
   Active: inactive (dead) since Tue 2014-07-01 03:43:00 UTC; 24s ago
  Process: 1109 ExecStart=/usr/bin/echo hello world (code=exited, status=0/SUCCESS)
 Main PID: 1109 (code=exited, status=0/SUCCESS)

Jul 01 03:43:00 core-01 systemd[1]: Starting echooneshot...
Jul 01 03:43:00 core-01 systemd[1]: Started echooneshot.
Jul 01 03:43:00 core-01 echo[1109]: hello world

● echooneshot.timer - echooneshot
   Loaded: loaded (/run/fleet/units/echooneshot.timer; linked-runtime)
   Active: active (waiting) since Tue 2014-07-01 03:39:57 UTC; 3min 27s ago

Jul 01 03:39:57 core-01 systemd[1]: Starting echooneshot.
Jul 01 03:39:57 core-01 systemd[1]: Started echooneshot.

● echosimple.service - echosimple
   Loaded: loaded (/run/fleet/units/echosimple.service; linked-runtime)
   Active: inactive (dead) since Tue 2014-07-01 03:43:00 UTC; 24s ago
  Process: 1108 ExecStart=/usr/bin/echo hello world (code=exited, status=0/SUCCESS)
 Main PID: 1108 (code=exited, status=0/SUCCESS)

Jul 01 03:43:00 core-01 systemd[1]: Starting echosimple...
Jul 01 03:43:00 core-01 systemd[1]: Started echosimple.
Jul 01 03:43:00 core-01 echo[1108]: hello world

● echosimple.timer - echosimple
   Loaded: loaded (/run/fleet/units/echosimple.timer; linked-runtime)
   Active: active (waiting) since Tue 2014-07-01 03:39:57 UTC; 3min 27s ago

Jul 01 03:39:57 core-01 systemd[1]: Starting echosimple.
Jul 01 03:39:57 core-01 systemd[1]: Started echosimple.

tclavier commented 10 years ago

In addition, for long-running oneshot services that are scheduled many times, for example a continuous integration system, I want the following:

The third step is probably an option :-D (some systems want to stack calls).

bcwaldon commented 10 years ago

Relevant - https://github.com/coreos/fleet/issues/240

OAGr commented 9 years ago

Any update on this? I'm also using oneshot services for a continuous integration system. Right now they ping back a remote server on completion, which sends a message to fleet to destroy the service.

bcwaldon commented 9 years ago

@OAGr tl;dr: no

Support for oneshots can't easily be layered in, as fleet's operation is built around a declarative model. For example, "I want foo.service to be launched somewhere". I'm more than happy to work with someone to figure out how we can properly support oneshots, but it will be nontrivial.

ngauthier commented 9 years ago

Just to give more feedback and info here: we're also using oneshots as part of CI, and we're driving them via the HTTP API.

To find out if a oneshot has started, we can look at:

// api is a fleet HTTP API client; service is the unit name, e.g. "myoneshot.service"
unit, _ := api.Unit(service)
fmt.Println(unit.CurrentState) // fleet's view: "inactive", "loaded" or "launched"

However, that gives no distinction between a unit that has started and one that has finished. For that, we look at the unit's systemd state directly:

service := "myoneshot.service"

// Fetch the systemd-level state of every unit in the cluster.
unitStates, _ := api.UnitStates()

// Find the entry for our unit.
var unitState *schema.UnitState
for _, s := range unitStates {
    if s.Name == service {
        unitState = s
        break
    }
}

if unitState == nil {
    // the cluster doesn't know about this unit at all
    return errors.New("unit not in state list")
}

if unitState.SystemdActiveState == "inactive" {
    // the oneshot has run to completion and exited
} else {
    // still active (running), or in some other state such as "failed"
}

This is far from optimal, because we have to fetch every service's state, then find our service in the list and check its systemd state.

It would be awesome to extend Unit in the schema with a systemd state attribute, so that the api.Unit(service) call would return the systemd state as well.

Or, as mentioned here, have a fleet state for a completed oneshot service or something similar.

Edit: please note I'm not checking errors in this example for brevity. Check your errors! :-)

bcwaldon commented 9 years ago

@ngauthier I don't believe the suggestion to provide unit state in the Unit() call is relevant to the issue of native oneshot support... unless I'm missing something.

As I stated above, fleet is sort of designed against oneshots right now. Do you have an idea of how oneshots could be better supported through the fleet internals?

ngauthier commented 9 years ago

I kind of considered it part of native oneshot support, because the fleet status doesn't reflect the actual oneshot status (systemd's representation of oneshot status is richer than fleet's, so it can represent a completed oneshot, which is why I use systemd's state). The main reason I posted the example was that I ended up on this thread while trying to work with oneshots in fleet, so I hoped it would help others.

As far as fleet's internals go, I am probably too inexperienced to suggest an amazing solution, but one thought I had was to show a completed oneshot as being in the loaded state when it's done, not in the active state. So it ran, it's done, and now it's just loaded. But I guess it would then be hard to tell whether it ran at all.

xied75 commented 9 years ago

Dear all, I would like to verify a use case with you:

I've got a cluster up and running, and my fleet unit needs to git clone a private repo, which means I need to put a private key for root on each of the nodes so that they can access my GitLab via a deploy key.

In an ideal world you would do this in cloud-config.yaml, assuming that you know everything needed before you deploy the cluster. But you always find later that something is missing from the cluster.

I was also wondering whether this should rather be a job for Ansible/Puppet, in that I want to control my nodes to be in a specific 'state'.

But I tend to reduce the tooling needed so as to have a unified story, i.e. I'll write a simple fleet unit that dumps the private key for root. That unit would apparently be a 'oneshot', and global. And it should disappear afterwards.

Even further, what if we could do

fleetctl run whatever-arbitrary-bash

Then fleetd will magically turn this into a oneshot unit and fire it.
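
(Hypothetically, such a command could just synthesize and submit a throwaway unit. Nothing like this exists in fleet today; this is only what the sugar might expand to, with a made-up unit name:)

# run-1234.service -- hypothetical unit generated by a "fleetctl run" command
[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'whatever-arbitrary-bash'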

djmaze commented 9 years ago

Just as a side note, because this does not seem to be widely known: you can make the server re-evaluate a cloud-config file without rebooting by using the coreos-cloudinit binary. This is particularly useful if you have a config drive which can be modified on the fly (e.g. with qemu on the host):

sudo coreos-cloudinit --from-configdrive=/media/configvirtfs

So you just need a way to fire that command on all servers. It could be made a global oneshot unit.
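
A sketch of such a global unit (the unit name is made up; the ExecStart matches the command above, and Global=true is fleet's option for running a unit on every machine):

# reread-cloudconfig.service (illustrative)
[Unit]
Description=Re-evaluate cloud-config from the config drive

[Service]
Type=oneshot
ExecStart=/usr/bin/coreos-cloudinit --from-configdrive=/media/configvirtfs

[X-Fleet]
Global=true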

I admit this does not help with most cloud providers though, as they don't let you modify the cloud config of a running server.

rdark commented 9 years ago

@xied75 - I solved that problem by chaining additional cloudinit scripts from an external URL (I actually reached it first by hitting the maximum user-data size on AWS, but it solves this issue also). In this case the scripts are stored in an S3 bucket reached via a VPC endpoint. This is what's in my user-data-populated cloudinit:

- name: cloudinit-includes.service
  command: start
  content: |
    [Unit]
    Description=Install External Cloudinit Scripts
    ConditionPathExists=/etc/custom_environment
    [Service]
    EnvironmentFile=/etc/custom_environment
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_cloudinit-includes.yaml

I have a bucket for each cluster, the content of which is stored in git, with a branch for each cluster that corresponds to a bucket. The bucket gets populated via a git post-receive hook, so I can roll out changes in a semi-controlled fashion. The above URL gets polled at every boot (or restart of cloudinit) and pulls down a chainloader for other cloud-init things:

#cloud-config
coreos:
  units:
    - name: install-cloudinit-includes.service
      command: start
      content: |
        [Unit]
        Description=Cloudinit External Units
        ConditionPathExists=/etc/custom_environment
        [Service]
        Type=oneshot
        EnvironmentFile=/etc/custom_environment
        RemainAfterExit=yes
        ExecStart=/usr/bin/bash -c '/usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/install-ca-cert.pem.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/write_files-iptables-rules-save.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_update-custom-ca-certificates.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_install_etcd_backup.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_install_jq_lookup.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_install_etcdctl_hosts_lookup.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_install_etcdctl_get_coreos_hosts.yaml && \
                                    /usr/bin/coreos-cloudinit -from-url ${CLOUDINIT_INCL_URL}/coreos/common/cloud_config-coreos_units_iptables-restore-custom.yaml'

lynchc commented 8 years ago

+1 @xied75

The idea of having fleetctl run whatever-arbitrary-bash would solve all of this. I use a lot of oneshots for CI and deploying. Having it take some filtering on machine metadata as an advanced option would be really nice. I can't imagine this would be too difficult, and it sounds like it would solve all of these problems, no?

jgunthorpe commented 8 years ago

I would also like to see oneshot units work better.

@bcwaldon - All declarative models work by forcing a current state to a desired state, so it appears the missing element for oneshot support is a persistent log of oneshot completion timestamps (i.e. the state component; oneshots cannot use the 'current system' as their state).

A reasonable model is for fleet to monitor the oneshot service after starting it. When it sees that the service has completed, it records the completion timestamp in a persistent global log (etcd, I guess?). The declarative forcing function for oneshots is then "start if unit X never completed after requested start time Y, else stop". Real time is used to provide a fence that allows the user to trigger the oneshot repeatedly.

That is all pretty simple for singleton units (the 'log' is just a single value); global units are a bit more complex and require storing a map of machine ID to completion time.
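
A minimal sketch of that forcing function in Go (illustrative only; the names and the etcd-backed log are assumptions, not fleet code, and the standard time package is assumed):

// completionLog stands in for a persistent record (e.g. kept in etcd) of
// when each unit last completed; for global units the value would instead
// be a map of machine ID to completion time.
type completionLog map[string]time.Time

// shouldStart reports whether the oneshot still needs to run: it has never
// completed, or its last completion predates the user's requested start time.
func shouldStart(log completionLog, unit string, requestedAt time.Time) bool {
    completedAt, ok := log[unit]
    return !ok || completedAt.Before(requestedAt)
}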