coreos / fleet

fleet ties together systemd and etcd into a distributed init system
Apache License 2.0
2.42k stars 303 forks source link

fleetd: support dynamic metadata #1642

Closed dalbani closed 7 years ago

dalbani commented 7 years ago

Source: https://github.com/coreos/fleet/pull/1077

dongsupark commented 7 years ago

First of all, thank a lot for the rebase. This helps us a lot go on with the feature. Is there any chance to write functional tests to cover this feature?

dalbani commented 7 years ago

I have very little experience with Go and fleet, so if you could give me some directions, that would be very useful.

dongsupark commented 7 years ago

@dalbani It's basically about trying to reproduce an exact test scenario inside the fleet's functional test framework, where systemd-nspawn instances run. Normally you could just make use of existing helpers that are already available under the source directory functional. It might be also a good idea to create a sample unit file under functional/fixtures/units.

One of the examples would be a new functional test TestMetadataOperator, which I wrote last week: https://github.com/coreos/fleet/pull/1632/commits/68028c7e5dcae80013124ff34a3794452770ec58 in https://github.com/coreos/fleet/pull/1632. As the original PR didn't include such a test, I needed to write the test on my own. See also https://github.com/coreos/fleet/blob/master/functional/README.md.

dongsupark commented 7 years ago

Gentle ping. Any updates on functional test? BTW I added this PR to milestone v1.0.0, as it could resolve the issue https://github.com/coreos/fleet/issues/555.

dalbani commented 7 years ago

Oh, dear, I had forgotten with the summer and everything :-) In order to first make my PR in sync with the master branch, I've just committed some RPC-related code. How important is it to implement the related methods in registry/rpc/rpcregistry.go by the way? And I've just noticed that the SemaphoreCI build has failed. Do you know why?

dongsupark commented 7 years ago

@dalbani Thanks for pushing the RPC related code. Good point. I think we should also add the MachineMetadata methods to rpc/registry. As for the semaphoreci failures, every grpc-enabled test is failing. That's interesting. I'd need to have a closer look.

dongsupark commented 7 years ago

@dalbani How about adding into RegistryMux like below:

func (r *RegistryMux) MachineState(machID string) (machine.MachineState, error) {
        return r.getRegistry().MachineState(machID)
}

Wouldn't that resolve the testing error of machine being unable to be found, when enable_grpc=on?

dongsupark commented 7 years ago

I had another look. Unfortunately it turns out, there are still things to be done. For the non-gRPC path, it looks fine. Functional tests run without errors. For the gRPC path, necessary methods in RPCRegistry, such as SetMachineMetadata or MachineState, are not implemented. To support the methods, we would need to add also necessary methods into fleet.proto. Then we could support them also in RPCRegistry. That sounds like a challenging weekend project. ;-)

dongsupark commented 7 years ago

Ah, sorry for noise. On Tuesday I misunderstood something. We don't have to do extra work in protobuf. Actually we just need to tweak registryMux to make it call methods from etcdRegistry. So I think we can simply merge this PR, together with an additional functional test. Tomorrow I'll review this PR again, merge it, and I'll create another PR for a simple registryMux fix as well as a functional test for dynamic metadata.

dalbani commented 7 years ago

I am hardly in position to complain, given that I haven't been to the task of writing a functional test :-( (I have been busy lately with another project even though that's not really an excuse.) But I have a remark regarding the HTTP API -- well, in fact, I'd like to reiterate the remark that I made above: https://github.com/coreos/fleet/pull/1642#r71066179 In type machineMetadataOp struct, I would turn:

    Value     string

into:

    Value     struct {
        Value   string  `json:"value"`
    }

That would keep the ability to enhance the API in the future, for example for a TTL feature mentioned (and proof-of-concept implemented). Otherwise, with simply Value string, API-breaking changes would be required.

By the way, I have "allowed edits from maintainers" on this PR, should you want to edit it directly.

dongsupark commented 7 years ago

@dalbani All right, I'll do it for avoiding API breakage. Thanks!