etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0
47.83k stars 9.77k forks source link

v3api: Atomically creating and attach a lease; LeaseKeepAlive TTL #4146

Closed barakmich closed 8 years ago

barakmich commented 8 years ago

Two parts:

1) What's the intended method for creating a key and a lease in a single transaction?

2) Why does a LeaseKeepAliveRequest not have a TTL field, telling it how long the new TTL should be?

xiang90 commented 8 years ago

@barakmich

1) What's the intended method for creating a key and a lease in a single transaction?

Previously you cannot attach lease and create the key atomically. We have a proposal about doing TxnWithLease, but we feel it is complicated. So we just changed the API, Now key has a lease field. You can creating a key and attach it to a lease in one transaction. https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto#L93. But you still need to create lease first. Do you have a use case in mind?

2) Why does a LeaseKeepAliveRequest not have a TTL field, telling it how long the new TTL should be?

Lease will be renewed with the advisory TTL (or longer) that the user specified when creating the lease. We might add a RPC to modify the TTL of a lease if needed.

barakmich commented 8 years ago

@xiang90:

@barakmich

1) What's the intended method for creating a key and a lease in a single transaction?

Previously you cannot attach lease and create the key atomically. We have a proposal about doing TxnWithLease, but we feel it is complicated. So we just changed the API, Now key has a lease field. You can creating a key and attach it to a lease in one transaction. https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto#L93. But you still need to create lease first. Do you have a use case in mind?

You can create a key and attach a lease in one transaction, true, but you can't create the lease at the same time. This is what I'd like to do. Right now, the pattern seems to be -> Create Lease <- Get Lease ID arbitrary delay -- (this is the potential for bugs. The lease may be over.) -> Set key with Lease ID I just recieved <- Key set. Or already deleted. Or ????

The use case is simple. I want to emulate TTL on keys with leases. Leases are better, clearly, but if the two are associated, I should be able to create both at the same time.

A proposal here would be to have some flag or value on setting a key that also creates a lease for it.

2) Why does a LeaseKeepAliveRequest not have a TTL field, telling it how long the new TTL should be?

Lease will be renewed with the advisory TTL (or longer) that the user specified when creating the lease. We might add a RPC to modify the TTL of a lease if needed.

If you're keeping it alive with a keep alive request, then you should be able to say how long to keep it alive for. Though not setting this value would likely be fine to renew with the advisory value.

xiang90 commented 8 years ago

The use case is simple. I want to emulate TTL on keys with leases. Leases are better, clearly, but if the two are associated, I should be able to create both at the same time.

We were trying to avoid the two API have strong relationship. Probably we can add a put option to create a lease if it does not exist before.

heyitsanthony commented 8 years ago

1) can be solved by reporting the PUT failed if the lease is already expired 2) arbitrary user-defined term lengths seems to go against the spirit of leases as a fault tolerance mechanism / false sharing limiter

xiang90 commented 8 years ago

@barakmich

-> Create Lease <- Get Lease ID arbitrary delay -- (this is the potential for bugs. The lease may be over.) -> Set key with Lease ID I just recieved <- Key set. Or already deleted. Or ????

In this case, the key set with an expired leaseID should fail with lease not exist. Then the user knows he was experiencing a long delay.

barakmich commented 8 years ago

1) PUT failing works, I suppose.

@heyitsanthony:

2) arbitrary user-defined term lengths seems to go against the spirit of leases as a fault tolerance mechanism / false sharing limiter

I'd argue that the notion is you're sharing the lease ID, and if you're running the same code you already know what the standard ttl is; and if you don't, you can always send it without changing it. But this is a nice to have, I guess.

nekto0n commented 8 years ago

Probably we can add a put option to create a lease if it does not exist before.

Any plans to implement that? My use case is exactly like TTL feature missing in etcdv3. I have a client that either creates a node or updates it. I want to be notified when clients is dead by receiving EXPIRE event from etcd. Leases are a great thing, but in v2 I just could always use Set() with TTL and be good. Can this behavior be implemented in v3?

xiang90 commented 8 years ago

I have a client that either creates a node or updates it. I want to be notified when clients is dead by receiving EXPIRE event from etcd. Leases are a great thing, but in v2 I just could always use Set() with TTL and be good. Can this behavior be implemented in v3?

When the client starts, it creates a lease. All the keys it creates can be attached with this lease. Create a /prefix/clientID-expire key with the lease to notify other clients its death. Does this work for your case?

nekto0n commented 8 years ago

Sorry, I'll rephrase my case. I have a server, which accepts clients periodic status reports. Each report is transformed into a call Set(key, value, ttl) to etcdv2. Thus if client goes silent for too long, my watcher will receive expiration event and can deal with it according to business rules. In v2 every client request results in a single call to etcd. In v3 this use case seems to require several steps (get key, check it's lease, keepalive on this lease, etc). Can v3 API be adjusted to implement this case?

Now I'm thinking, that maybe this is not the right way to organize things. Maybe I should elect a leader , set a watchdog and put every status report in some kind of priority queue to manually expire (delete or explicitly set status "OFFLINE") based on last report time. I.e. implement expiration logic not via etcd ttl, but manually. This does add some complexity, but brings control.