transparency-dev / trillian-tessera

Go library for building tile-based transparency logs (tlogs)
Apache License 2.0
8 stars 7 forks source link

Surface futures from Add() #182

Closed AlCutter closed 1 week ago

AlCutter commented 2 weeks ago

This PR returns the futures already present within calls to Add(), rather than blocking the caller directly. This allows the user of Tessera to decide what is best for them.

For HTTP-based services, a blocking call to Add isn't really an issue - there's generally a goroutine per request anyway, but for other use-cases having to artificially spin out a bunch of goroutines which call Add only to then need to join them again immediately afterwards is a bit weird. This PR solves that by allowing the caller of the Add method to choose when to resolve the future.

The POSIX example/one-shot is a good example of the complexity this pushes onto users. Exposing the future removes a lot of that, although adds some (but arguably less) complexity of its own.

phbnf commented 1 week ago

I worry that this might push personalities towards returning to clients, without returning the index, which is a step away from my understanding of Tessera's philosophy to steer folks in the right direction, and a step closer to SCTs. It seems to me that the problem this is solving for POSIX, is that it's requesting to sequence multiple leaves at a time. Should Tessera simply support this? What's the tradeoff here? Do you have other use cases in mind?

AlCutter commented 1 week ago

I worry that this might push personalities towards returning to clients, without returning the index, which is a step away from my understanding of Tessera's philosophy to steer folks in the right direction, and a step closer to SCTs. It seems to me that the problem this is solving for POSIX, is that it's requesting to sequence multiple leaves at a time. Should Tessera simply support this? What's the tradeoff here? Do you have other use cases in mind?

I think that particular concern can be addressed with docs/comments and examples, but do bear in mind that that the only way to check whether there's been an error is also to call the future so I'd be a little surprised if folks just entirely ignored it.

As an aside, supporting adding multiple entries has a bunch of interesting implications for API design and storage contracts, e.g. should the user/storage try to add all entries or none if there's a failure? If it should, then we've just made life very difficult for non-transactional storage implementations and undermined the queuing we currently have. If not, then we also need to figure out some mechanism for returning individual indices and errs for all the entries.

phbnf commented 1 week ago

I think that all your points are valid - and they will persist with futures, but will be carried over to personalities, hence my hesitations. How should a personality react if it receives a tessera.ErrPushback error? What if request A and B gets in, the future for A returns a tessera.ErrPushback, but the future for B goes through?

If futures add risk and complexity to all personalities, except POSIX then I'd say that we need a very good POSIX use case to push the balance towards it, and this is what I'm trying to understand. What's the problem with POSIX reading all entries with a glob statement, and calling Add() sequentially for them, after the previous one has returned? Decreased likelihood that we sequence them simultaneously in the same bundle? Are we aware of any high throughput use case for POSIX today?

AlCutter commented 1 week ago

I think that all your points are valid - and they will persist with futures, but will be carried over to personalities, hence my hesitations.

Could you help me understand what you mean here? Are you talking about the hypothetical multi-add API, or checking errors?

How should a personality react if it receives a tessera.ErrPushback error? What if request A and B gets in, the future for A returns a tessera.ErrPushback, but the future for B goes through?

Is this different from A and B being added in two goroutines?

If futures add risk and complexity to all personalities except POSIX then I'd say that we

I can't see any risk beyond the caller deciding not to bother checking for an error, and if they're going to do that, then I'm not sure there's much we can do.

The complexity overhead for HTTP-like personalities is, literally, (), but for one-shot-like tooling it's the removal of a large amount of concurrency-related complexity, with concomitant wait groups and error handling, etc.

phbnf commented 1 week ago

Could you help me understand what you mean here? Are you talking about the hypothetical multi-add API, or checking errors?

Checking errors.

Is this different from A and B being added in two goroutines?

Running different go routines is a personality decision. Incentivizing them to do so because of Tessera APIs, is a Tessera choice, and extra cognitive load.

I can't see any risk beyond the caller deciding not to bother checking for an error, and if they're going to do that, then I'm not sure there's much we can do. The complexity overhead for HTTP-like personalities is, literally, ()

+1, provided users to everything properly, this is very fine. It's really just two additional characters!

but for one-shot-like tooling it's the removal of a large amount of concurrency-related complexity, with concomitant wait groups and error handling, etc

What's the use case for the one-shot-like tooling?

Before this PR, personalities implementors didn't have to think about futures. After this PR, they will have to understand the concept, and make sure they use them properly. Since we're doing so to simplify of a very specific use case, I think it would be good to document what this use case is for. That's what I don't understand right now, and what make me wonder if it's worth putting additional cognitive complexity on our users.

mhutchinson commented 1 week ago

This is still pre-alpha so I think we should adopt and experiment with this API now. If we learned anything from Trillian v1 it's that hiding the internals from the API was a mistake. In v1 the thing we hid was tiles. Here, we're in danger of hiding an important state transition:

  1. The entry exists only in the personality
  2. The entry is acknowledged by Tessera, but not durably assigned
  3. The entry is durably assigned
  4. The entry is integrated

State 2 here is an important state transition, similar to "positive exchange of flight controls" used by pilots. Exposing a future gives access to state 2, where only returning the index hides it and jumps straight to state 3.

roger2hk commented 1 week ago

Two thoughts on my mind:

  1. What if we provide both sync and async add methods?

  2. Currently there is no way for personality to know whether the entry is integrated or not. Can we add a callback function?

mhutchinson commented 1 week ago
  1. What if we provide both sync and async add methods?

What would the sync version do? Wait until state 3 or state 4 in the description in my last comment?

  1. Currently there is no way for personality to know whether the entry is integrated or not. Can we add a callback function?

This is #193

AlCutter commented 1 week ago

Two thoughts on my mind:

  1. What if we provide both sync and async add methods?

If it were complex to convert between them that might be a good idea, but the sync implementation would be a single line so I don't think it's worthwhile:

func (s Storage) SyncAdd(ctx context.Context, e *Entry) (uint64, error) {
  return s.AsyncAdd(ctx, e)()
 }
  1. Currently there is no way for personality to know whether the entry is integrated or not. Can we add a callback function?

This is relatively straightforward: wait for the checkpoint to have a size larger than the assigned index, but #193 tracks creating a "mixin" for doing that in the personality support packages.

phbnf commented 1 week ago

I think it would make sense for both these APIs to look the same unless there is a specific reason that they don't: either they both return a future, or they both return an index.

mhutchinson commented 1 week ago

I think it would make sense for both these APIs to look the same unless there is a specific reason that they don't: either they both return a future, or they both return an index.

Both of which APIs?

phbnf commented 1 week ago

Tessera's Add() API, and the API offered by the mixin Al is referring to.

roger2hk commented 1 week ago
  1. What if we provide both sync and async add methods?

What would the sync version do? Wait until state 3 or state 4 in the description in my last comment?

  1. Currently there is no way for personality to know whether the entry is integrated or not. Can we add a callback function?

This is #193

The sync version is exactly the one mentioned by @AlCutter. It's indeed a one line wrapper. If we don't provide this method, can we demostrate that in the method doc comment or the unit test?

roger2hk commented 1 week ago

Two thoughts on my mind:

  1. What if we provide both sync and async add methods?

If it were complex to convert between them that might be a good idea, but the sync implementation would be a single line so I don't think it's worthwhile:

func (s Storage) SyncAdd(ctx context.Context, e *Entry) (uint64, error) {
  return s.AsyncAdd(ctx, e)()
 }
  1. Currently there is no way for personality to know whether the entry is integrated or not. Can we add a callback function?

This is relatively straightforward: wait for the checkpoint to have a size larger than the assigned index, but #193 tracks creating a "mixin" for doing that in the personality support packages.

Waiting for the checkpoint to have a size larger than the assigned index is a pull mechanism. With the callback function, that would be a push mechanism. However, there is no personality that would require sending callback URL to the entry submitter (although this is very common in payment system).