openfga / community

The Community repository is the place to go for OpenFGA support
https://openfga.dev
Apache License 2.0
27 stars 32 forks source link

[Request] Allow non-transactional writes (make WriteRequest idempotent) #265

Closed MidasLamb closed 6 months ago

MidasLamb commented 2 years ago

It would be nice that sending the same write request twice would result in the same outcome. However for the second call, the SDK will return an error saying that

cannot write a tuple which already exists: <tuple>

However, as a user I'd expect this call to succeed as the desired end state is that the tuple exists, so if it already exists before I make the call, this is essentially a no-op to end up in the desired end state, rather than a failure/error.

craigpastro commented 2 years ago

Hi @MidasLamb! The reason for this behaviour at the moment is due to the watch API. The watch API is meant to be an ordered history of tuples that have been added and deleted. The first time you add a tuple it will be reflected in the watch API. What should happen the second time the same tuple is added? It probably shouldn't be added to the watch API again as it is not really being added to OpenFGA again, but if you think it is the first time you are adding it, then you may expect to see it in the watch API. I guess this is why we don't allow it at the moment. Also, a similar case with delete.

We do know that this is probably not the most ideal solution, and you are not the first one to raise this issue to be sure. We are discussing alternatives internally and would love to discuss more. If you have any further ideas or thoughts we would be happy to hear them. And, of course, we will update this issue with any ideas of our own.

Thank you!

craigpastro commented 2 years ago

Hi @MidasLamb. Just following up here. We decided to go ahead with what you suggested by make a duplicate write a noop. No changes to the tuple table, nor the changelog table, and no error.

If you have a sec, could you please confirm whether the MySQL implementation looks sane? I went with the “on duplicate key update store=store” approach. Thanks!

By the way, what do you think about the same behaviour for deletes. Currently, if you try to delete a tuple that does not exist you get an error. Perhaps a noop here too?

MidasLamb commented 1 year ago

@craigpastro , I've checked the PR and it looks good!

I think for deletes having a noop would be a good approach as well. From a DX point of view, most devs would be concerned with the state after their API calls. I'd also consider the watch API to be for the "more advanced user", so moving the "caveats" there makes more sense IMO

craigpastro commented 1 year ago

Thanks for taking a look, @MidasLamb!

aaguiarz commented 1 year ago

Hi Midas!

After some discussion, we decided to not to move forward with this change for now. This change might have an adverse effect depending on the underlying database and how it manages transactions.

While we decided it to shelve it for now, in the future we’d like to work on an alternative that would provide a better dev experience without removing the consistency protections this currently offers. Additionally, while we think that this change is very useful when you are evaluating the product and writing unit tests, but it’s not clear to us yet how valuable it is for production applications.

Let us know how problematic you find that.

Do you feel this will be an issue for you in production?

MidasLamb commented 1 year ago

Hey @aaguiarz, I think for the production applications, having that logic for ignoring duplicates could help with reducing latencies. Currently if you want to ignore duplicate writes, you're forced to something along the lines of:

rhamzeh commented 1 year ago

Hey @MidasLamb!

We thought we'd expand a bit about our reasoning for dropping this for now.

In distributed environments, this provides a small consistency check for the user.

Imagine you have a single Postgres DB serving multiple OpenFGA servers. Take the below scenarios, from an app perspective it is sending the data in the same order, the only difference is the order they get written in internally.

The app issues 4 calls:

  1. Write(user=user:anne, relation=reader, object=resource:roadmap)
  2. Delete(user=user:anne, relation=reader, object=resource:roadmap)
  3. Write(user=user:anne, relation=reader, object=resource:roadmap)
  4. Check(user=user:anne, relation=reader, object=resource:roadmap)

What is the expected response?

With idempotent writes, the answer is: It depends, based on the ordering, latency between the application, the OpenFGA server and the OpenFGA DB

(A)

sequenceDiagram
    participant A1 as Application
    participant S1 as OpenFGA Node 1
    participant S2 as OpenFGA Node 2
    participant D1 as OpenFGA Postgres DB
    A1->>S1: 1. Write(user=user:anne, relation=reader, object=resource:roadmap)
    S1->>D1: 1a. INSERT ...;
    D1->>S1: 1b. OK
    A1->>S1: 2. Delete(user=user:anne, relation=reader, object=resource:roadmap)
    S1->>D1: 2a. DELETE ... WHERE ...;
    D1->>S1: 2b. OK
    A1->>S2: 3. Write(user=user:anne, relation=reader, object=resource:roadmap)
    S2->>D1: 3a. INSERT ...;
    D1->>S2: 3b. OK
    A1->>S1: 4. Check(user=user:anne, relation=reader, object=resource:roadmap)

(B)

sequenceDiagram
    participant A1 as Application
    participant S1 as OpenFGA Node 1
    participant S2 as OpenFGA Node 2
    participant D1 as OpenFGA Postgres DB
    A1->>S1: 1. Write(user=user:anne, relation=reader, object=resource:roadmap)
    S1->>D1: 1a. INSERT ...;
    D1->>S1: 1b. OK
    A1->>S1: 2. Delete(user=user:anne, relation=reader, object=resource:roadmap)
    A1->>S2: 3. Write(user=user:anne, relation=reader, object=resource:roadmap)
    S2->>D1: 3a. INSERT ...;
    S1->>D1: 2a. DELETE ... WHERE ...;
    D1->>S2: 3b. OK
    D1->>S1: 2b. OK
    A1->>S1: 4. Check(user=user:anne, relation=reader, object=resource:roadmap)

From the app perspective, it has no way of knowing whether situation (A) or (B) happened.

By enforcing transactional writes, if (3) happened before (2) (Scenario B), the user will get an error. They can retry if this was intentional, but the result itself will be more consistent.

Now does this protect against all of the consistency issues in OpenFGA? No. There's still several places where this situation may be encountered, but at least it does provide a bit of extra protection.

So allowing idempotent writes could cause issues for some users. So why not make it configurable?

Does that help a bit in understanding our thought process around this?

nediamond commented 9 months ago

i believe the zanzibar paper mentions a touch (as opposed to insert) operation - maybe this is a good way to frame it?

nediamond commented 9 months ago

what about this: adding an additional param touches to the RelationshipTuples /write api? which logically will behave very similar to writes (executing afterwards?) but terminating right around here except building a query that has ON CONFLICT DO NOTHING (maybe only possible in pg?)

evankanderson commented 9 months ago

I just ran across the same thing, and speed-reading through the issue, it looks like this part of what zookies are intended to avoid. In OpenFGA, this would look like returning a concurrency token associated with the object which can be stored in the application data store for each Write or Delete call, and then the app would pass the concurrency token with each Check call; calls which don't reference the latest concurrency token indicate that there may have been an in-flight update, and should return a retriable error.

In your example A, the concurrency token would be from 3b., while in example B, the concurrency token would be from transaction 2b. If a Check request came in with the 3b concurrency token, it would be allowed in example A, while being rejected as "stale token" in example B. For existing and naive users who don't care about the "new enemy" problem, an empty concurrency token would be equivalent to "don't care, read latest", and would be non-deterministic in the case of racing writes and deletes.