Implement a YugaByteDB datastore

jzelinskie commented 2 years ago

Creating this issue to gauge interest.

shinji62 commented 10 months ago

@jzelinskie It seems almost 10 people could be interested in a Yugabyte datastore.

As a test, I tried running the PostgreSQL tests against YugabyteDB but hit two issues:

ALTER TABLE namespace_config ADD COLUMN id BIGSERIAL PRIMARY KEY failed because YDB does not yet support adding another primary key, since sharding is based on the primary key. I guess that could be solved with a new migration script.

After fixing the previous issue, I get the following error as well:

?       github.com/authzed/spicedb/internal/datastore/postgres/common   [no test files]
?       github.com/authzed/spicedb/internal/datastore/postgres/migrations       [no test files]
?       github.com/authzed/spicedb/internal/datastore/postgres/version  [no test files]
--- FAIL: TestPostgresDatastoreWithoutCommitTimestamps (0.00s)
    --- FAIL: TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8 (23.89s)
        postgres.go:113: 
                Error Trace:    /Users/gwenn/work/github/spicedb/internal/testserver/datastore/postgres.go:113
                                                        /Users/gwenn/work/github/spicedb/internal/datastore/postgres/postgres_test.go:200
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:33
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/namespace.go:38
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:72
                Error:          Received unexpected error:
                                unable to compute head revision: multiple or zero head revisions found: [add-rel-by-alive-resource-relation-subject add-unique-datastore-id]
                Test:           TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8
        --- FAIL: TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8/TestNamespaceNotFound (5.08s)
            testing.go:1490: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test
--- FAIL: TestPostgresDatastore (0.00s)
    --- FAIL: TestPostgresDatastore/postgres-13.8-head- (26.66s)
        postgres.go:113: 
                Error Trace:    /Users/gwenn/work/github/spicedb/internal/testserver/datastore/postgres.go:113
                                                        /Users/gwenn/work/github/spicedb/internal/datastore/postgres/postgres_test.go:75
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:33
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/namespace.go:38
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:72
                Error:          Received unexpected error:
                                unable to compute head revision: multiple or zero head revisions found: [add-rel-by-alive-resource-relation-subject add-unique-datastore-id]
                Test:           TestPostgresDatastore/postgres-13.8-head-
        --- FAIL: TestPostgresDatastore/postgres-13.8-head-/TestNamespaceNotFound (5.28s)
            testing.go:1490: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test
FAIL
FAIL    github.com/authzed/spicedb/internal/datastore/postgres  27.739s
FAIL
Error: running "go ./internal/datastore/postgres/..." failed with exit code 1
exit status 1

For that one, I am not sure what the issue is.
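One way to read that error, as a rough sketch: assume the migration manager treats migrations as a chain in which each migration names the one it replaces, so the head is the single migration that nothing else replaces. If a migration points at the wrong predecessor, two migrations end up unreplaced and there is no unique head. The `head_revisions` helper, the `initial` migration name, and the predecessor relationships below are hypothetical and invented for illustration; they are not taken from SpiceDB's migration files.

```python
def head_revisions(migrations: dict[str, str]) -> list[str]:
    """Return every migration that no other migration replaces.

    `migrations` maps a migration name to the name it replaces
    ("" for the first migration in the chain).
    """
    replaced = set(migrations.values())
    return sorted(name for name in migrations if name not in replaced)


# A well-formed chain: each migration replaces the previous one.
linear = {
    "initial": "",
    "add-unique-datastore-id": "initial",
    "add-rel-by-alive-resource-relation-subject": "add-unique-datastore-id",
}

# Hypothetical fork: two migrations both claim to replace "initial",
# so neither of them is replaced by anything.
forked = {
    "initial": "",
    "add-unique-datastore-id": "initial",
    "add-rel-by-alive-resource-relation-subject": "initial",
}

print(head_revisions(linear))
print(head_revisions(forked))
```

Under that assumption, the two names listed in the error would be the two unreplaced migrations, which suggests checking which predecessor each migration declares after the migration script was edited.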

I think creating a YDB datastore shouldn't be that hard, as the compatibility with PG is high.

vroldanbet commented 10 months ago

@shinji62 We don't think implementing Yugabyte would be difficult; however, adding a new datastore needs to be carefully analyzed. Each datastore has its own quirks and limitations, and evolving SpiceDB with N datastores underlying it adds considerable engineering and maintenance overhead. Just deciphering the mysteries of each database's query planner takes a non-trivial amount of time, and then there is making sure performance does not regress as the service on top evolves, migrations, testing with multiple versions, accounting for the specifics of each datastore client configuration, tuning...

It's not difficult, but it does not come for free.

shinji62 commented 10 months ago

@vroldanbet First, an apology: I wasn't trying to underestimate the work. I fully agree with what you say, it does not come for free. For the "client configuration / tuning and so on", I would be more than happy to help.

Thanks for the quick answer.

vroldanbet commented 10 months ago

@shinji62 no need to apologize! It was a good opportunity to shed some light on what it takes to add a new datastore. We maintainers haven't done a good job of clarifying what it takes to support a new database technology in SpiceDB.

We are certainly keeping an eye on database technologies that align with SpiceDB requirements (strong consistency, global distribution, horizontal scalability and exposed MVCC semantics). I'm not familiar with how Yugabyte supports these, do you have some insight?

shinji62 commented 10 months ago

@vroldanbet let me respond to your question

Overall, YugabyteDB is a distributed SQL database that provides two APIs, one PostgreSQL-compatible (almost 100%, we use the PostgreSQL codebase) and one Cassandra-compatible, on top of a distributed storage layer that handles automatic sharding, distributed transactions, and so on. YDB is a CP database with strong HA.

strong consistency: This is one of the core features. As we are a CP DB, we do have strong consistency and transactional support for both the PostgreSQL API and the Cassandra one.

global distribution: Not sure what you mean by global distribution, but we do have geo replication and geo distribution, for example pinning certain data to a region for compliance.

horizontal scalability: YDB is a shared-nothing, distributed DB, so scale-out is one of the core features as well.

exposed MVCC semantics: Not sure what you mean by that.

vroldanbet commented 10 months ago

@shinji62

strong consistency: This is one of the core features. As we are a CP DB, we do have strong consistency and transactional support for both the PostgreSQL API and the Cassandra one.

Sounds good. Just to make sure we are talking about the same thing, because those terms tend to be overloaded: I'm referring to the strict serializability isolation level and external consistency. I emphasize this because SpiceDB requires it and, for example, CockroachDB, being an open-source implementation of Spanner, has some caveats on its isolation/consistency guarantees that are problematic for SpiceDB.

global distribution: Not sure what you mean by global distribution, but we do have geo replication and geo distribution, for example pinning certain data to a region for compliance.

My apologies for using such a vague term. Geo replication and geo distribution do not necessarily describe with precision how the system behaves when writes are distributed across the globe.

SpiceDB requires a database capable of providing the strongest levels of isolation and consistency when distributed globally. It means providing the same strong guarantees but with multiple reads and writes distributed worldwide. Single-node architectures like Postgres/MySQL are out of the equation here, and if my recollection of Cassandra is correct, reaching consensus of N nodes distributed around the world will likely be very slow, but I'm not sure that's how YugaByte works. Not that Spanner and CockroachDB are blazing fast in this regard, but they are designed with that use-case in mind.

exposed MVCC semantics: Not sure what you mean by that.

The database should be able to return query results at a given snapshot. This is fundamental to SpiceDB's bounded staleness, which is the trick to make it scale. While this is not a hard requirement, the bookkeeping needed to layer a snapshotting system on top of the database is additional overhead.
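To make "query results at a given snapshot" concrete, here is a minimal sketch of the kind of bookkeeping involved when a database does not expose it natively. The `VersionedStore` class and its revision counter are hypothetical and only illustrate the idea; this is not how SpiceDB or any particular datastore implements it.

```python
import bisect


class VersionedStore:
    """Toy MVCC store: every write is kept together with its commit revision."""

    def __init__(self) -> None:
        self._revision = 0
        # key -> parallel lists of commit revisions (ascending) and values
        self._revs: dict[str, list[int]] = {}
        self._vals: dict[str, list[object]] = {}

    def write(self, key: str, value: object) -> int:
        """Store a new version of `key` and return its commit revision."""
        self._revision += 1
        self._revs.setdefault(key, []).append(self._revision)
        self._vals.setdefault(key, []).append(value)
        return self._revision

    def read_at(self, key: str, revision: int):
        """Return the value visible at `revision`, or None if none existed yet."""
        revs = self._revs.get(key, [])
        idx = bisect.bisect_right(revs, revision)
        return self._vals[key][idx - 1] if idx else None


store = VersionedStore()
r1 = store.write("doc:1#viewer", {"alice"})
r2 = store.write("doc:1#viewer", {"alice", "bob"})

print(store.read_at("doc:1#viewer", r1))  # the older snapshot
print(store.read_at("doc:1#viewer", r2))  # the newer snapshot
```

A read pinned to `r1` keeps returning the older value after later writes, which is the property a bounded-staleness read relies on.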

shinji62 commented 9 months ago

Thanks @vroldanbet

Yugabyte supports the same isolation levels as PG. As for external consistency, or I guess linearizability, we support it for single-row transactions but not for multi-row transactions, which support three isolation levels: Serializable, Snapshot (also known as repeatable read), and Read Committed.

Reaching consensus of N nodes distributed around the world will likely be very slow, but I'm not sure that's how YugaByte works. Not that Spanner and CockroachDB are blazing fast in this regard, but they are designed with that use-case in mind.

I think that is quite similar to CockroachDB or Spanner; it mostly depends on the latency between the nodes.

vroldanbet commented 9 months ago

Thanks for the info @shinji62 🙏🏻

jsco2t commented 6 months ago

Would also like to see this supported. We were looking to use Authzed with Yugabyte as the database provider.

sameer-m-dev commented 4 months ago

Hi team, are we planning on adding Yugabyte support for SpiceDB? We are also looking to use Authzed with Yugabyte as the database provider

vroldanbet commented 4 months ago

@sameer-m-dev there are no plans to add Yugabyte support, and I think it wouldn't be feasible without multi-row linearizable transactions.

FranckPachot commented 4 months ago

Hi, developer advocate for YugabyteDB here, trying to understand the consistency requirement.

"The database should be able to return query results at a given snapshot": YugabyteDB fully supports this through MVCC and SQL isolation levels.
"multi-row linearizable transactions": YugabyteDB guarantees linearizability for single-row transactions. However, it doesn't provide "strong serializability", aka what Spanner calls "external consistency", for transactions that haven't exchanged messages between the concerned nodes. This is because physical clocks can be skewed, and logical clocks are synchronized only when nodes exchange messages (Lamport). It is more of a hardware limitation: without atomic clocks, we cannot do the same as Spanner (wait out the maximum clock skew on each write).

The anomaly can happen when A and B run transactions without any objects in common, and A and B also communicate outside of the database to learn which one committed first; that order may differ from the commit order in the database. Is that a problem for SpiceDB? If it is, we have to wait until atomic clocks are more common.
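The anomaly can be sketched with two nodes whose physical clocks disagree. This is a deliberately simplified model: the `Node` class, the skew values, and the integer time units are all hypothetical, and the sketch ignores hybrid logical clocks entirely, which is fair here only because the two transactions share no objects and the nodes exchange no messages.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    skew: int  # offset of this node's clock from real time (hypothetical units)

    def commit_timestamp(self, real_time: int) -> int:
        """Timestamp the node assigns to a commit happening at `real_time`."""
        return real_time + self.skew


node_a = Node("A", skew=5)  # clock runs ahead of real time
node_b = Node("B", skew=0)

# Real-time order: A commits at t=10, tells B out of band, B commits at t=12.
ts_a = node_a.commit_timestamp(10)
ts_b = node_b.commit_timestamp(12)


def visible(snapshot_ts: int) -> list[str]:
    """Commits a snapshot read at `snapshot_ts` would observe."""
    commits = {"A": ts_a, "B": ts_b}
    return sorted(name for name, ts in commits.items() if ts <= snapshot_ts)


print(ts_a, ts_b)   # database order puts B before A
print(visible(13))  # a snapshot that contains B but not A
```

A snapshot taken between the two timestamps observes B's write without A's, even though A committed first in real time.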

vroldanbet commented 4 months ago

Hey @FranckPachot, thanks for chiming in.

Yes, external consistency is a requirement for SpiceDB, which follows the Zanzibar paper, and Zanzibar was built on top of Spanner. Changes to unrelated parts of the graph can impact how results are computed, so without external consistency the system couldn't guarantee that it addresses what the Zanzibar paper refers to as the "new enemy problem". We've engineered the implementation of SpiceDB on top of each datastore taking into account which consistency guarantees the database offers.

FYI CockroachDB does support external consistency, and they do it without atomic clocks, so I don't think atomic clocks are a prerequisite, though they certainly help. CockroachDB's implementation is limited to rows in the same range, though, so we had to come up with some clever tricks to work around it.

FranckPachot commented 4 months ago

Ok, thanks. The tricks for CockroachDB should work the same for YugabyteDB. "Same range" means single-shard.

vroldanbet commented 4 months ago

@FranckPachot does that mean that Yugabyte supports external consistency for one or more rows as long as they fall under the same shard? (Which makes sense)

FranckPachot commented 4 months ago

@vroldanbet, I checked with colleagues to be sure. Yes, it doesn't violate external consistency if they fall into the same shard. And falling into the same shard is guaranteed with hash sharding. For example with

create table t ( a int, b int, c int, primary key ( a hash, b asc) );

all rows with the same values for a will go into the same shard.
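As a rough illustration of why rows sharing the same value of a land together: with hash sharding, the shard is a function of the hash column only, so b and c play no part in placement. The sketch below uses SHA-256 and a fixed shard count purely as stand-ins; `shard_for` and `NUM_SHARDS` are hypothetical, and this is not YugabyteDB's actual hash function or tablet layout.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(a: int) -> int:
    """Map the hash-sharded column `a` to a shard.

    SHA-256 is used only for illustration; it is not the hash function
    YugabyteDB uses.
    """
    digest = hashlib.sha256(str(a).encode()).digest()
    return int.from_bytes(digest[:2], "big") % NUM_SHARDS


rows = [(1, 10, 100), (1, 20, 200), (2, 10, 300)]  # (a, b, c)

# Placement depends on `a` alone, so both rows with a=1 share a shard.
placement = {(a, b): shard_for(a) for a, b, _ in rows}

for key, shard in placement.items():
    print(key, "-> shard", shard)
```

Rows with different values of a may or may not share a shard; only equal values of a are guaranteed to.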

authzed / spicedb

Implement a YugaByteDB datastore #427