bcc-code / developer.bcc.no

BCC Developer Portal with technical documentation and other resources for developers
https://developer.bcc.no/
[ADR] Core APIs database technology #39

Open StevenMalaihollo opened 2 years ago

StevenMalaihollo commented 2 years ago

Two types of databases are considered:

  1. Relational
  2. Document
    • MongoDB
    • Firestore
    • CosmosDB
    • ArangoDB

Considerations

  1. Costs
  2. Security
  3. Governance (offboarding of team members)
  4. Scaling
  5. Community
  6. Learning curve
  7. Redundancy
  8. Development speed (migrations, containerized)
  9. Data integrity
  10. Performance
  11. Querying features

Database Comparison

Relational

Pros

Cons

Document

Pros

Cons

Summary

After lining up the considerations for Relational and Document DBs we come to our final candidates:

  1. Azure SQL Database serverless
  2. MongoDB (Serverless is in preview and not governed)
  3. CosmosDB with Mongo API (Not full support of mongo API, weak data integrity)
  4. Firestore (Weak data integrity, limited querying)
  5. ArangoDB (Not serverless, small community)
  6. CockroachDB, PlanetScale (Not mainstream, lacks governance)
  7. Google Cloud SQL

Decision

  1. For the source of truth APIs we're going to use Google Cloud SQL
    1. Mainstream solution
    2. Forces good data integrity
  2. For the query API, we suggest a document database with strong search and querying capabilities
    1. MongoDB looks like a promising option, but we need more research to be sure
    2. Other options include ArangoDB and Elasticsearch
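
To illustrate the "forces good data integrity" point, here is a minimal sketch of a relational schema rejecting invalid writes. It assumes SQLite (from the Python standard library) as a stand-in for the PostgreSQL or MySQL engines that Cloud SQL offers; the `org` and `person` tables and the `try_insert` helper are hypothetical names used only for this example.

```python
import sqlite3

# SQLite stands in for a Cloud SQL engine here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE org (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE person (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        org_id INTEGER NOT NULL REFERENCES org(id)
    );
""")
conn.execute("INSERT INTO org (id, name) VALUES (1, 'Example Org')")
conn.execute("INSERT INTO person (name, org_id) VALUES ('Alice', 1)")
conn.commit()  # persist the valid rows before trying invalid ones


def try_insert(sql, params):
    """Return None on success, or the constraint error message on rejection."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# A person without a name violates NOT NULL.
missing_name = try_insert(
    "INSERT INTO person (name, org_id) VALUES (?, ?)", (None, 1)
)
# A person pointing at a non-existent org violates the foreign key.
unknown_org = try_insert(
    "INSERT INTO person (name, org_id) VALUES (?, ?)", ("Bob", 999)
)
person_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Both invalid inserts are rejected by the database itself, so only the valid row remains regardless of which application performed the write.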

Consequences

Alternatives

Document databases were not chosen for the source of truth APIs because:

CockroachDB / PlanetScale

github-actions[bot] commented 2 years ago

Remember that ADRs are publicly available, so do not include any confidential information in the issue description! To read more about ADRs, please refer to the documentation.

piotrczyz commented 2 years ago

I'm quite surprised about CosmosDB with Mongo API https://docs.microsoft.com/en-us/azure/cosmos-db/mongodb/mongodb-introduction. Maybe you can consider that for your APIs.

Why wouldn't you use Document database as a source of truth? @StevenMalaihollo

JakubC-projects commented 2 years ago

CosmosDB wasn't chosen because it doesn't fully support the Mongo API; notably, it lacks the ability to define a collection schema. We feel that, for these basic APIs, data integrity is a top priority, so missing that is a big negative. However, if it turns out that serverless SQL doesn't scale well enough, we are ready to switch to a document database.
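
For readers unfamiliar with collection schemas, the sketch below shows the kind of `$jsonSchema` validator that MongoDB accepts for a collection. The `person` collection, its fields, and the `violations` helper are hypothetical; the helper is a deliberately simplified stand-in for server-side validation, not MongoDB's implementation.

```python
# A $jsonSchema validator document; field names are hypothetical.
PERSON_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "org_id"],
        "properties": {
            "name": {"bsonType": "string"},
            "org_id": {"bsonType": "int"},
        },
    }
}

# With pymongo, the validator would be attached roughly like this
# (not executed here, since it needs a running MongoDB server):
#   db.create_collection("person", validator=PERSON_VALIDATOR)

# Simplified local check covering only "required" and two bsonType values.
_PY_TYPES = {"string": str, "int": int}


def violations(doc, validator=PERSON_VALIDATOR):
    """List the ways `doc` breaks the schema (empty list means it passes)."""
    schema = validator["$jsonSchema"]
    problems = [
        f"missing field: {field}"
        for field in schema["required"]
        if field not in doc
    ]
    for field, rule in schema["properties"].items():
        expected = _PY_TYPES[rule["bsonType"]]
        if field in doc and not isinstance(doc[field], expected):
            problems.append(f"wrong type for {field}: expected {rule['bsonType']}")
    return problems
```

Without such a validator on the database side, every client has to enforce these rules itself, which is the data integrity concern raised above.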

andreasgangso commented 2 years ago

Use postgres on azure

andreasgangso commented 2 years ago

Why not google sql?

piotrczyz commented 2 years ago

Can we do the analysis here? https://github.com/bcc-code/bcc-code.github.io/issues/7

andreasgangso commented 2 years ago

Likely to be more performant for simple read/write operations

I think this is false

https://www.enterprisedb.com/news/new-benchmarks-show-postgres-dominating-mongodb-varied-workloads

PostgreSQL 11 was found to be faster than MongoDB 4.0 in almost every benchmark. Throughput was higher, ranging from dozens of percent points up to one and even two orders of magnitude on some benchmarks. Latency, when measured by the benchmark, was also lower on PostgreSQL.

https://info.enterprisedb.com/rs/069-ALB-339/images/PostgreSQL_MongoDB_Benchmark-WhitepaperFinal.pdf

StevenMalaihollo commented 2 years ago

Interesting. Our source for the performance claim was this whitepaper, which compares MongoDB with MySQL rather than PostgreSQL:

image link

JakubC-projects commented 2 years ago

Why not google sql?

After looking at the Azure SQL serverless pricing model, we found it isn't really that good, and the SQL flavor is Microsoft-specific. We would like to use a mainstream SQL flavor (like Postgres or MySQL), so we're going to try GCP's SQL offering (probably the Postgres flavor).

u12206050 commented 2 years ago

I've been using GCP SQL with the MySQL flavor and am very satisfied with it. An instance does cost a bit, but I can run multiple databases on the same instance.

In production I've seen and worked on MySQL databases that are at least 10 times BCC's size in terms of traffic and customers, so as a technology I think it can work very well as the source of truth.

Other databases can be synced/indexed from the source of truth for improved performance.

For example:

  • State and transaction stores: document database (Firestore, MongoDB)
  • Search: Elasticsearch or a dedicated search service
  • Cache: Redis
  • Audit and logging: Elastic or BigQuery
  • Relation-heavy queries: Neo4j (graph database)
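
A rough sketch of the "synced from the source of truth" idea, assuming SQLite as a stand-in for the relational source and a plain Python dict as a stand-in for the document or search store. The tables, the `build_person_documents` function, and the document shape are all hypothetical.

```python
import sqlite3

# Relational source of truth (SQLite used only for illustration).
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE org (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        org_id INTEGER NOT NULL REFERENCES org(id)
    );
    INSERT INTO org VALUES (1, 'Example Org');
    INSERT INTO person VALUES (10, 'Alice', 1), (11, 'Bob', 1);
""")


def build_person_documents(conn):
    """Denormalize person rows into one self-contained document per person."""
    rows = conn.execute("""
        SELECT p.id, p.name, o.id, o.name
        FROM person AS p
        JOIN org AS o ON o.id = p.org_id
        ORDER BY p.id
    """)
    return {
        person_id: {"name": person_name, "org": {"id": org_id, "name": org_name}}
        for person_id, person_name, org_id, org_name in rows
    }


# The dict plays the role of the query-side store; it can be rebuilt
# from the source of truth at any time.
query_store = build_person_documents(source)

# After a change in the source, re-running the projection refreshes the copy.
source.execute("UPDATE org SET name = 'Renamed Org' WHERE id = 1")
refreshed_store = build_person_documents(source)
```

Because the secondary store is derived data, it can be dropped and rebuilt without risk to the source of truth; a real setup would more likely apply incremental updates than full rebuilds.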

On Fri, 1 Apr 2022, 11:26 Jakub Czyż, @.***> wrote:

Why not google sql?

After looking at Azure Serverless SQL pricing model, it isn't really that good, and the SQL flavor is Microsoft specific. We would like to use a mainstream SQL flavor (like Postgres or MySQL). Therefore we're going to try GCP's SQL offer (probably postgres flavor).
