dbpunk-labs / db3

a Lightweight, Permanent JSON document database
https://docs.db3.network/
Apache License 2.0

WIP: propose the document level ownership #271

Closed imotai closed 1 year ago

imotai commented 1 year ago

Motivations

  1. solve the problem of data exchange between accounts

Solution

image

jingchen2222 commented 1 year ago

Discussion Kick off

jingchen2222 commented 1 year ago

Design Doc: Document Database Storage

Background

Goal:

  1. solve the problem of data exchange between accounts
  2. record level permission management.
  3. design primary key generator (logic, size limit, order)
  4. design index key generator
  5. support CreateDocument/UpdateDocument/DeleteDocument API based on KV operations
  6. key optimization (future work)

Issue:

[propose the document level ownership #271](https://github.com/dbpunk-labs/db3/issues/271)

Structure

Database → Collection → Document

flowchart LR

A[Developers] --> B[database]
B --> C[collection 1]
C --> E[document 1]
C --> F[document 2]
C --> G[document 3]
B --> D[collection 2]
D --> H[document 4]
D --> I[document 5]
J[User] --> |"(Document, Signature, PublicKey)"|D

ID

| ID Type | Size (bytes) | Description |
|---|---|---|
| DbId | 20 | DbId := DB3Address |
| TxId | 32 | Transaction ID |
| AccountId | 20 | AccountId := DB3Address |
| BlockID | 8 | Current block height |
| MutationID | 4 | Index of the mutation in the current block |
| PutEntryIdx | 4 | Index of the document in the current database mutation |
| PutEntryID | 16 | PutEntryID := BlockID + MutationID + PutEntryIdx |
| CollectionID | 16 | CollectionID := PutEntryID |
| DocumentEntryID | 16 | DocumentEntryID := PutEntryID |
| DocumentID | 33 | DocumentID := DocumentEntryType + CollectionID + DocumentKeyID |
| IndexFieldID | 4 | Index of the index field in the collection |
| IndexID | KeyBytes + 54 | IndexID := IndexEntryType + CollectionID + IndexFieldID + KeyBytes + DocumentID |
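
The byte layouts in the table can be sketched as plain concatenation. This is only an illustration under stated assumptions: the one-byte type tag values, the big-endian integer encoding, and the reading of DocumentKeyID as a 16-byte DocumentEntryID (which is what makes the sizes add up to 33 and KeyBytes + 54) are not confirmed by the thread, and the function names are hypothetical.

```python
import struct

# Hypothetical one-byte type tags; the thread does not give their values.
DOCUMENT_ENTRY_TYPE = b"\x01"
INDEX_ENTRY_TYPE = b"\x02"


def put_entry_id(block_id: int, mutation_id: int, put_entry_idx: int) -> bytes:
    """BlockID (8 bytes) + MutationID (4) + PutEntryIdx (4) = 16 bytes."""
    # Big-endian is an assumption: it makes byte order match numeric order.
    return struct.pack(">QII", block_id, mutation_id, put_entry_idx)


def document_id(collection_id: bytes, document_entry_id: bytes) -> bytes:
    """Type tag (1) + CollectionID (16) + DocumentEntryID (16) = 33 bytes."""
    if len(collection_id) != 16 or len(document_entry_id) != 16:
        raise ValueError("collection and entry IDs must be 16 bytes each")
    return DOCUMENT_ENTRY_TYPE + collection_id + document_entry_id


def index_id(collection_id: bytes, index_field_id: int,
             key_bytes: bytes, doc_id: bytes) -> bytes:
    """Type tag (1) + CollectionID (16) + field index (4) + key + DocumentID (33)."""
    return (INDEX_ENTRY_TYPE + collection_id
            + struct.pack(">I", index_field_id) + key_bytes + doc_id)


collection = put_entry_id(block_id=10, mutation_id=0, put_entry_idx=0)
doc = document_id(collection, put_entry_id(11, 2, 5))
idx = index_id(collection, 1, b"alice", doc)
print(len(collection), len(doc), len(idx))  # 16 33 59
```

With a big-endian encoding, IDs created in later blocks or later mutations sort after earlier ones, which is one way to get the ordering mentioned in the goals.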

Document serialization

  1. rust https://docs.rs/bson/latest/bson/
  2. typescript https://github.com/mongodb/js-bson

Database & Collection Op

https://github.com/dbpunk-labs/db3/pull/297

Document OP

1. CreateDocument

sequenceDiagram
    Client->>+Node: 1.1 CreateDocument(document)
    Node->>+DocumentImpl: 1.2 SubmitMutation(WRITE_DOCUMENT, document)
    Node-->>-Client: 1.3 Response(status, msg)
    DocumentImpl->>DocumentImpl: 2.1 documentId = GenerateDocumentId(blockId, autoId)
    DocumentImpl->>+KVStore: 2.2 InsertKV(documentId, txid, document)
    KVStore-->>-DocumentImpl: 2.3 Response(status, msg)
    DocumentImpl-->>DocumentImpl: 2.4 index_key_pairs = GetIndexKeyPairs(db, collection, document)
    loop Create Indexes: index_key_pairs.foreach
    DocumentImpl-->>DocumentImpl: 2.5.1 GenerateIndexId(db, collection, documentId, index, key)
    DocumentImpl->>+KVStore: 2.5.2 InsertKV(indexId, txid, documentId)
    KVStore-->>-DocumentImpl: 2.5.3 Response(status, msg)
    end
    DocumentImpl-->>-Node: 2.6 Response(status, msg, documentId)
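
The DocumentImpl side of this flow (steps 2.1 to 2.5) can be sketched against an in-memory store. This is a rough illustration, not db3's implementation: `KVStore`, `create_document`, the type tag bytes, and the string encoding of index keys are all hypothetical, and the document ID follows the layout from the ID table rather than `GenerateDocumentId(blockId, autoId)`.

```python
import struct


class KVStore:
    """In-memory stand-in for the node's key-value store."""

    def __init__(self):
        self.data = {}

    def insert(self, key: bytes, txid: bytes, value):
        self.data[key] = (txid, value)


def create_document(store, collection_id, entry_id, txid, document, indexed_fields):
    # Step 2.1: derive the document ID (type tag + CollectionID + entry ID).
    doc_id = b"\x01" + collection_id + entry_id
    # Step 2.2: store the document under its ID.
    store.insert(doc_id, txid, document)
    # Steps 2.4-2.5: one index entry per indexed field present in the document.
    for field_idx, field in enumerate(indexed_fields):
        if field not in document:
            continue
        key = str(document[field]).encode()  # assumed key encoding
        index_id = b"\x02" + collection_id + struct.pack(">I", field_idx) + key + doc_id
        store.insert(index_id, txid, doc_id)
    return doc_id


store = KVStore()
collection_id = bytes(16)
entry_id = struct.pack(">QII", 1, 0, 0)
doc_id = create_document(store, collection_id, entry_id, b"tx1",
                         {"name": "alice", "age": 30}, indexed_fields=["name"])
print(len(store.data))  # 2: one document entry and one index entry
```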

2. UpdateDocument

sequenceDiagram
    Client->>+Node: 1.1 UpdateDocument(documentId, document)
    Node->>+DocumentImpl:1.2 SubmitMutation(UpdateDocument, documentId, document)
    Node-->>-Client: 1.3 Response(status, msg)
    DocumentImpl->>+KVStore: 2.1 GetKV(documentId)
    KVStore-->>-DocumentImpl: 2.2 Response(status, msg, address, old_document)
    DocumentImpl-->>DocumentImpl: 2.3 CheckOwnership(Signature, PublicKey, address)
    DocumentImpl-->>DocumentImpl: 2.4 modified_index_key_pairs = GetModifiedIndexKeyPairs(db, collection, old_document, document)
    DocumentImpl->>+KVStore: 2.5 Update document: InsertKV(documentId, txid, document)
    KVStore-->>-DocumentImpl: 2.6 Response(status, msg)
    DocumentImpl->>DocumentImpl: 2.7 (indexes, keys) = CheckModifiedIndexs(document, index_keys)
    loop Update Indexes: modified_index_key_pairs.foreach(index, key)
    DocumentImpl-->>DocumentImpl: 2.8.1 indexId = GenerateIndexId(db, collection, index, key, documentId)
    DocumentImpl->>+KVStore: 2.8.2 InsertKV(indexId, txid, documentId)
    KVStore-->>-DocumentImpl: 2.8.3 Response(status, msg)
    end
    DocumentImpl-->>-Node: 2.9 Response(status, msg)
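
A minimal sketch of the ownership check and the modified-index computation (steps 2.1 to 2.5) follows. It assumes the store maps a document ID to an `(owner_address, document)` pair and that signature verification has already produced the sender's address; both are simplifications, and all function names are hypothetical.

```python
def check_ownership(sender_address: bytes, owner_address: bytes) -> bool:
    # Stand-in for step 2.3. Real signature and public key verification
    # is assumed to have happened before this point.
    return sender_address == owner_address


def modified_index_key_pairs(old_doc: dict, new_doc: dict, indexed_fields: list) -> list:
    """Return (field, old_value, new_value) for each indexed field that changed."""
    pairs = []
    for field in indexed_fields:
        old_value, new_value = old_doc.get(field), new_doc.get(field)
        if old_value != new_value:
            pairs.append((field, old_value, new_value))
    return pairs


def update_document(store: dict, doc_id: bytes, sender: bytes,
                    new_doc: dict, indexed_fields: list) -> list:
    owner, old_doc = store[doc_id]                      # steps 2.1-2.2
    if not check_ownership(sender, owner):              # step 2.3
        raise PermissionError("sender does not own this document")
    changed = modified_index_key_pairs(old_doc, new_doc, indexed_fields)  # step 2.4
    store[doc_id] = (owner, new_doc)                    # step 2.5
    return changed  # the caller would write one index entry per changed pair


store = {b"d1": (b"alice", {"name": "a", "age": 1})}
changed = update_document(store, b"d1", b"alice", {"name": "a", "age": 2}, ["name", "age"])
print(changed)  # [('age', 1, 2)]
```

Returning the old value alongside the new one is a design choice of this sketch: the diagram only shows new index entries being inserted, so removing the stale entry for the old key would be an extra step.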

3. DeleteDocument

sequenceDiagram
    Client->>+Node: 1.1 DeleteDocument(documentId)
    Node->>+DocumentImpl:1.2 SubmitMutation(DeleteDocument, documentId)
    Node-->>-Client: 1.3 Response(status, msg)
    DocumentImpl->>+KVStore: 2.1 GetKV(documentId)
    KVStore-->>-DocumentImpl: 2.2 Response(status, msg, index_keys, txid)
    DocumentImpl-->>DocumentImpl: 2.3 CheckOwnership(Signature, PublicKey, txid)
    DocumentImpl->>+KVStore: 2.4 Delete document: DeleteKV(documentId)
    KVStore-->>-DocumentImpl: 2.5 Response(status, msg)
    DocumentImpl-->>DocumentImpl: 2.6 index_key_pairs = GetIndexKeyPairs(db, collection, document)
    loop Delete Indexes:  index_key_pairs.foreach(index, key)
    DocumentImpl-->>DocumentImpl: 2.7.1 GenerateIndexId(db, collection, index, key, documentId)
    DocumentImpl->>+KVStore: 2.7.2 DeleteKV(indexId, txid)
    KVStore-->>-DocumentImpl: 2.7.3 Response(status, msg)
    end
    DocumentImpl-->>-Node: 2.8 Response(status, msg)
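
The delete flow can be sketched the same way: check ownership, remove the document, then regenerate and remove each index entry. The store layout (document ID to `(owner, document)`, index ID to document ID), the type tag byte, and the key encoding are assumptions made for illustration only.

```python
import struct


def index_id(collection_id: bytes, field_idx: int, key: bytes, doc_id: bytes) -> bytes:
    # Hypothetical layout: type tag + CollectionID + field index + key + DocumentID.
    return b"\x02" + collection_id + struct.pack(">I", field_idx) + key + doc_id


def delete_document(store: dict, collection_id: bytes, doc_id: bytes,
                    sender: bytes, indexed_fields: list) -> None:
    owner, document = store[doc_id]                     # steps 2.1-2.2
    if sender != owner:                                 # step 2.3 (simplified)
        raise PermissionError("sender does not own this document")
    del store[doc_id]                                   # step 2.4
    for field_idx, field in enumerate(indexed_fields):  # steps 2.6-2.7
        if field in document:
            key = str(document[field]).encode()         # assumed key encoding
            store.pop(index_id(collection_id, field_idx, key, doc_id), None)


collection_id = bytes(16)
doc_id = b"\x01" + collection_id + bytes(16)
store = {
    doc_id: (b"owner", {"name": "alice"}),
    index_id(collection_id, 0, b"alice", doc_id): doc_id,
}
delete_document(store, collection_id, doc_id, b"owner", ["name"])
print(store)  # {}
```

Because the index entries are recomputed from the stored document, the document has to be read before it is deleted.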

4. GetDocument

sequenceDiagram
    Client->>+Node: 1.1 GetDocumentById(documentId)
    Node->>+DocumentImpl: 1.2 GetDocumentById(documentId)
    DocumentImpl->>+KVStore: 1.3 BatchGetKey(documentId)
    KVStore-->>-DocumentImpl: 1.4 Response(status, msg, document)
    DocumentImpl-->>-Node: 1.5 Response(status, msg, document)
    Node-->>-Client: 1.6 Response(status, msg, document)
    Client->>+Node: 2.1 GetDocuments(index, key)
    Node->>+DocumentImpl: 2.2 GetDocuments(index, key)
    DocumentImpl->>DocumentImpl: 2.3 index_range = GenerateIndexRange(db, collection, index, key)
    DocumentImpl->>+KVStore: 2.4 GetRange(index_range)
    KVStore-->>-DocumentImpl: 2.5 Response(status, msg, documents)
    DocumentImpl-->>-Node: 2.6 Response(status, msg, documents)
    Node-->>-Client: 2.7 Response(status, msg, documents)
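
The index lookup (GenerateIndexRange followed by GetRange) amounts to a prefix scan over sorted keys. The sketch below assumes index entries are laid out as type tag + CollectionID + field index + key + 33-byte DocumentID and that each entry's value is the document ID. The function names and the in-memory dict are hypothetical.

```python
import struct
from bisect import bisect_left

DOCUMENT_ID_LEN = 33


def index_prefix(collection_id: bytes, field_idx: int, key: bytes) -> bytes:
    return b"\x02" + collection_id + struct.pack(">I", field_idx) + key


def get_documents(store: dict, collection_id: bytes, field_idx: int, key: bytes) -> list:
    prefix = index_prefix(collection_id, field_idx, key)
    keys = sorted(store)                   # a real KV store keeps keys ordered
    start = bisect_left(keys, prefix)      # first key >= prefix
    documents = []
    for k in keys[start:]:
        if not k.startswith(prefix):
            break                          # left the range for this key
        # Keys are variable length, so "al" is also a prefix of "alice".
        # The fixed DocumentID suffix length separates exact matches.
        if len(k) == len(prefix) + DOCUMENT_ID_LEN:
            documents.append(store[store[k]])
    return documents


cid = bytes(16)
d1, d2, d3 = (b"\x01" + cid + bytes(15) + bytes([n]) for n in (1, 2, 3))
store = {d1: {"name": "al"}, d2: {"name": "alice"}, d3: {"name": "al"}}
for doc_id in (d1, d2, d3):
    name = store[doc_id]["name"].encode()
    store[index_prefix(cid, 0, name) + doc_id] = doc_id

print(get_documents(store, cid, 0, b"al"))  # [{'name': 'al'}, {'name': 'al'}]
```

The length check is one way to handle variable-length keys; a length-prefixed or terminated key encoding would be another.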

Implementation

jingchen2222 commented 1 year ago

Discussion

@imotai

1. What is the relationship between collections and documents: 1:N or N:M?

Regarding the RPC proto definition mentioned here, can one document belong to multiple collections? Will we define and implement collections in the same way?

2. Will we support Project, Namespace, and Collection Group in db3?

Those concepts are defined and used in Google Firestore.

imotai commented 1 year ago

  1. 1:N
  2. db3 supports only database, collection, and document.

jingchen2222 commented 1 year ago

RPC Interface

DB3 will design and implement a set of APIs similar to Google Firestore's, so that developers who are familiar with Firestore can move to DB3 easily.

Reference:

DB3 Admin API

This API provides several administrative services for DB3 store.

P0

| API | Description |
|---|---|
| CreateDatabase | Creates a database and links it to an address. |
| CreateCollection | Creates a collection and defines its indexes. |
| ListIndexes | Lists composite indexes. |
| GetIndex | Gets a composite index. |
| GetDatabase | Gets information about a database. |
| GetField | Gets the metadata and configuration for a field. |
| ListFields | Lists the field configuration and metadata for this database. |

P1:

| API | Description |
|---|---|
| UpdateDatabase | Updates a database. |
| UpdateField | Updates a field configuration. Currently, field updates apply only to single-field index configuration. |
| ExportDocuments | Exports a copy of all or a subset of documents from the DB3 store. |
| ImportDocuments | Imports documents into the DB3 store. Existing documents with the same name are overwritten. |
| ListDatabases | Lists all the databases in the project. |
| CreateIndex | Creates a composite index. This returns a db.longrunning.Operation (similar to google.longrunning.Operation), which may be used to track the status of the creation. The metadata for the operation will be of type IndexOperationMetadata. |
| DeleteIndex | Deletes a composite index. |

DB3 Document Op API

P0: API

| API | Description |
|---|---|
| CreateDocument | Creates a new document. |
| GetDocument | Gets a single document. |
| ListDocuments | Lists documents. |
| UpdateDocument | Updates or inserts a document. |
| DeleteDocument | Deletes a document. |

P1: Document API

| API | Description |
|---|---|
| BatchGetDocuments | Gets multiple documents. Documents returned by this method are not guaranteed to be returned in the same order that they were requested. |
| ListCollectionIds | Lists collection IDs. |
| RunQuery | Runs a query. |
| RunAggregationQuery | Returns the number of documents in a table that match a filter, e.g. SELECT COUNT(*) FROM ( SELECT * FROM k WHERE a = true ); |
| PartitionQuery | Partitions a query by returning partition cursors that can be used to run the query in parallel. |

DB3 Entity Proto

| Entity | Description |
|---|---|
| Database | A db3 database. |
| Field | Represents a single field in the database. Fields are grouped by their "Collection Group", which represents all collections in the database with the same ID. |
| Index | Indexes enable simple and complex queries against documents in a database. |
| Document | A db3 document. Must not exceed 1 MiB - 4 bytes. |

jingchen2222 commented 1 year ago

Milestone and resource plan

| Milestone | Task | Owner | Status |
|---|---|---|---|
| Design POC | Document operation workflow design | @cj | Completed |
| Design POC | DB3 admin operation workflow design | @wtz | Completed |
| Feature Implementation | Entity Proto implementation (P0) | @wtz | In Progress |
| Feature Implementation | DB3 Admin P0 Op implementation | @wtz | In Progress |
| Feature Implementation | DB document P0 Op implementation | @cj | Not started |
| QA | Unit test & integration test | @cj | Not started |
| QA | Benchmark | @cj | Not started |
| JS SDK API | Implement JS SDK for db3 admin and document operations | @zhaojun | Not started |

jingchen2222 commented 1 year ago

Implementation

Basic operation

Advance feature

jingchen2222 commented 1 year ago

Discussion

@imotai 🔔 👀 ♨️ I wanna introduce CollectionID (16 bytes) as a part of IndexID instead of using collection_name varchar().

IndexId = |DbID(20)|CollectionId(16)|BlockID(16)|

imotai commented 1 year ago

agree

jingchen2222 commented 1 year ago

Propose index encode solution

Goal

Reference

Further work will be tracked in https://github.com/dbpunk-labs/db3/issues/321