arunramachandran15 / duckdb-sheets

0 stars 0 forks source link

Implement duckdb manager using python. (Schema design and flow included) #4

Open arunramachandran15 opened 2 months ago

arunramachandran15 commented 2 months ago

Since DuckDB works best with python, we will implement the duckDB manager for serializeing the reads and writes using python.

Duckdb manager will receive the inputs (writes) from Redis (Elastic cache)

For each initial load of the writes, the entire sheet data has to be sent to the client and will be pushed via redis for the nodejs to recieve and push it via socket to the Svelte app.

Operations to be handled

  1. Create new duckdbfile for each new sheet creation in UI
  2. Create new table for each new tab created with in a sheet
  3. Insert/update row (to be specific each cell is mapped to a row with row_id, col_id and value)
  4. Read the entire table for initial loading of a tab in a sheet.
  5. Delete a duckdb file on sheet deletion
  6. Delete a table on tab deletion

@vcanaran

vcanaran commented 2 months ago

when receiving data from Elastic Cache from the front-end "command" websocket interface you are thinking Pub/Sub not cache Frontend --> websocket::command --> EC:pub/sub --> duckdb database manager ---> duckdb

When you are sending data back to Elastic Cache for the front-end to receive via a broad cast websocket called "eventstream" then you are thinking Cache.

arunramachandran15 commented 2 months ago

https://github.com/arunramachandran15/duckdb-sheets/tree/dbmanager

Have implemented the intitial simple duckdbmanager (dbmanager branch). Using local redis for development instead of elastic cache before the AWS setup. Still its work in-progress. Pushed it to the repo for progress update.

Instead of having seperate table for each tab, is this schema good enough to represent all tabs in a sheet within a duckdb file. As sheets can have dynamic number of columns.

    `self.conn.execute('''
        CREATE TABLE IF NOT EXISTS tabs (
            tab_id INTEGER DEFAULT nextval('id_seq_tabs') PRIMARY KEY,
            tab_name  VARCHAR(255) UNIQUE NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        );
    ''')`

    `self.conn.execute('''
        CREATE TABLE IF NOT EXISTS cells (
            cell_id INT DEFAULT nextval('id_seq_cells') PRIMARY KEY,
            tab_id INT REFERENCES tabs(tab_id),
            row_index INT NOT NULL,
            column_index INT NOT NULL,
            value TEXT,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (tab_id, row_index, column_index)
        );
    ''')`