First, I'm going to agree that Checkpoint needs some ORM, so we can abstract some logic and optimisations that will potentially be useful to its users (for example, it will be helpful to implement this feature as well).

Now, concerning tracking mutations and being able to query the state at a block. I think for this to work correctly and be helpful; we will need to track more than inserts; we will also need to track updates and deletions of entities. Thinking more about the database structure and the logic to fetch entities at a specific block height will help determine this.

Here is are my thoughts on how we can implement this:

Tracking Entity State Plan

We will need to keep track of each entity's state after every mutation. This plan is taking some inspiration from how The Graph currently does. We can achieve this by tracking the block range for which an entity's state is valid and only creating a new entity record with an updated block range when there is a change to the entity.

I'll be using the Vote entity in your example to continue my description and assume it has a schema like:

type Vote {
  id: String!
  proposal: Int!
  choice: Int!
}

Currently, we generate an SQL table of similar structure to the entities graphql field, BUT if we want to track entity states at specific block heights, we should create a table for each entity like this:

Column	Type
id	VARCHAR NOT NULL
proposal	INT NOT NULL
choice	INT NOT NULL
_block_min	INT NOT NULL
_block_max	INT NULLABLE

The _block_min represents the earliest block for which this entity record is valid (inclusive), and the _block_max represents the last block for which this record is no more valid (exclusive) i.e. the entity record is valid from block _block_min to _block_max - 1.

_block_max can be null to signify that this particular entity record is valid for all block from _block_min to the latest block.

Changes to the primary key

This change to the table structure also means we will potentially insert multiple entity records with the same id value. The easiest fix will be to have a compound primary key that uses the id, _block_min and _block_max values. This compound primary key will ensure that we can insert multiple records with the same id with varying block ranges.

The primary issue with this approach is that it will be possible for a data writer to insert different entities with the same id at other blocks. Still, if we introduce an ORM and encourage entity creations through the ORM, we can have the ORM's create function perform a check to ensure that no entity with a null _block_max value exists with the same id before creating the entity.

ORM Design and Logic

We will need an ORM that will expose CRUD functionality to its users. The ORM will be used in both graphql resolvers and data writers.

For simplicity, we will only be creating functions to create or modify a single record at a time. Future iterations will plan to support multiple entity updates.

1. Creating an Entity

Creating an entity will Insert an entity record with the current block number set to _block_min and _block_max set to null. For example:

INSERT INTO `votes`(id, proposal, choice, _block_min, _block_max)
VALUES ($id, $proposal, $choice, $current_block, null)

This logic should also confirm that there isn't an existing record with the same id with a null _block_max .

2. Updating an Entity

Updating an entity means some data has changed. First, an existing record matching the specified id needs to have its _block_max set to the current_block and then a new entity record will be created for the current_block. For example:

UPDATE `votes` SET _block_max = ($current_block)
WHERE id = $id AND _block_max IS NULL

INSERT INTO `votes`(id, proposal, choice, _block_min, _block_max)
VALUES ($id, $proposal, $choice, $current_block, null)

3. Deleting an Entity

Deleting an entity means the data doesn't exist anymore at the block range. To reflect this we only need to set the _block_max value to current_block and that is it:

UPDATE `votes` SET _block_max = ($current_block)
WHERE id = $id AND _block_max IS NULL

4. Reading Entities

Querying entities will involve including a where clause based on the range of _block_min and _block_max.

Assuming we have a $target_block we would like to fetch an entity. Then an example query will look like:

SELECT ... from `votes`
WHERE ... AND (
    (_block_min <= $target_block AND _block_max > $target_block) OR 
    (_block_min <= $target_block AND _block_max IS NULL)
)

Proposed interface for ORM

interface EntityStore {
    // Creates a new entity 
    createEntity: (name: String, data: EntityRecord) => Promise<void>,

    // Update an existing entity based on id
    updateEntity: (name: String, id: SupportedFieldTypes, updatedData: Record<string, SupportedFieldTypes>) => Promise<void>,

    // delete an entity by id
    deleteEntity: (name: string, id: SupportedFieldTypes) => Promise<void>,

    // Fetch an entity by its id
    readEntity: (name: string, id: SupportedFieldTypes) => Promise<EntityRecord | null>,

    // Fetch multiple entities based on the whereClause. 
    // The params are for substitution to avoid sql injection.
    readEntities: (name: string, whereClause: string, whereParams: Record<string, SupportedFieldTypes>) => Promise<EntityRecord[]>
}

type SupportedFieldTypes = string | number;
type EntityRecord = {
    id: SupportedFieldTypes;
    [fields: string]?: SupportedFieldTypes;
};

An implementation of this EntityStore will be injected as arguments to Checkpoint writers and also as part of context for its resolvers.

Updating Generated GraphQL Queries

Finally, we will need to update the generated Query for entities. It should include querying the block number like this:

type Query {
    vote(id: String!, block: Int): Vote;
    votes(
        ...,
        where: WhereVote,
        block: In
}

type WhereVote {
    ...
}

checkpoint-labs / checkpoint

feat: lookback query at specific block number #61