embrio-tech / centrifuge-development

An overarching composer repository to run centrifuge services in development.
GNU Lesser General Public License v2.1

POD indexer solution design #16

Closed. sandford-ch closed this issue 1 year ago

sandford-ch commented 1 year ago

User Story

As a pool issuer or pool investor,
I want insights into the loans in a pool
so that I can make informed strategy decisions.

Challenges with current setup

Acceptance criteria

Data Structure

erDiagram
    AssetTemplate {

    }
    AssetTemplate ||--o{ Attribute : "describes"

    Loan {
        metadata ipfs "public data"
        nft id "private data"
    }
    Loan }o--|| AssetTemplate : "has"
    Loan }|--o{ AttributeValue : "has"

    Attribute {
        public boolean
        source anyof "chain | ipfs | pod"
        type anyof "nominal, ordinal, discrete, continuous"
    }
    Attribute ||--o{ AttributeValue : "represents"

    AttributeValue{
        state anyof "string | number | ...?"
    }

    Pool {

    }
    Pool ||--o{ Loan : "contains"
    Pool ||..o{ AssetTemplate : "allows"

Sample AssetTemplate

{
  "name": "Mortgage",
  "options": {
    "assetClasses": [],
    "loanTypes": [],
    "description": false,
    "image": false
  },
  "sections": [
    {
      "name": "Mortgage",
      "public": true,
      "attributes": [
        {
          "label": "Mortgage amount",
          "type": "currency",
          "currencySymbol": "USD",
          "currencyDecimals": 18
        },
        {
          "label": "LTV",
          "type": "percentage"
        },
        {
          "label": "Interest rate",
          "type": "percentage"
        },
        {
          "label": "Due date",
          "type": "timestamp"
        }
      ]
    },
    {
      "name": "Property",
      "public": true,
      "attributes": [
        {
          "label": "Building type",
          "type": "string",
          "displayType": "single-select",
          "options": [
            "A",
            "B"
          ]
        },
        {
          "label": "Date build",
          "type": "timestamp"
        },
        {
          "label": "Appraisal",
          "type": "currency",
          "currencySymbol": "USD",
          "currencyDecimals": 18
        }
      ]
    },
    {
      "name": "Address",
      "public": false,
      "attributes": [
        {
          "label": "Line 1",
          "type": "string"
        },
        {
          "label": "Line 2",
          "type": "string"
        },
        {
          "label": "Zipcode",
          "type": "string"
        },
        {
          "label": "City",
          "type": "string"
        },
        {
          "label": "State",
          "type": "string",
          "displayType": "single-select",
          "options": [
            "AL",
            "AK",
            "AS",
            "..."
          ]
        },
        {
          "label": "Country",
          "type": "string",
          "displayType": "single-select",
          "options": [
            "AF",
            "AX",
            "AL",
            "..."
          ]
        }
      ]
    }
  ]
}
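
To illustrate how the `public` flag on each section partitions the attributes, here is a minimal TypeScript sketch. The trimmed-down types and the `splitByVisibility` helper are assumptions made for illustration and are not part of any existing codebase.

```typescript
// Hypothetical, trimmed-down types mirroring the sample template above.
interface TemplateAttribute {
  label: string;
  type: string;
}

interface TemplateSection {
  name: string;
  public: boolean;
  attributes: TemplateAttribute[];
}

interface AssetTemplate {
  name: string;
  sections: TemplateSection[];
}

// Collects attribute labels, grouped by the visibility flag of their section.
function splitByVisibility(template: AssetTemplate): {
  publicLabels: string[];
  privateLabels: string[];
} {
  const publicLabels: string[] = [];
  const privateLabels: string[] = [];
  for (const section of template.sections) {
    const target = section.public ? publicLabels : privateLabels;
    for (const attribute of section.attributes) {
      target.push(attribute.label);
    }
  }
  return { publicLabels, privateLabels };
}

// Shortened version of the "Mortgage" sample.
const mortgageTemplate: AssetTemplate = {
  name: "Mortgage",
  sections: [
    { name: "Mortgage", public: true, attributes: [{ label: "LTV", type: "percentage" }] },
    { name: "Address", public: false, attributes: [{ label: "City", type: "string" }] },
  ],
};

const visibility = splitByVisibility(mortgageTemplate);
```

With this shape, everything in `visibility.privateLabels` would have to come from a private source rather than from public metadata.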

tibohei commented 1 year ago

Relevant design decisions

🚧 work in progress 🚧

This comment is a collection of design questions and decisions which we need to address.

❓ Compute or aggregate values on index or on request?

❓ Store each loan attribute value separately vs. store all attributes in one DB document/entry?

This decision mainly affects the aggregation logic and performance, and it also depends on the chosen DB technology.
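
A rough sketch of the two storage shapes and of what a simple aggregation looks like against each. `AttributeValueRecord`, `LoanDocument` and both helpers are hypothetical names, and the in-memory arrays only stand in for whatever DB technology ends up being chosen.

```typescript
// Option A: one record per attribute value (hypothetical shape).
interface AttributeValueRecord {
  loanId: string;
  key: string;
  value: string | number;
}

// Option B: one document per loan holding all attribute values (hypothetical shape).
interface LoanDocument {
  loanId: string;
  attributes: Record<string, string | number>;
}

// Average of a numeric attribute when each value is stored separately.
// Returns NaN when no numeric value exists for the key.
function averageFromRecords(records: AttributeValueRecord[], key: string): number {
  const values = records
    .filter((r) => r.key === key)
    .map((r) => r.value)
    .filter((v): v is number => typeof v === "number");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// The same average when all attributes live in one document per loan.
function averageFromDocuments(docs: LoanDocument[], key: string): number {
  const values = docs
    .map((d) => d.attributes[key])
    .filter((v): v is number => typeof v === "number");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const sampleRecords: AttributeValueRecord[] = [
  { loanId: "1", key: "fico", value: 700 },
  { loanId: "1", key: "city", value: "Zurich" },
  { loanId: "2", key: "fico", value: 800 },
];

const sampleDocs: LoanDocument[] = [
  { loanId: "1", attributes: { fico: 700, city: "Zurich" } },
  { loanId: "2", attributes: { fico: 800 } },
];

const averageA = averageFromRecords(sampleRecords, "fico");
const averageB = averageFromDocuments(sampleDocs, "fico");
```

Both variants yield the same result here. The difference is where the filtering happens: per record in option A, per document field in option B.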

❓ Dynamic and flexible loan attributes vs. static/hardcoded/predefined at indexing?

This design question mainly affects the query format.

# query attributes by name, for example "fico", "city"
query GetLoanAttributes($loanId: String, $attributeNames: [String]! ) {
    attributes(filter: { loan: $loanId, name: { _in: $attributeNames } }) {
      nodes
    }
}

vs.

# query attributes by graph fields
query GetLoanWithAttributes($loanId: String ) {
    loan(id: $loanId) {
      fico
      city
    }
}
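
On the consuming side, the two query styles might translate into the following TypeScript sketch. The `AttributeNode` and `LoanWithAttributes` shapes and both lookup functions are assumptions for illustration; the field names `fico` and `city` are taken from the queries above.

```typescript
// Dynamic variant: attributes are generic name/value nodes (hypothetical shape).
interface AttributeNode {
  loan: string;
  name: string;
  value: string | number;
}

// Mirrors the first query: select a loan's attributes by name.
function getLoanAttributes(
  nodes: AttributeNode[],
  loanId: string,
  attributeNames: string[],
): AttributeNode[] {
  return nodes.filter((n) => n.loan === loanId && attributeNames.includes(n.name));
}

// Static variant: the fields are fixed at indexing time (hypothetical shape).
interface LoanWithAttributes {
  id: string;
  fico: number;
  city: string;
}

// Mirrors the second query: the loan already exposes its attributes as fields.
function getLoanWithAttributes(
  loans: LoanWithAttributes[],
  loanId: string,
): LoanWithAttributes | undefined {
  return loans.find((l) => l.id === loanId);
}

const sampleNodes: AttributeNode[] = [
  { loan: "loan-1", name: "fico", value: 720 },
  { loan: "loan-1", name: "city", value: "Zurich" },
  { loan: "loan-1", name: "zipcode", value: "8000" },
];
const sampleLoans: LoanWithAttributes[] = [{ id: "loan-1", fico: 720, city: "Zurich" }];

const dynamicResult = getLoanAttributes(sampleNodes, "loan-1", ["fico", "city"]);
const staticResult = getLoanWithAttributes(sampleLoans, "loan-1");
```

The dynamic variant accepts any attribute name without a schema change, while the static variant gives the client typed fields.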

❓ Extendable permissions and roles vs. binary access control (e.g. private & public)?

For now we probably need to distinguish between

Thus, a binary access control mechanism would be sufficient for now. But, at a later point in time this might change. What about

Therefore, we might want to consider a more granular access control mechanism such as RBAC or even HRBAC.
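
A sketch of how a binary flag could later be extended with roles without breaking existing templates. The role names and the `allowedRoles` field are purely hypothetical, since this thread does not define them.

```typescript
// Hypothetical roles; the thread does not define which roles would exist.
type Role = "anonymous" | "investor" | "issuer";

interface GuardedAttribute {
  key: string;
  public: boolean; // binary flag, as used in the templates
  allowedRoles?: Role[]; // hypothetical extension for role-based access
}

// Binary check: public attributes are readable by anyone, the rest only when authorized.
function canReadBinary(attribute: GuardedAttribute, isAuthorized: boolean): boolean {
  return attribute.public || isAuthorized;
}

// Role-based check that falls back to the binary behaviour when no roles are listed.
function canReadWithRoles(attribute: GuardedAttribute, role: Role): boolean {
  if (attribute.public) return true;
  if (!attribute.allowedRoles) return role !== "anonymous";
  return attribute.allowedRoles.includes(role);
}

const cityAttribute: GuardedAttribute = { key: "city", public: false, allowedRoles: ["issuer"] };

const anonymousBinary = canReadBinary(cityAttribute, false);
const investorCanRead = canReadWithRoles(cityAttribute, "investor");
const issuerCanRead = canReadWithRoles(cityAttribute, "issuer");
```

Because `allowedRoles` is optional, attributes that only carry the `public` flag keep working unchanged.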

❓ Fetch chain data from subquery or reindex chain?

filo87 commented 1 year ago

Further Questions/Remarks:

filo87 commented 1 year ago

Predefined Aggregations

On an asset template, we could predefine some aggregations like this:

{
  "...": "...",
  "aggregations": [
    {
      "key": "test1",
      "label": "Test Value 1",
      "public": true,
      "pipeline": [
        { "operator": ["<argument1>", "<argument2>"] },
        { "operator": ["<argument1>", "<argument2>"] }
      ]
    },
    {
      "key": "weightedAverage",
      "label": "Weighted Average",
      "public": true,
      "pipeline": [
        {
          "$group": {
            "_id": "weighted average",
            "numerator": { "$sum": { "$multiply": ["$price", "$quantity"] } },
            "denominator": { "$sum": "$quantity" }
          }
        },
        {
          "$project": {
            "average": { "$divide": ["$numerator", "$denominator"] }
          }
        }
      ]
    }
  ]
}

Or we could provide predefined aggregation methods:

{
  "...": "...",
  "aggregations": [
    {
      "key": "test1",
      "label": "Test Value 1",
      "public": true,
      "method": {
        "name": "nameOfAggregationMethod",
        "args": ["attr1", "attr2", 1000, {}]
      }
    },
    {
      "key": "weightedAverage",
      "label": "Weighted Average",
      "public": true,
      "method": {
        "name": "weightedAvg",
        "args": ["attr1", "attr2"]
      }
    }
  ]
}
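
A sketch of how the second variant could be resolved by the indexer: a registry maps each method name to a function. `aggregationMethods` and `runAggregation` are hypothetical names. The `weightedAvg` body mirrors the `$group`/`$project` pipeline above and treats the first argument as the value and the second as the weight, which is an assumption.

```typescript
type Row = Record<string, number>;
type AggregationMethod = (rows: Row[], args: unknown[]) => number;

// Hypothetical registry of predefined aggregation methods, keyed by method name.
const aggregationMethods: Record<string, AggregationMethod> = {
  // weightedAvg(valueKey, weightKey) = sum(value * weight) / sum(weight)
  weightedAvg: (rows, args) => {
    const [valueKey, weightKey] = args;
    if (typeof valueKey !== "string" || typeof weightKey !== "string") {
      throw new Error("weightedAvg expects two attribute keys");
    }
    let numerator = 0;
    let denominator = 0;
    for (const row of rows) {
      const value = row[valueKey] ?? 0;
      const weight = row[weightKey] ?? 0;
      numerator += value * weight;
      denominator += weight;
    }
    return numerator / denominator;
  },
};

interface AggregationSpec {
  key: string;
  label: string;
  public: boolean;
  method: { name: string; args: unknown[] };
}

// Looks up the method named in the spec and applies it to the rows.
function runAggregation(spec: AggregationSpec, rows: Row[]): number {
  const method = aggregationMethods[spec.method.name];
  if (!method) throw new Error(`Unknown aggregation method: ${spec.method.name}`);
  return method(rows, spec.method.args);
}

const weightedAverageSpec: AggregationSpec = {
  key: "weightedAverage",
  label: "Weighted Average",
  public: true,
  method: { name: "weightedAvg", args: ["attr1", "attr2"] },
};

const weightedAverage = runAggregation(weightedAverageSpec, [
  { attr1: 10, attr2: 1 },
  { attr1: 20, attr2: 3 },
]);
```

Compared with raw pipelines, a named method keeps the template independent of the DB's query language, at the cost of having to ship every method in the indexer.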

tibohei commented 1 year ago

Suggestion on off-chain attribute data formats

Current Implementation

Template Example
{
  "name": "Example asset template",
  "options": {
    "assetClasses": [],
    "loanTypes": [],
    "description": true,
    "image": true
  },
  "sections": [
    {
      "name": "A public data section",
      "public": true,
      "attributes": [
        {
          "label": "Label 1",
          "type": "string"
        }
      ]
    },
    {
      "name": "A private data section",
      "public": false,
      "attributes": [
        {
          "label": "Label 2",
          "type": "string"
        }
      ]
    }
  ]
}

Data Example
{
  "name": "Loan-12",
  "description": "All I need is a coffee ",
  "image": "ipfs://ipfs/QmY2JxyZpQgph4vtSKcrQuamVFaR1QXNN43E1a52HZbFYa",
  "properties": {
    "_template": "QmVSJ7n2nFJAy91AhGNLt36yDFVg9mbg7Wiy7iwSJr2JKA",
    "label_1": "Test"
  }
}

Suggested Implementation

Template
{
  "name": "Example asset template",
  "options": {
    "assetClasses": [],
    "loanTypes": [],
    "description": true,
    "image": true
  },
  "attributes": [
    {
      "key": "key1",
      "label": "Label 1",
      "type": {
        "primitive": "string",
        "statistics": "categorical",
        "constructor": "String"
      },
      "input": {
        "type": "input"
      },
      "output": null,
      "public": true
    },
    {
      "key": "key2",
      "label": "Label 2",
      "type": {
        "primitive": "string",
        "statistics": "categorical",
        "constructor": "String"
      },
      "input": {
        "type": "input"
      },
      "output": null,
      "public": false
    },
    {
      "key": "key3",
      "label": "Label 3",
      "type": {
        "primitive": "string",
        "statistics": "continuous",
        "constructor": "Date"
      },
      "input": {
        "type": "datepicker",
        "format": "YYYY-MM"
      },
      "output": {
        "format": "YYYY-MM"
      },
      "public": false
    }
  ],
  "sections": [
    {
      "name": "First Section",
      "attributes": ["key1", "key2"]
    },
    {
      "name": "Second Section",
      "attributes": ["key3"]
    }
  ]
}

Data
{
  "name": "Loan-12",
  "description": "All I need is a coffee ",
  "image": "ipfs://ipfs/QmY2JxyZpQgph4vtSKcrQuamVFaR1QXNN43E1a52HZbFYa",
  "attributes": {
    "_template": "QmVSJ7n2nFJAy91AhGNLt36yDFVg9mbg7Wiy7iwSJr2JKA",
    "values": [
      {
        "key": "key1",
        "value": "This is my answer to 1"
      },
      {
        "key": "key2",
        "value": "This is my answer to 2"
      },
      {
        "key": "key3",
        "value": "2011-10-05T14:48:00.000Z"
      }
    ]
  }
}

or

{
  "name": "Loan-12",
  "description": "All I need is a coffee ",
  "image": "ipfs://ipfs/QmY2JxyZpQgph4vtSKcrQuamVFaR1QXNN43E1a52HZbFYa",
  "attributes": {
    "_template": "QmVSJ7n2nFJAy91AhGNLt36yDFVg9mbg7Wiy7iwSJr2JKA",
    "values": {
      "key1": "This is my answer to 1",
      "key2": "This is my answer to 2",
      "key3": "2011-10-05T14:48:00.000Z"
    }
  }
}
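
The two `values` layouts carry the same information, so one can be derived from the other. A minimal sketch, with `valuesToRecord` and `recordToValues` as hypothetical helper names:

```typescript
interface KeyValue {
  key: string;
  value: string;
}

// Converts the array layout into the keyed-object layout; rejects duplicate keys,
// which the array layout cannot rule out by itself.
function valuesToRecord(values: KeyValue[]): Record<string, string> {
  const record: Record<string, string> = {};
  for (const { key, value } of values) {
    if (Object.prototype.hasOwnProperty.call(record, key)) {
      throw new Error(`Duplicate attribute key: ${key}`);
    }
    record[key] = value;
  }
  return record;
}

// Converts the keyed-object layout back into the array layout.
function recordToValues(record: Record<string, string>): KeyValue[] {
  return Object.keys(record).map((key) => ({ key, value: record[key] as string }));
}

const arrayLayout: KeyValue[] = [
  { key: "key1", value: "This is my answer to 1" },
  { key: "key2", value: "This is my answer to 2" },
  { key: "key3", value: "2011-10-05T14:48:00.000Z" },
];

const objectLayout = valuesToRecord(arrayLayout);
const roundTrip = recordToValues(objectLayout);
```

The object layout makes key uniqueness structural, whereas the array layout needs an explicit check like the one above.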

hieronx commented 1 year ago

  • Is there an on-chain reference to the list of AssetTemplates for a given pool, other than in the metadata of individual Loans?

See loanTemplates in the pool metadata: https://altair.mypinata.cloud/ipfs/QmbFvavtS3LknEDnL6bSkKpotqhm6LJikEGNFpfN95meTs

filo87 commented 1 year ago

MVP Data Model

erDiagram
    LoanTemplate {
    }

    SourceSpec {
        source anyof "ipfs | subql | chain | pod"
        objectId String
        lastFetchedAt Date
    }

    Loan {
        sources SourceSpec[]
    }
    Loan }o--|| LoanTemplate : "implements"

    DataFrame {
        source anyof "ipfs | subql | chain | pod"
        createdAt Datetime
        data Object
    }

    Loan ||--o{ DataFrame : "has"

filo87 commented 1 year ago

Generic Data Model

erDiagram

    Source {
        id ObjectId
        entity ObjectId
        type anyof "ipfs | subql | chain | pod"
        objectId String
        lastFetchedAt Date
    }

    Entity {
        id  ObjectId
        type anyof "Loan | LoanTemplate"
    }

    Frame {
        id ObjectId
        source ObjectId
        createdAt Datetime
        data Object
    }

    Source }o--|| Entity: ""

    Source ||--o{ Frame : ""
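
For reference, the generic model could be expressed in TypeScript roughly as follows. This is only a sketch: `ObjectId` is replaced by `string` to keep it self-contained, the `Record` suffixes and the `latestFrames` helper are hypothetical, and reading "the current state of an entity" as "the newest frame per source" is an assumption.

```typescript
type SourceType = "ipfs" | "subql" | "chain" | "pod";

interface EntityRecord {
  id: string;
  type: "Loan" | "LoanTemplate";
}

interface SourceRecord {
  id: string;
  entity: string; // id of the owning entity
  type: SourceType;
  objectId: string;
  lastFetchedAt: Date;
}

interface FrameRecord {
  id: string;
  source: string; // id of the source this frame was fetched from
  createdAt: Date;
  data: Record<string, unknown>;
}

// Returns the most recent frame of every source that belongs to the given entity.
function latestFrames(
  entityId: string,
  sources: SourceRecord[],
  frames: FrameRecord[],
): FrameRecord[] {
  const result: FrameRecord[] = [];
  for (const source of sources) {
    if (source.entity !== entityId) continue;
    let latest: FrameRecord | undefined;
    for (const frame of frames) {
      if (frame.source !== source.id) continue;
      if (!latest || frame.createdAt.getTime() > latest.createdAt.getTime()) {
        latest = frame;
      }
    }
    if (latest) result.push(latest);
  }
  return result;
}

const fetchedAt = new Date("2023-03-01T00:00:00Z");
const sampleSources: SourceRecord[] = [
  { id: "s1", entity: "loan-1", type: "ipfs", objectId: "obj-a", lastFetchedAt: fetchedAt },
  { id: "s2", entity: "loan-1", type: "pod", objectId: "obj-b", lastFetchedAt: fetchedAt },
  { id: "s3", entity: "loan-2", type: "ipfs", objectId: "obj-c", lastFetchedAt: fetchedAt },
];
const sampleFrames: FrameRecord[] = [
  { id: "f1", source: "s1", createdAt: new Date("2023-01-01T00:00:00Z"), data: {} },
  { id: "f2", source: "s1", createdAt: new Date("2023-02-01T00:00:00Z"), data: {} },
  { id: "f3", source: "s2", createdAt: new Date("2023-01-15T00:00:00Z"), data: {} },
  { id: "f4", source: "s3", createdAt: new Date("2023-02-15T00:00:00Z"), data: {} },
];

const currentFrames = latestFrames("loan-1", sampleSources, sampleFrames);
```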

onnovisser commented 1 year ago

I changed the loan template to be more in line with the suggestions above. The only significant change is that the attributes are now an object instead of an array. This way the template itself enforces that the keys are unique, and it becomes a bit easier to look up attributes by their key.

type LoanTemplate = {
  name: string
  options: {
    assetClasses: string[]
    description: boolean
    image: boolean
  }
  attributes: Record<string, {
    label: string
    type: {
      primitive: 'string' | 'number'
      statistics: 'categorical' | 'ordinal' | 'continuous' | 'discrete'
      constructor: 'String' | 'Date' | 'Number'
    }
    input: (
      | {
          type: 'text' | 'textarea'
          maxLength?: number
        }
      | {
          type: 'single-select'
          options: string[] | { value: string, label: string }[]
        }
      | {
          type: 'date' | 'time' | 'datetime-local' | 'month' | 'week'
          min?: string
          max?: string
        }
      | {
          type: 'currency'
          symbol: string
          min?: number
          max?: number
        }
      | {
          type: 'number'
          unit?: string
          min?: number
          max?: number
        }
    ) & { placeholder?: string }
    output: {} | null
    public: boolean
  }>
  sections: {
    name: string
    attributes: string[]
  }[]
}

Let me know if this works for you @tibohei @filo87
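
One thing the object form does not enforce by itself is that every key listed in `sections` actually exists in `attributes`. A small check could cover that. The sketch below uses a reduced `TemplateLike` shape, and `findUnknownKeys` is a hypothetical helper name.

```typescript
// Reduced, hypothetical shape containing only the fields needed for the check.
interface TemplateLike {
  attributes: Record<string, { label: string; public: boolean }>;
  sections: { name: string; attributes: string[] }[];
}

// Returns the section attribute keys that have no matching entry in `attributes`.
function findUnknownKeys(template: TemplateLike): string[] {
  const unknown: string[] = [];
  for (const section of template.sections) {
    for (const key of section.attributes) {
      if (!Object.prototype.hasOwnProperty.call(template.attributes, key)) {
        unknown.push(key);
      }
    }
  }
  return unknown;
}

const exampleTemplate: TemplateLike = {
  attributes: {
    key1: { label: "Label 1", public: true },
    key2: { label: "Label 2", public: false },
  },
  sections: [
    { name: "First Section", attributes: ["key1", "key2"] },
    { name: "Second Section", attributes: ["key3"] },
  ],
};

const unknownKeys = findUnknownKeys(exampleTemplate);
```

Here `unknownKeys` contains `"key3"`, because the second section references a key that is not defined.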

filo87 commented 1 year ago

Looks good. I just have one concern about currency values: in JS these are bigint fixed decimals, but for proper conversion we would also need their precision.

Would it make sense to include it there? Or would we fetch it from the Currency info on chain?

This is important because, for optimal aggregation in MongoDB, these values should be initialised with the Decimal128 data type and also be given the correct decimal position.

What do you think? @onnovisser @tibohei

onnovisser commented 1 year ago

Yes, that was on my mind as well. We could make it so that, by default, currency values are stored as a decimal exactly as they are entered, but add a decimals option:

{
  type: 'currency'
  symbol: string
  min?: number
  max?: number
  decimals?: number
}

If used, it'll multiply the value by 10 ^ decimals and store it as an integer. WDYT @filo87? Maybe @offerijns has an idea too. I'm not sure what values we can expect. In my mind I expected it to just be values in USD, for example, with no relation to on-chain currencies.
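
A minimal sketch of the scaling by 10 ^ decimals, assuming the input arrives as a decimal string. It shifts the decimal point with string padding and `BigInt` instead of floating-point multiplication, so large precisions such as 18 stay exact. `toFixedPointInteger` is a hypothetical name, and rejecting inputs with too many fractional digits (rather than rounding) is an assumption.

```typescript
// Converts a decimal string such as "1234.5" into an integer scaled by 10^decimals.
function toFixedPointInteger(input: string, decimals: number): bigint {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(input.trim());
  if (!match) throw new Error(`Not a decimal number: ${input}`);
  const sign = match[1] ?? "";
  const whole = match[2] ?? "0";
  const fraction = match[3] ?? "";
  if (fraction.length > decimals) {
    throw new Error(`${input} has more than ${decimals} fractional digits`);
  }
  // Shifting the decimal point is the same as right-padding the fraction with zeros.
  const scaled = BigInt(whole + fraction.padEnd(decimals, "0"));
  return sign === "-" ? -scaled : scaled;
}

const inCents = toFixedPointInteger("1234.5", 2); // 123450n
const oneWith18Decimals = toFixedPointInteger("1", 18);
```

The resulting integer still needs the `decimals` value next to it to be converted back, which is the precision concern raised above.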