filecoin-project / notary-governance

Modification: Datacap Application Flow Tooling Update #988

Closed ghost closed 5 months ago

ghost commented 1 year ago

Over the past few months, the Fil+ team has been refining our approach to monitoring datacap applications. We're thrilled to introduce a comprehensive update to our Fil+ tooling, designed to enhance and streamline the datacap application process. This article provides an in-depth look at the modifications we've implemented in this latest tooling upgrade and outlines the ways stakeholders can interact with the updated system. For a succinct summary of these advancements, watch this demo video.

Launch Timeline

This is the estimated launch schedule for the JSON tooling update:

Components

The tooling upgrade introduces several key enhancements:

JSON format

The main component of the new application flow is the switch from storing datacap application information in GitHub issues to JSON files. When a client submits a datacap application, the information about it will be stored in a JSON file with the following structure. This JSON file lives in the LDN folder of the filplus-json-tooling repo and is updated by way of pull requests whenever changes need to be made. To learn about the motivations for this update, read issue 839.

{
  "applicationVersion": 1,
  "dataCapApplicationType": "da | ldn-v3 | e-fil",
  "projectId": 1,
  "datacapApplicant": "",
  "applicationInfo": {
    "coreInformation": {
      "Data Owner Name": "",
      "Data Owner Country/Region": "",
      "Data Owner Industry": "",
      "Website": 0,
      "Social Media": {
        "handle": "",
        "type": "Slack | Twitter | Facebook"
      },
      "What is your role related to the dataset": "",
      "Total amount of DataCap being requested": {
        "amount": 0,
        "unit": "GiB | TiB | PiB"
      },
      "Expected size of single dataset (one copy)": {
        "amount": "",
        "unit": "GiB | TiB | PiB"
      },
      "Number of replicas to store (minimum 4)": 0,
      "Weekly allocation of DataCap requested": {
        "amount": "",
        "unit": "GiB | TiB | PiB"
      },
      ""On-chain address (Note that you will not be able to change this in the future and that you should have a unique address for each LDN application)": "",
      "Data Type of Application": "Slingshot | Public, Open Dataset (Research/Non-Profit) | Public, Open Commercial/Enterprise | Private Commercial/Enterprise | Private Non-Profit / Social impact",
      "Custom multisig": ""
    },
    "projectDetails": {
      "Share a brief history of your project and organization": "",
      "Is this project associated with other projects/ecosystem stakeholders?": true,
      "If answered yes, what are the other projects/ecosystem stakeholders": ""
    },
    "useCaseDetails": {
      "Describe the data being stored onto Filecoin": "",
      "Where was the data currently stored in this dataset sourced from": "AWS Cloud | Google Cloud | Azure Cloud | My Own Storage Infra | other",
      "If you answered 'Other' in the previous question, enter the details here": "",
      "How do you plan to prepare the dataset": "IPFS | Lotus | Singularity | Graphsplit | other/custom tool",
      "If you answered 'other/custom tool' in the previous question, enter the details here": "",
      "Please share a sample of the data (a link to a file, an image, a table, etc., are good ways to do this.)": "",
      "Confirm that this is a public dataset that can be retrieved by anyone on the network (i.e., no specific permissions or access rights are required to view the data)": true,
      "If you chose not to confirm, what was the reason": "",
      "What is the expected retrieval frequency for this data": "Daily | Weekly | Monthly | Yearly | Sporadic | Never",
      "For how long do you plan to keep this dataset stored on Filecoin": "Less than a year | 1 to 1.5 years | 1.5 to 2 years | 2 to 3 years | More than 3 years | Permanently"
    },
    "datacapAllocationPlan": {
      "In which geographies do you plan on making storage deals": [

      ],
      "How will you be distributing your data to storage providers": "",
      "How do you plan to choose storage providers": "",
      "If you answered 'Other' in the previous question, what is the tool or platform you plan to use": "",
      "Please list the provider IDs and location of the storage providers you will be working with. Note that it is a requirement to list a minimum of 5 unique provider IDs, and that your client address will be verified against this list in the future": [{"providerID": "", "location": "",  "SPOrg",""}],
      "How do you plan to make deals to your storage providers": "",
      "If you answered 'Others/custom tool' in the previous question, enter the details here": "",
      "Can you confirm that you will follow the Fil+ guideline (Data owner should engage at least 4 SPs and no single SP ID should receive >30% of a client's allocated DataCap)": ""
    }
  },
  "applicationLifecycle": {
    "state": "submitted | ready to sign | start sign datacap | granted | total datacap reached | governance review needed | error", 
    "validatedTime": 0,
    "firstAllocationTime": 0,
    "isTrigered": false,
    "isActive": true,
    "timeOfNewState": 0
  },
  "dataCapAllocations": [
    {
      "dataCapTranche": {
        "trancheID": 0,
        "clientAddress": "f1...",
        "timeOfRequest": 0,
        "timeOfAllocation": 0,
        "notaryAddress": "",
        "allocationAmount": 0,
        "signers": [
          {
            "githubUsername:" "",
            "signingAddress": "",
            "timeOfSignature": 0,
            "messageCID": ""
          },
          {
            "githubUsername:" "",
            "signingAddress": "",
            "timeOfSignature": 0,
            "messageCID": ""
          }
        ],
        "pr": 0,
      }
    }
  ]
}

The JSON file structure has a few different parts: top-level application metadata, the applicationInfo sections (core information, project details, use case details, and the datacap allocation plan), the applicationLifecycle state, and the dataCapAllocations array of tranches.
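
As an unofficial illustration, that top-level shape can be captured in a few TypeScript types. The names below mirror the sample above; the long question-style keys inside applicationInfo are collapsed into a generic record for brevity, and the time fields are assumed to be unix timestamps:

// Unofficial TypeScript sketch of the JSON structure above.
// applicationInfo keys are collapsed; time fields assumed unix seconds.
type ApplicationState =
  | "submitted"
  | "ready to sign"
  | "start sign datacap"
  | "granted"
  | "total datacap reached"
  | "governance review needed"
  | "error";

interface Signer {
  githubUsername: string;
  signingAddress: string;
  timeOfSignature: number;
  messageCID: string;
}

interface DataCapTranche {
  trancheID: number;
  clientAddress: string; // "f1..."
  timeOfRequest: number;
  timeOfAllocation: number;
  notaryAddress: string;
  allocationAmount: number;
  signers: Signer[]; // two entries: propose + approve
  pr: number;        // PR in which this tranche was recorded
}

interface DatacapApplication {
  applicationVersion: number;
  dataCapApplicationType: "da" | "ldn-v3" | "e-fil";
  projectId: number;
  datacapApplicant: string;
  // coreInformation, projectDetails, useCaseDetails, datacapAllocationPlan
  applicationInfo: Record<string, unknown>;
  applicationLifecycle: {
    state: ApplicationState;
    validatedTime: number;
    firstAllocationTime: number;
    isTriggered: boolean;
    isActive: boolean;
    timeOfNewState: number;
  };
  dataCapAllocations: { dataCapTranche: DataCapTranche }[];
}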

Application flow

The "happy flow" of an application proceeds as follows:
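
At a high level, the lifecycle states in applicationLifecycle.state trace that progression. The transition map below is a sketch inferred from the state names, not an authoritative definition of the flow:

// Happy-flow transitions inferred from applicationLifecycle.state.
// This is an illustrative sketch, not the official state machine.
type State =
  | "submitted"
  | "ready to sign"
  | "start sign datacap"
  | "granted"
  | "total datacap reached"
  | "governance review needed"
  | "error";

const happyFlow: Partial<Record<State, State>> = {
  "submitted": "ready to sign",          // application validated
  "ready to sign": "start sign datacap", // a notary proposes the allocation
  "start sign datacap": "granted",       // a second notary approves
  "granted": "total datacap reached",    // once the final tranche is used
};

// "governance review needed" and "error" sit outside the happy flow and
// are entered when validation or on-chain checks fail.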

Github Actions for Validation

On each commit to a PR, the following validation scripts are run by GitHub Actions to check the correctness of the changes being made. These checks are meant to catch the edge cases in which a datacap application does not proceed regularly.
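
As a rough sketch of what one of these checks could look like (the file layout, the specific field checks, and the CLI wiring are assumptions here, not the actual scripts):

// Hedged sketch of a PR validation step. Field names follow the JSON
// sample above; paths and exit behavior are illustrative assumptions.
import { readFileSync } from "fs";

const VALID_STATES = [
  "submitted", "ready to sign", "start sign datacap", "granted",
  "total datacap reached", "governance review needed", "error",
];

function validateApplication(path: string): string[] {
  let app: any;
  try {
    app = JSON.parse(readFileSync(path, "utf8"));
  } catch (e) {
    return [`${path}: not valid JSON (${e})`];
  }
  const errors: string[] = [];
  if (app.applicationVersion !== 1) {
    errors.push(`${path}: applicationVersion must be 1`);
  }
  if (!["da", "ldn-v3", "e-fil"].includes(app.dataCapApplicationType)) {
    errors.push(`${path}: unknown dataCapApplicationType`);
  }
  if (!VALID_STATES.includes(app.applicationLifecycle?.state)) {
    errors.push(`${path}: invalid lifecycle state`);
  }
  // Each recorded tranche should carry two signatures (propose + approve).
  for (const { dataCapTranche } of app.dataCapAllocations ?? []) {
    if ((dataCapTranche?.signers ?? []).length !== 2) {
      errors.push(`${path}: tranche ${dataCapTranche?.trancheID} needs 2 signers`);
    }
  }
  return errors;
}

// The workflow passes the JSON files changed in the PR as arguments.
const failures = process.argv.slice(2).flatMap(validateApplication);
if (failures.length > 0) {
  console.error(failures.join("\n"));
  process.exit(1); // fails the check on the PR
}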

New GitHub registry (frontend)

The tooling update includes a new version of the Fil+ registry with:

Backend service

The backend service is the core of this tooling update, as it connects all the components together to ensure that everything works seamlessly. There are two main purposes to the backend:
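
As one concrete illustration, a service like this could serve an application's current JSON state to the registry frontend. The endpoint shape, repo owner, and folder layout below are assumptions for the sake of the sketch:

// Hypothetical read path: fetch an application's JSON from the
// filplus-json-tooling repo and serve it to the frontend.
// Endpoint shape, owner, path layout, and env vars are illustrative.
import express from "express";
import { Octokit } from "@octokit/rest";

const app = express();
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

app.get("/application/:projectId", async (req, res) => {
  try {
    const { data } = await octokit.rest.repos.getContent({
      owner: "filecoin-project", // assumed owner
      repo: "filplus-json-tooling",
      path: `LDN/${req.params.projectId}.json`, // assumed layout
    });
    if (Array.isArray(data) || !("content" in data)) {
      throw new Error("not a file");
    }
    const json = Buffer.from(data.content, "base64").toString("utf8");
    res.json(JSON.parse(json));
  } catch (e) {
    res.status(404).json({ error: `application not found: ${e}` });
  }
});

app.listen(3000);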

SA Bot update

The subsequent allocation (SA) bot has been updated to integrate with the new JSON structure. Previously, it ran twice a day, detected when a client had onboarded more than 75% of the data from their last allocation, and then posted a comment to request more datacap. In this updated version, the SA bot will instead open a PR that adds a dataCapTranche object to the dataCapAllocations array, as sketched below.
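
A rough sketch of that logic follows; the 75% threshold carries over from the previous bot, while the helper names and the commit/PR plumbing are assumptions:

// Sketch of the updated SA bot step: append a new tranche and open a PR
// instead of posting a comment. Helper names are placeholders.

function buildNextTranche(app: any, nowSec: number) {
  const tranches = app.dataCapAllocations;
  const last = tranches[tranches.length - 1].dataCapTranche;
  return {
    dataCapTranche: {
      trancheID: last.trancheID + 1,
      clientAddress: last.clientAddress,
      timeOfRequest: nowSec,
      timeOfAllocation: 0,
      notaryAddress: "",
      allocationAmount: 0, // filled in when notaries sign
      signers: [],
      pr: 0,               // filled in once the PR is opened
    },
  };
}

function maybeRequestMore(app: any, usedFraction: number): boolean {
  if (usedFraction <= 0.75) return false; // same trigger as the old bot
  app.dataCapAllocations.push(
    buildNextTranche(app, Math.floor(Date.now() / 1000)),
  );
  // The bot then commits the updated JSON to a branch and opens a PR,
  // e.g. via octokit.rest.repos.createOrUpdateFileContents and
  // octokit.rest.pulls.create.
  return true;
}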

Integration for Allocators

By the end of this year, this JSON tooling will be updated to work with the new Meta Pathways structure of the Fil+ program. If you plan to be an allocator and want to use this tooling to keep track of client accounts, please reach out to anyone on the Fil+ team.

spaceT9 commented 1 year ago

Just like the previous one, but with a JSON file.