filecoin-project / notary-governance


Notary Allocator Application: Kernelogic Fil+ Datacap Allocator #990

Closed: kernelogic closed this issue 6 months ago

kernelogic commented 10 months ago

v5 Notary Allocator Application

  1. Notary Allocator Pathway Name: Kernelogic Fil+ Datacap Allocator (KFDA)
  2. Organization Name: Kernelogic
  3. On-chain address for Allocator: f1yvo2nutvy6a4ortrvv2tguhpsi64a7fgrc42owy
  4. Country of Operation: Canada
  5. Region of Operation: North America, Asia
  6. Type of Allocator, diligence process: Automated and Market based
  7. DataCap requested for allocator for 12 months of activity: 250 PiB

mjroddy commented 10 months ago

Hey @kernelogic, help me understand the new allocator system via this application.

Type of allocator, diligence process: Automated and Market based

What does this actually involve here? What part is automated and how is it market based?

What might the distribution of the 250 PiB look like?

kernelogic commented 10 months ago

Hi, the details will be submitted through the Airtable form, according to procedure. Then I guess the Fil+ team will share the info back to this issue. Stay tuned.

cryptoAmandaL commented 10 months ago

Hi @kernelogic, so you need to create a post on GitHub before applying for allocator, is that right? Thanks.

kernelogic commented 10 months ago

Yes, create an issue like this one. Then submit the Airtable form and reference the issue number.

The-Wayvy commented 10 months ago

Is it 250 PiB of physical capacity or 250 PiB of hash power?

kernelogic commented 10 months ago

250PiB of datacap. 2.5 EiB of QAP. It is just an estimate.

MegaFil commented 10 months ago

It sounds like carrying out fraud used to require the cooperation of at least two allocators, but in the future it will only require one.

ghost commented 8 months ago

Providing the information shared in Application for Public Review/Comment.

Basic Info

1. Notary Allocator Pathway Name This can be your name, or the name of your pathway/program. For example "E-Fil+" • Kernelogic Fil+ Datacap Allocator

2. Organization • Kernelogic

3. On-chain address for Allocator Provide a NEW unique address. During ratification, you will need to initialize this address on-chain. • f1yvo2nutvy6a4ortrvv2tguhpsi64a7fgrc42owy

4. Country of Operation Where your organization is legally based. • Canada

5. Region of Operation What region will you serve? • North America

6. Type Of Allocator What is your overall diligence process? Automated (programmatic), Market-based, or Manual (human-in-the-loop at some phase). Initial allocations to these pathways will be capped. • Automatic

7. DataCap requested for allocator for 12 months of activity This should be an estimate of overall expected activity. Estimate the total amount of DataCap you will be distributing to clients in 12 months, in TiB or PiB. ▪ 250 PiB

8. Is your allocator providing a unique, new, or diverse pathway to DataCap? How does this allocator differentiate itself from other applicants, new or existing? Please explain the unique aspects of your proposed pathway.

• KFDA's approach includes the following key features:
  1. Automation: Eliminates the need for manual verification.
  2. Permissionless: Does not require any specific entity or geographic location, though the latter is encouraged.
  3. Trustless: Omits manual notary proposal or approval steps.
  4. Network Growth: Aims for a KPI of 50 PiB per month in datacap allocation and onboarding.
  5. Paying Client Requirement: Clients must pay in Fil to obtain datacap, aligning with one of FF's important objectives, albeit through a different approach.

9. As a member in the Filecoin Community, I acknowledge that I must adhere to the Community Code of Conduct, as well other End User License Agreements for accessing various tools and services, such as GitHub and Slack. Additionally, I will adhere to all local & regional laws & regulations that may relate to my role as a business partner, organization, notary, or other operating entity. You can read the Filecoin Code of Conduct here: https://github.com/filecoin-project/community/blob/master/CODE_OF_CONDUCT.md

• Acknowledge

ghost commented 8 months ago

Client Diligence

10. Who are your target clients? (Options: Individuals learning about Filecoin; Small-scale developers or data owners; Enterprise Data Clients; Other (specified above)) • Enterprise Data Clients

11. Describe in as much detail as possible how you will perform due diligence on clients.
• KFDA aims to streamline the current, labor-intensive manual process by automating as much as possible in a permissionless manner.
• Upon registration with KFDA, clients are required to verify their email address (social login is also an option).
• Clients then submit a dataset onboarding request, which can be an AWS open dataset or their private dataset. Provided KFDA has access to the metadata (file list and size), it will conduct automated content verification throughout the lifecycle of the DC distribution.
• Pricing starts at a basic rate, but significant discounts are offered for various decentralization characteristics of clients to encourage datacap acquisition. These discounts, which can be combined, include:
  • 5% off for using an AWS public dataset.
  • 5% off if the dataset is not excessively popular.
  • 5% off for collaborating with SPs utilizing green energy.
  • 5% off for working with new SPs.
  • 5% off for working with established SPs.
  • 5% off for collaborating with SPs in specific regions.
  • 5% off for working with SPs that offer higher retrieval speeds.
  • 5% off for repeat clients.

12. Please specify how many questions you’ll ask, and provide a brief overview of the questions. If you have a form, template, or other existing resource, provide the link. • KFDA, emphasizing efficiency, will ask clients only a few key questions: the dataset name, the owner, their role, the industry, and to upload the dataset metadata. This last requirement is waived if it's an S3 open dataset.

13. Will you use a 3rd-party "Know your client" (KYC) service? Provide as much detail about the service, what questions and regions they cover, and how the data will be integrated with your pathway. • No. Only email verification or social login. KFDA wants to focus on being permissionless.

14. Can any client apply to your pathway, or will you be closed to only your own internal clients? (eg: bizdev or self-referral) For example, a 'white label' service where you apply directly on behalf of your clients. • KFDA is open to any clients who wish to onboard data that is publicly retrievable and verifiable.

15. How do you plan to track the rate at which DataCap is being distributed to your clients? • The rate itself is not tracked. To prevent hoarding, clients who pass the checks must pay Fil to receive a DC allocation.

ghost commented 8 months ago

Data Diligence

16. As an operating entity in the Filecoin Community, you are required to follow all local & regional regulations relating to any data, digital and otherwise. This may include PII and data deletion requirements, as well as the storing, transmitting, or accessing of data.  • acknowledge

17. What type(s) of data would be applicable for your pathway? • Public Open Dataset

18. How will you verify a client’s data ownership? Will you use 3rd-party KYB (know your business) service to verify enterprise clients? Provide as much detail as possible about how you will confirm the provenance of claimed data, such as the type of KYB service. • KFDA operates on a completely permissionless basis, placing the legal responsibility on the client.

19. How will you ensure the data meets local & regional legal requirements? • KFDA operates on a completely permissionless basis, placing the legal responsibility on the client.

20. What types of data preparation will you support or require? Will you provide specific data prep services or integrations? Will you support or require things like sharding of data? • Data must not be archived inside sectors. Kernelogic will offer best-effort consultation for onboarding processes.

21. What tools or methodology will you use to sample and verify the data aligns with your pathway? How will you confirm the data matches what a client claims, both in type and preparation requirements? How will you prevent sector-size abuse, such as sector padding? • The process involves comparing the data with its metadata, including size, to ensure that the sectors store exactly what they claim to. This comparison aims to keep padding within a reasonable percentage and to prevent duplication and inflation of the data.

ghost commented 8 months ago

Data Distribution

22. How many replicas will you require to meet programmatic requirements for distribution? • 2 +

23. What geographic or regional distribution will you require? • There are no obligatory conditions for clients. However, clients desiring lower costs can benefit from more discounts by increasing their data distribution. Additionally, KFDA plans to organize events offering significant discounts for specific regions to encourage the development of regional Storage Providers (SPs). For instance, to promote more African SPs, KFDA could offer a substantial discount to clients working with SPs in that region.

24. How many Storage Provider owner/operators will you require to meet programmatic requirements for distribution? • 2 +
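
The "2 +" answers to questions 22 and 24 can be read as a programmatic check. The following is a minimal sketch under stated assumptions, not KFDA's actual tooling: the deal tuple layout, the function name `check_distribution`, and the reading of "replicas" as distinct providers per piece are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Thresholds taken from the "2 +" answers above.
MIN_REPLICAS = 2
MIN_SP_OWNERS = 2

# Hypothetical deal record: (piece_cid, provider_id, owner_id).
Deal = Tuple[str, str, str]


def check_distribution(deals: Iterable[Deal]) -> Tuple[List[str], bool]:
    """Return (under-replicated piece CIDs, whether enough distinct SP owners are used)."""
    providers_by_piece: Dict[str, Set[str]] = defaultdict(set)
    owners: Set[str] = set()
    for piece_cid, provider_id, owner_id in deals:
        providers_by_piece[piece_cid].add(provider_id)
        owners.add(owner_id)
    # Assumption: a replica counts only once per distinct provider.
    under_replicated = sorted(
        piece
        for piece, providers in providers_by_piece.items()
        if len(providers) < MIN_REPLICAS
    )
    return under_replicated, len(owners) >= MIN_SP_OWNERS


deals = [
    ("piece-1", "provider-a", "owner-a"),
    ("piece-1", "provider-b", "owner-b"),
    ("piece-2", "provider-a", "owner-a"),
]
under_replicated, enough_owners = check_distribution(deals)
print(under_replicated, enough_owners)
```

In this example `piece-2` is stored with only one provider, so it is reported as under-replicated even though the client works with two distinct owners overall.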

25. Do you require equal percentage distribution for your clients to their chosen SPs? Will you require preliminary SP distribution plans from the client before allocating any DataCap? • KFDA operates on a market-based adjustment principle without stringent requirements. Clients can voluntarily opt to work with green energy Storage Providers (SPs) and indicate this preference by providing supporting materials. This approach allows clients to align with sustainable practices and potentially benefit from associated discounts.

26. What tooling will you use to verify client deal-making distribution? For example, the existing datacapstats.io tooling and/or the CID checker bot • KFDA has the ability to develop its own CID checker bot. However if existing bots have good enough APIs KFDA might opt to utilize existing bots.

27. How will clients meet SP distribution requirements? Will you use software (such as a data-clearinghouse) to programmatically choose and distribute data from clients to SPs? • No requirement. Market (discount) based adjustment.

28. As an allocator, do you support clients that engage in deal-making with SPs utilizing a VPN? • Yes. Kernelogic has consistently expressed support for the reasonable use of VPNs, recognizing them as an integral part of enterprise firewall practices.

ghost commented 8 months ago

DataCap Allocation Strategy

29. Will you use standardized DataCap allocations to clients? Either in a single size (eg only ever giving 64GiB) or standard scale • No

30. Allocation Tranche Schedule to clients: • First: 100 TiB minimal • Second: 300 TiB maximal • Third: 1 PiB maximal • Fourth: 2 PiB maximal • Max per client overall: No limit as long as the metadata fits replication goal.

31. Will you use programmatic or software based allocations? • Yes

32. What tooling will you use to construct messages and send allocations to clients? For example, the existing notary registry tooling at https://filplus.fil.org/#/ • filplus.storage for initial application intake. Lotus binary to issue datacap. Lassie to verify data content.


33. Describe the process for granting additional DataCap to previously verified clients. Will you use certain criteria for starting a subsequent allocation request, such as the percentage of DataCap remaining? Will you use open-source tooling such as the Subsequent Allocation (SA) bot, or other automated tooling? • KFDA possesses the capability to develop its own DataCap (DC) request calculator. However, if existing bots offer sufficiently robust APIs, KFDA might choose to utilize these existing bots instead. Clients are provided with the flexibility to upload metadata information in tranches. This approach ensures that clients are not required to prepare all the CARs at the outset. They have the liberty to decide the size of the next tranche, adopting an à la carte style approach to data management and submission.

ghost commented 8 months ago

Tools And Bookkeeping

34. Describe in as much detail as possible the tools used for: • client discoverability & applications • due diligence & investigation • bookkeeping • on-chain message construction • client deal-making behavior • tracking overall allocator health • dispute discussion & resolution • community updates & comms Address all the tools & software platforms in your process. • KFDA is developing a comprehensive multi-tenant web platform, with a prototype already in place. This platform will provide each client with a dedicated workspace where they can invite collaborators, create a dataset onboarding plan, upload metadata, and monitor the data verification process. Additionally, it will enable clients to apply for subsequent allocation tranches and submit disputes, among other functionalities, thereby streamlining the data management and verification process for each client.

35. Will you use open-source tooling from the Fil+ team? How much and which tools will you utilize? If not, please specify the tools you'll use and explain the process (such as GitHub repo, Google spreadsheet, MongoDB). • The multi-tenant web platform will be built with NodeJS and Postgres DB; Mongo DB will also be used to store metadata. The Lotus binary will be used to check balances, monitor payment transactions, and issue datacap. Lassie will be used to perform retrieval sampling.

36. Where will you keep your records for bookkeeping? How will you maintain transparency in your allocation decisions? • Clients will be able to view all decision-making logs in the KFDA platform. Upon request, decision-making logs can also be exported and provided to the Fil+ team. For transparency, a public page will also show the total datasets onboarded and total DC distributed.

ghost commented 8 months ago

Risk Mitigation, Auditing, Compliance

37. Describe your proposed compliance check mechanisms for your own clients.  • KFDA plans to set up servers in multiple regions to conduct retrieval sampling. This process involves inspecting the content of sectors to determine the files present and their sizes, and then comparing this information with the metadata previously uploaded by clients before allocation. Clients who comply with the system and exhibit good behavior will receive substantial discounts on DataCap (DC) allocation fees, in accordance with predefined rules. This approach essentially makes the process market-based. In cases where the onboarded data does not match the provided metadata or is completely unretrievable, KFDA will pause further allocations. Resolution will require a manual review, involving a consensus-reaching dialogue between the clients and KFDA operators. This measure ensures data integrity and compliance with KFDA standards.
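
The metadata comparison described in this answer can be sketched as follows. This is an illustration under stated assumptions rather than KFDA's implementation: the names `SampleReport` and `check_sample` are hypothetical, the inputs are plain dictionaries (no retrieval client such as Lassie is actually called), and the pause rule simply encodes "does not match the metadata, or is completely unretrievable" from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SampleReport:
    matched: List[str] = field(default_factory=list)
    size_mismatch: List[str] = field(default_factory=list)
    undeclared: List[str] = field(default_factory=list)    # e.g. padding files
    unretrievable: List[str] = field(default_factory=list)

    @property
    def pause_allocations(self) -> bool:
        """Pause on any metadata mismatch, or when every sampled file failed to retrieve."""
        sampled = (
            len(self.matched)
            + len(self.size_mismatch)
            + len(self.undeclared)
            + len(self.unretrievable)
        )
        all_failed = sampled > 0 and len(self.unretrievable) == sampled
        return bool(self.size_mismatch or self.undeclared) or all_failed


def check_sample(
    declared: Dict[str, int],           # path -> size in bytes, from client metadata
    sampled: Dict[str, Optional[int]],  # path -> retrieved size, None if retrieval failed
) -> SampleReport:
    report = SampleReport()
    for path, size in sampled.items():
        if size is None:
            report.unretrievable.append(path)
        elif path not in declared:
            report.undeclared.append(path)
        elif declared[path] != size:
            report.size_mismatch.append(path)
        else:
            report.matched.append(path)
    return report


declared = {"data/a.csv": 1024, "data/b.csv": 2048}
report = check_sample(declared, {"data/a.csv": 1024, "padding.bin": 10_000})
print(report.undeclared, report.pause_allocations)
```

A partial retrieval failure does not trigger a pause in this sketch; whether it should is a policy choice the application leaves open.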

38. Describe your process for handling disputes. Highlight response times, transparency, and accountability mechanisms. • KFDA will maintain a GitHub repository specifically for submitting dispute issues. In the event of a dispute, KFDA commits to providing a complete audit trail of its decision-making process and is open to constructive suggestions to enhance its verification procedures. Furthermore, in cases where a datacap revocation occurs by the FIL+ team, KFDA will ensure a refund of Fil for the unused portion of the datacap. This policy demonstrates KFDA's commitment to transparency and accountability in its operations and its responsiveness to the community's needs and feedback.

39. Detail how you will announce updates to tooling, pathway guidelines, parameters, and process alterations. • KFDA is considering the possibility of open-sourcing the entire platform once it reaches a sufficient level of maturity. This move would allow more allocators to utilize the platform, potentially broadening its impact and utility in the data management and allocation space. Additionally, KFDA is committed to actively listening to constructive feedback. This includes suggestions for improving various aspects of the platform, such as rules, verification processes, discount structures, and more, to better align with market sentiments and needs. This approach demonstrates KFDA's dedication to continuous improvement and responsiveness to user feedback.

40. How long will you allow the community to provide feedback before implementing changes? • Dedicated KFDA GitHub repo to collect advice and disputes. Slack channel on the Filecoin Slack for support. WeChat group for the Chinese-speaking community.

41. Regarding security, how will you structure and secure the on-chain notary address? If you will utilize a multisig, how will it be structured? Who will have administrative & signatory rights? • Given the automated nature of KFDA, it will employ standard IT DevSecOps practices to ensure the security of the notary address. Fei Yan, the creator of KFDA, brings a wealth of experience in DevSecOps from his years working in large enterprise environments within the insurance and government industries in Canada. This background equips him with the necessary expertise to implement robust security measures and maintain the integrity of KFDA's operations.

42. Will you deploy smart contracts for program or policy procedures? If so, how will you track and fund them? • Not initially. However, in a later phase, there are plans to introduce a KFD token. This token would be available to qualified clients, who could exchange it for datacap. Furthermore, the KFD token is expected to be listed on various decentralized finance (DeFi) exchanges, allowing the market to determine its value. This strategy indicates a move towards integrating blockchain-based incentives and market mechanisms within the KFDA ecosystem.

ghost commented 8 months ago

Monetization

43. Outline your monetization models for the services you provide as a notary allocator pathway. • When a client is prepared to receive a tranche of datacap and their data has successfully passed verification, they can proceed to pay for the datacap in Fil, with a set price per TiB. Once the payment transaction is detected, the datacap is automatically issued to the client. The starting price for this service is set at 0.5 Fil per TiB. This automated process ensures efficiency and transparency in the allocation of datacap to clients.

Various discounts can be applied to the datacap price, such as:
• Dataset is an AWS public dataset: 5% off
• Dataset is not overly crowded: 5% off
• Client is working with SPs using green energy: 5% off
• Client is working with new SPs: 5% off
• Client is working with established SPs: 5% off
• Client is working with SPs in certain regions: 5% off
• Client is working with SPs with higher retrieval speed: 5% off
• Repeat client: 5% off

Therefore, if a client qualifies for all of the above, the combined 40% discount brings the price down to 0.3 Fil / TiB. More discounts are being explored.
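
The pricing arithmetic above (0.5 Fil per TiB, eight combinable 5% discounts, 0.3 Fil per TiB at the full 40%) can be worked through in a short sketch. The criterion identifiers and the function name `price_per_tib` are hypothetical, and the discounts are assumed to add rather than compound, which is what the 40% figure implies.

```python
from decimal import Decimal
from typing import Iterable

BASE_PRICE_FIL_PER_TIB = Decimal("0.5")
DISCOUNT_PER_CRITERION = Decimal("0.05")

# Hypothetical identifiers for the eight discounts listed in the application.
DISCOUNT_CRITERIA = (
    "aws_public_dataset",
    "dataset_not_overly_crowded",
    "green_energy_sps",
    "new_sps",
    "established_sps",
    "sps_in_target_region",
    "high_retrieval_speed_sps",
    "repeat_client",
)


def price_per_tib(criteria_met: Iterable[str]) -> Decimal:
    """Price in Fil per TiB after additive 5% discounts (assumed, not compounded)."""
    met = set(criteria_met)
    unknown = met - set(DISCOUNT_CRITERIA)
    if unknown:
        raise ValueError(f"unknown discount criteria: {sorted(unknown)}")
    total_discount = DISCOUNT_PER_CRITERION * len(met)
    return BASE_PRICE_FIL_PER_TIB * (1 - total_discount)


full_discount_price = price_per_tib(DISCOUNT_CRITERIA)
print(full_discount_price)          # 0.300 Fil per TiB
print(full_discount_price * 100)    # cost in Fil of a 100 TiB tranche
```

Using `Decimal` keeps the Fil amounts exact; with binary floats the 40% case would not come out to exactly 0.3.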

44. Describe your organization's structure, such as the legal entity and other business & market ventures. • Sole corporation: Kernelogic Software Inc, registered in Canada.

45. Where will accounting for fees be maintained? • N/A

ghost commented 8 months ago

46. If you've received DataCap allocation privileges before, please link to prior notary applications. • https://github.com/filecoin-project/notary-governance/issues/658

47. How are you connected to the Filecoin ecosystem? Describe your (or your organization's) Filecoin relationships, investments, or ownership.
• As a notary, Kernelogic has allocated over 250 PiB of datacap to clients, ranking #1 on the leaderboard.
• Kernelogic participated in Space Race and Slingshot 1 and 2.
• Kernelogic helped the Fil+ team analyze all AWS public datasets for Slingshot v3.
• Kernelogic was one of the two main developers of Singularity V1.
• As a client, Kernelogic has onboarded over 100 PiB of public datasets to the Filecoin network.
• Two-time dev grant recipient: MusicNFT, Singularity.
• Humans of web3 interview: https://mirror.xyz/joinradius.eth/vwmrtaEcas5Ei3Ff6kNor6FB2i9zTvp0SMRuxOv6l_A
• Filecoin blog mention: https://filecoin.io/blog/posts/large-datasets-kernelogic/

48. How are you estimating your client demand and pathway usage? Do you have existing clients and an onboarding funnel? • Kernelogic has allocated over 250 PiB of datacap as a notary. On that basis, KFDA estimates it will allocate another 250 PiB of datacap in its first year.

galen-mcandrew commented 6 months ago

Datacap Request

Address

f1yvo2nutvy6a4ortrvv2tguhpsi64a7fgrc42owy

Datacap Allocated

5PiB

filplus-bot commented 6 months ago

The request has been signed by a new Root Key Holder

Message sent to Filecoin Network

bafy2bzacedqtupcd6slqasvuyf645d2swugyjqumyw62dzbvxlb4tgrfpv4hi

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedqtupcd6slqasvuyf645d2swugyjqumyw62dzbvxlb4tgrfpv4hi