NateWebb03 / FilTestRepo

A test repository for allocator application automation

Test app 1083 #1085

Open NateWebb03 opened 5 months ago

NateWebb03 commented 5 months ago

Notary Allocator Pathway Name:

(StorSwift) Manual

Organization:

StorSwift

Allocator's On-chain address:

f1id4vdnfmhic5lgf33jpwdugqphso7kmauc2w4va

Country of Operation:

Singapore

Region(s) of operation:

Africa, Asia minus GCR, Greater China, Europe, Oceania, Japan, North America, South America, Other

Type of allocator: What is your overall diligence process? Automated (programmatic), Market-based, or Manual (human-in-the-loop at some phase). Initial allocations to these pathways will be capped.

Manual

Amount of DataCap Requested for allocator for 12 months:

100PiB

Is your allocator providing a unique, new, or diverse pathway to DataCap? How does this allocator differentiate itself from other applicants, new or existing?

As a V4 notary, we were deeply engaged in the Fil+ program and attended almost every governance call last year, and we witnessed the chaos throughout the year. The urgent task is to make Fil+ more permissionless, but fully automating the process is hard to implement in the current phase. We therefore plan to combine manual due diligence with automated tooling to make the allocation process as objective as possible. The rules will be tightened to reduce fraud and cheating. The critical changes are listed below:

  1. The GitHub account and official website domain used to submit an application should be at least 3 months old.
  2. The Fil+ Registration Form https://form.jotform.com/231862753015656 should be added to the application template as a mandatory task.
  3. A third-party KYC provider shall be used to confirm the identity of clients. If a dispute is raised against SPs, KYC applies to those SPs as well.
  4. Auto-assign 3 or more allocators to sign one tranche of an application. Each allocator may sign the same application only once. This raises the bar for collusion and reduces cheating and self-dealing.
  5. Voting tooling is adopted for community governance. StorSwift developed Power Voting, which can initiate a vote on a disputed issue. Everybody can cast a vote once a dispute proposal is submitted, which is fair and transparent. Besides, as a technical infrastructure provider, we do not rule out deploying smart contracts to automate the process completely in the future.
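
The auto-assignment rule in item 4 could be sketched as follows. This is only an illustration under stated assumptions: the function name, the allocator identifiers, and the use of uniform random selection are hypothetical, since the application does not say how allocators would be picked.

```python
import random

def assign_allocators(pool, already_signed, k=3, rng=None):
    """Pick k distinct allocators who have not yet signed this application.

    pool and already_signed are collections of allocator identifiers
    (hypothetical names). Random selection is an assumption, not a rule
    taken from the application.
    """
    rng = rng or random.Random()
    # An allocator may sign the same application only once.
    eligible = sorted(set(pool) - set(already_signed))
    if len(eligible) < k:
        raise ValueError("not enough eligible allocators for this tranche")
    return rng.sample(eligible, k)
```

Calling `assign_allocators(["a1", "a2", "a3", "a4", "a5"], ["a2"])` returns three distinct identifiers, none of which is `"a2"`.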

As a member of the Filecoin Community, I acknowledge that I must adhere to the Community Code of Conduct, as well as other End User License Agreements for accessing various tools and services, such as GitHub and Slack. Additionally, I will adhere to all local & regional laws & regulations that may relate to my role as a business partner, organization, notary, or other operating entity. * You can read the Filecoin Code of Conduct here: https://github.com/filecoin-project/community/blob/master/CODE_OF_CONDUCT.md

Acknowledgment: Acknowledge

Client Diligence Section:

This section pertains to client diligence processes.

Who are your target clients?

Enterprise Data Clients, Small-scale developers or data owners

Describe in as much detail as possible how you will perform due diligence on clients.

  1. For returning clients, an application link with a good history speaks for itself.
  2. In general, in addition to the information provided in the GitHub application, the Fil+ Registration Form is mandatory.
  3. Third-party KYC providers, including Toggle and Veriff, are recommended for client identity verification. If those providers cannot cover all regions, clients may use other KYC tools, such as Qichacha and Tianyancha for Chinese users. Third-party verification is more authoritative and objective, and it makes the due diligence process much easier and more transparent.
  4. Additionally, we will pay special attention to the SP distribution plan (particularly VPN use and official website), dataset ownership, and dataset size. We will therefore ask the questions listed below to clarify the details of the application.
  5. To deter identity theft, a punishment scheme should be prepared. If someone uses a fake identity or pretends to be someone else, that entity or individual should be removed from the Filecoin ecosystem permanently.

Please specify how many questions you'll ask, and provide a brief overview of the questions.

Because the basic information is already included in the application template, we list 12 practical questions for clients below:

  1. Why do you store your data on Filecoin?
  2. If the KYC provider does not cover your region, will you provide other legal identity materials, such as a business license or identity card, to prove your identity?
  3. Can you describe the relationship between your business operations and the dataset?
  4. How large is your dataset? How can you prove it?
  5. If the dataset was previously stored by someone else, what will you do?
  6. Do you have a required retrieval success rate for SPs?
  7. Do you have a preference for retrieval methods, such as HTTP?
  8. Will there be a deal contract between the client and SPs? Is there a back-up plan if SPs do not follow the terms agreed in the contract?
  9. Will you allow SPs to use a VPN? Why?
  10. If your stored dataset accidentally violates the rules and regulations of your country, will you take full responsibility?
  11. If the CID report and retrieval rate of the first tranche are not as expected and the allocator refuses to sign the next allocation, what would you do?
  12. How will you handle disputes raised against you?

Will you use a 3rd-party "Know your client" (KYC) service?

Toggle and Veriff are good choices because they operate globally, and we hope these two providers can cover all countries. Clients can choose either of them. If neither covers the client's region, a business license, entity number, or utility bills are also acceptable ways to prove the identity of an organization. Clients can also provide KYC proof from another third-party provider, such as Qichacha or Tianyancha for Chinese businesses. Other allocators from the same region will help evaluate the validity of the proof. Screenshots and links of KYC results, or the proof files, will be shared on GitHub so that everyone can access them.

Can any client apply to your pathway, or will you be closed to only your own internal clients? (eg: bizdev or self-referral)

Any client can apply. We can offer help and guidance on how to apply for DataCap where necessary.

How do you plan to track the rate at which DataCap is being distributed to your clients?

We will track the rate and performance in real time with the help of the CID checker, SA bot, and A/C bot. When the remaining DataCap falls below 25%, the SA bot will trigger the request for the next allocation, so the distribution rate follows actual usage. In the past year, some applicants were granted DataCap but did not use it for a long time. To avoid this kind of DataCap waste, we recommend setting a deadline, such as 40 days, for the client to use the DataCap. Once the deadline passes, the A/C bot will remove the unused DataCap without exception.
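
As a rough illustration of the two rules above (refill below 25% remaining, removal after a 40-day window), the check could look like the sketch below. The `Tranche` type, the function name, and the returned labels are hypothetical, and giving the refill rule priority over the deadline rule is an assumption; the SA bot and A/C bot may work differently.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REFILL_THRESHOLD = 0.25   # from the text: next allocation when under 25% remains
USE_DEADLINE_DAYS = 40    # from the text: suggested window for using a tranche

@dataclass
class Tranche:
    granted_tib: float
    remaining_tib: float
    granted_on: date

def next_action(tranche: Tranche, today: date) -> str:
    """Return a hypothetical action label for one tranche."""
    if tranche.granted_tib <= 0:
        raise ValueError("granted_tib must be positive")
    # Assumption: an almost-used tranche is refilled even if the deadline passed.
    if tranche.remaining_tib / tranche.granted_tib < REFILL_THRESHOLD:
        return "request_next_tranche"
    if today > tranche.granted_on + timedelta(days=USE_DEADLINE_DAYS):
        return "remove_unused_datacap"
    return "keep_monitoring"
```

For example, a 100 TiB tranche with 80 TiB still unused 60 days after it was granted would be flagged for removal, while the same tranche with only 20 TiB left would trigger the next request.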

Data Diligence

This section will cover the types of data that you expect to notarize.

As a reminder: The Filecoin Plus program defines quality data as all content that meets local regulatory requirements AND
• the data owner wants to see on the network, including private/encrypted data
• or is open and retrievable
• or demonstrates proof of concept or utility of the network, such as efforts to improve onboarding

As an operating entity in the Filecoin Community, you are required to follow all local & regional regulations relating to any data, digital and otherwise. This may include PII and data deletion requirements, as well as the storing, transmitting, or accessing of data.

Acknowledgement: Acknowledge

What type(s) of data would be applicable for your pathway?

Public Open Dataset (Research/Non-Profit), Public Open Commercial/Enterprise

How will you verify a client's data ownership? Will you use 3rd-party KYB (know your business) service to verify enterprise clients?

We focus on public datasets, which are generally less sensitive. Based on the third-party KYC results and the company introduction in the application itself, we can determine the business scope of the company and work out the relationship between its business operations and the dataset claimed in the application. If data ownership still cannot be confirmed this way, we can schedule a meeting between the client and the allocator to clarify the open questions; a Fil+ governance team member can be invited as well for transparency. The link to the meeting recording will be shared and saved as proof on GitHub. If the dataset is not generated by the client's own business or is highly sensitive, a legal letter of authorization is needed before signing.

How will you ensure the data meets local & regional legal requirements?

First, we have been working with Filecoin storage since 2020. We know the rules and guidelines well and were never involved in a dispute as a V4 notary. In addition, we have legal and compliance staff who handle all legal affairs. For any behavior or action that may breach laws or regulations, we will consult them first to avoid unnecessary trouble.

What types of data preparation will you support or require?

Technical clients should prepare CAR files for sealing; 32G and 64G are ideal file sizes. Non-technical clients can seek help from a sealing-as-a-service provider. We can also provide training and educational materials for clients if necessary.

What tools or methodology will you use to sample and verify the data aligns with your pathway?

One-off retrieval is used to confirm whether the stored dataset is the same as described in the application. The retrieval test should be done on a regular basis. In addition, the dataset should be greater than 60% of the sector size to reduce sector abuse.
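
The 60% rule is a simple threshold and could be checked as in this sketch. The function name and the default 32 GiB sector size are assumptions for illustration; the application does not say how the fill ratio would be measured.

```python
GIB = 1024 ** 3

def fills_sector(payload_bytes: int, sector_bytes: int = 32 * GIB,
                 min_fill_percent: int = 60) -> bool:
    """True if the payload is strictly greater than min_fill_percent of the sector."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return payload_bytes * 100 > min_fill_percent * sector_bytes
```

With these defaults, a 20 GiB payload in a 32 GiB sector (62.5%) passes, while a 16 GiB payload (50%) does not.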

Data Distribution

This section covers deal-making and data distribution.

As a reminder, the Filecoin Plus program currently defines distributed onboarding as multiple physical locations AND multiple storage provider entities to serve client requirements.

Recommended Minimum: 3 locations, 4 to 5 storage providers, 5 copies

How many replicas will you require to meet programmatic requirements for distribution?

5+

What geographic or regional distribution will you require?

3 continents at least

How many Storage Provider owner/operators will you require to meet programmatic requirements for distribution?

5+

Do you require equal percentage distribution for your clients to their chosen SPs? Will you require preliminary SP distribution plans from the client before allocating any DataCap?

An SP template will be used to ensure a smooth sealing process and compliance. Clients have the right to distribute DataCap to their chosen SPs at their own discretion, as long as they follow the guidelines issued by the Fil+ governance team. However, no single SP may take more than 25% of the total deal. SP distribution plans are required from the client. SP IDs, SP location, SP organization, total amount of DataCap received, retrieval rate, and VPN usage are important metrics for determining a reasonable percentage for each SP. The template link is https://docs.google.com/spreadsheets/d/1F5NfzLm1JP59VMe_hKh7jDDY3wySaQwo/edit#gid=1300337083. After signing off the DataCap, we will keep monitoring whether the storage is carried out as planned. If not, we will ask the client to make modifications.
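
A distribution plan could be checked against the stated limits (5+ SP owner/operators, at least 3 continents, no SP above 25%) roughly as sketched below. The `SPEntry` fields and the function name are hypothetical and do not reflect the actual columns of the spreadsheet template; the requirement that shares sum to 100% is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class SPEntry:
    sp_id: str          # hypothetical miner ID
    organization: str
    continent: str
    share_percent: int  # planned share of the total deal

def check_plan(plan, min_orgs=5, min_continents=3, max_share_percent=25):
    """Return a list of human-readable problems; an empty list means the plan passes."""
    problems = []
    if len({e.organization for e in plan}) < min_orgs:
        problems.append(f"fewer than {min_orgs} distinct SP organizations")
    if len({e.continent for e in plan}) < min_continents:
        problems.append(f"fewer than {min_continents} continents")
    for e in plan:
        if e.share_percent > max_share_percent:
            problems.append(f"{e.sp_id} exceeds {max_share_percent}% of the total deal")
    # Assumption: the plan should account for the whole deal.
    if sum(e.share_percent for e in plan) != 100:
        problems.append("shares do not sum to 100%")
    return problems
```

An empty result only means the plan meets the numeric limits; retrieval rate and VPN usage still need manual review.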

What tooling will you use to verify client deal-making distribution?

datacapstats.io and the CID checker will be used to help clients verify SP performance. We can help clients learn how to use the CID checker and read the relevant metrics. Additionally, datacapstats.io is very useful for clients to track their SPs. The website provides basic information, but there is still room for improvement. We suggest that different roles, including clients, SPs, and allocators, be able to log in with their addresses and see the deal-making details for their own addresses, just like a wallet. datacapstats.io should be an all-in-one website where every role can manage its business. The statistics could also be more graphical and intuitive, and the data displayed on the website should be in real time.

How will clients meet SP distribution requirements?

It is a manual process. When submitting an application on GitHub, the SP distribution plan and SP details are required. Clients determine the distribution plan themselves. For new clients, we can offer a list of reputable SPs or point them to websites and channels for finding qualified SPs, such as Filswan or Slack channels. In addition, as a technical infrastructure provider, we are always looking for innovative ways to solve the distribution and allocation dilemma. If other tooling or APIs become available, we are very open to adopting them. We can even help develop and optimize the tooling to make it a better fit.

As an allocator, do you support clients that engage in deal-making with SPs utilizing a VPN?

A few countries block access at the government level, so using a VPN is unavoidable for carrying out blockchain business there. Most participants in the Filecoin system are from Asia, and banning VPNs outright could harm the growth of the entire ecosystem, so VPNs cannot be banned completely. We support reasonable use of VPNs. However, some SPs have been found faking their locations, which is not allowed. As far as we know, there is no effective way to detect VPN fraud; if such a tool exists, we are willing to use it. In the SP distribution plan, the client is required to fill in information about each SP's VPN usage. If an SP is found cheating on VPN usage, it will be removed from the Filecoin system.

DataCap Allocation Strategy

In this section, you will explain your client DataCap allocation strategy.

Keep in mind the program principle over Limited Trust Over Time. Parties, such as clients, start with a limited amount of trust and power. Additional trust and power need to be earned over time through good-faith execution of their responsibilities and transparency of their actions.

Will you use standardized DataCap allocations to clients?

Yes, standardized

Allocation Tranche Schedule to clients:

• First: 5% of the total requested DataCap (50TiB at most for new client)
• Second: 200% of the first tranche
• Third: 200% of the previous tranche
• Fourth: 200% of the previous tranche
• Max per client overall: 200% of the previous tranche (The total not greater than 5PiB)
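
The schedule above can be worked through numerically. The sketch below is one possible reading, not the allocator's actual procedure: it assumes doubling continues past the fourth tranche until the requested amount or the 5PiB cap is reached, and that the last tranche is trimmed to fit. The function name and parameters are hypothetical.

```python
def tranche_schedule(requested_tib, new_client=True, max_total_tib=5 * 1024):
    """Return tranche sizes in TiB under the assumptions stated above."""
    target = min(requested_tib, max_total_tib)   # total not greater than 5 PiB
    size = requested_tib / 20                    # first tranche: 5% of the request
    if new_client:
        size = min(size, 50)                     # 50 TiB at most for a new client
    tranches, granted = [], 0
    while granted < target:
        current = min(size, target - granted)    # trim the final tranche to fit
        tranches.append(current)
        granted += current
        size *= 2                                # 200% of the previous tranche
    return tranches
```

For a new client requesting 1000 TiB this yields 50, 100, 200, 400 and a final 250 TiB, which sums to the requested amount.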

Will you use programmatic or software based allocations?

No, manually calculated & determined

What tooling will you use to construct messages and send allocations to clients?

The Notary registry is our preferred tool to construct messages and send allocations. The registry could also be optimized to offer an API that serves as a tracker for every allocator, integrating the statistics from the Notary leaderboard and dataset.io/notary, so that each allocator can clearly see the history and manage the progress of every application signed.

Describe the process for granting additional DataCap to previously verified clients.

The SA bot is used to monitor DataCap usage; when the remaining DataCap is below 25%, it triggers the next tranche. At the same time, the A/C bot automatically checks the metrics. In line with the A/C bot proposal, we think it is a good idea to grant the subsequent allocation automatically if the CID report and retrieval performance meet requirements. If the report fails the requirements, manual due diligence and signing take over, and a detailed improvement plan is required before the additional DataCap is signed. A client with a bad history is offered only one chance; otherwise, the application will be closed.
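
The decision flow described above could be summarized as in the following sketch. The function, its parameters, and the returned labels are hypothetical, and reading "only one chance" as "close the application on a second failed report" is an interpretation, not something the application states explicitly.

```python
def subsequent_allocation(remaining_fraction, report_passed,
                          prior_failed_reports, has_improvement_plan):
    """Return a hypothetical next step for a previously verified client."""
    if remaining_fraction >= 0.25:
        return "wait"                        # SA bot has not triggered yet
    if report_passed:
        return "allocate_automatically"      # CID report and retrieval meet requirements
    if prior_failed_reports >= 1:
        return "close_application"           # interpretation of "only one chance"
    if not has_improvement_plan:
        return "request_improvement_plan"
    return "manual_review_and_sign"
```

For instance, `subsequent_allocation(0.10, False, 0, True)` returns `"manual_review_and_sign"`.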

Tooling & Bookkeeping

This program relies on many software tools in order to function. The Filecoin Foundation and PL have invested in many different elements of this end-to-end process, and will continue to make those tools open-sourced. Our goal is to increase adoption, and we will balance customization with efficiency.

This section will cover the various UX/UI tools for your pathway. You should think high-level (GitHub repo architecture) as well as tactical (specific bots and API endpoints).

Describe in as much detail as possible the tools used for:
• client discoverability & applications
• due diligence & investigation
• bookkeeping
• on-chain message construction
• client deal-making behavior
• tracking overall allocator health
• dispute discussion & resolution
• community updates & comms

Client discoverability & applications: Slack is the most popular way for participants to interact with each other. Clients will apply for DataCap on GitHub as before.

Due diligence & investigation: In addition to the basic information from the GitHub application, third-party KYC providers such as Veriff, Toggle, or Qichacha are available as options.

Bookkeeping: All applications and communication records are documented on GitHub, and anyone can access the records. If video conferences are held, they are archived on YouTube.

On-chain message construction: Mainly the Notary Registry, https://filplus.fil.org/#/ or https://filplus-registry.netlify.app/.

Client deal-making behavior: datacapstats.io, the CID checker, and retrieval tests are used to track deal activity.

Tracking overall allocator health: Every allocator creates its own online tracker (Google Form or GitHub) for allocation activities and makes it public. If possible, the notary registry could be extended to serve as an allocator tracker in the future, which would be a more unified way to manage this.

Dispute discussion & resolution: A new channel should be created only for dispute discussion. StorSwift has developed a voting tool named Power Voting, which we plan to use for dispute resolution. A proposal is created for a dispute with the details, proofs, and links, offering two or more solutions to vote on. Every community member can vote on a specific dispute. It is the easiest way to solve disputes. (The GitHub repo is https://github.com/black-domain/power-voting)

Community updates & comms: The allocator call should be held more often, and updates and progress should be made public in a timely manner. The bi-weekly notary call should be rescheduled to once a week. An #announcement channel should be created in Slack for all kinds of updates, so that community members do not need to search the whole Filecoin Slack for new updates.

Will you use open-source tooling from the Fil+ team?

We will use all tools from the Fil+ team: the GitHub repo, CID checker, retrieval bot, SA bot, datacapstats.io, notary registry, and A/C bot.

Where will you keep your records for bookkeeping? How will you maintain transparency in your allocation decisions?

All our allocation decisions and proofs will be recorded in a public GitHub repo. The records will include the application link, address, amount of DataCap signed, date of signing, tranche number, and the reason for signing. Sensitive client information sent privately will be saved separately in case a dispute arises from the allocation process. Contact us on Slack for more information; contacts are disclosed in the last part. We will run commands to view the CID report and do a retrieval test every 2 weeks to accelerate the sealing process.

Risk Mitigation, Auditing, Compliance

This framework ensures the responsible allocation of DataCap by conducting regular audits, enforcing strict compliance checks, and requiring allocators to maintain transparency and engage with the community. This approach safeguards the ecosystem, deters misuse, and upholds the commitment to a fair and accountable storage marketplace.

In addition to setting their own rules, each notary allocator will be responsible for managing compliance within their own pathway. You will need to audit your own clients, manage interventions (such as removing DataCap from clients and keeping records), and respond to disputes.

Describe your proposed compliance check mechanisms for your own clients.

We will create a tracker with every detail of the applications we sign, such as application link, organization, tranche number, date of signing, expected date of running out of DataCap, and the reason for signing. Regular check-ins: we will run commands to view the CID report and retrieval bot report to audit the DataCap distribution statistics at the end of each month. We will pay special attention to the expected date of running out of DataCap in the tracker, to detect any abnormalities and warn clients to start sealing as soon as possible. New clients will be granted a small amount of DataCap for their first application. Trust is earned over time: if they follow the rules, larger amounts of DataCap will be granted to them. If a new client makes an honest mistake, they still get one chance to explain themselves, but the DataCap distribution plan may be extended as a result.

Describe your process for handling disputes. Highlight response times, transparency, and accountability mechanisms.

The current online dispute tracker has turned out not to be a good way to solve disputes. Submitting a dispute on GitHub is not efficient, and processing it takes a lot of time and effort. Sometimes a dispute simply ends without a conclusion. StorSwift developed a voting tool called Power Voting. Voting is the most transparent way to solve disputes. Any community member who has a dispute or disagreement can submit a proposal, including the issue description, proof, links, and proposed solutions, for a vote. The Fil+ governance team will review the poll first and then set a voting deadline. When the poll ends, the issue is resolved as well. This works well for disputes in terms of both response time and transparency. The GitHub repo is https://github.com/black-domain/power-voting. The Power Voting dApp utilizes Drand Timelock, StorSwift ZK-KYC, and Subgraphs technologies to achieve fair and private voting. Before the voting deadline, no one's vote can be seen by others, so the voting process is not influenced by other participants' votes. After the voting deadline, anyone can count the votes in a decentralized manner, and the results of the count are executed and stored by a smart contract and cannot be manipulated by any centralized organization or individual.

Detail how you will announce updates to tooling, pathway guidelines, parameters, and process alterations.

For now, every moderator releases updates in their own channel on Slack, and people are allowed to chat there as well, so it is hard to find useful updates quickly and easily. We will therefore create an #announcement channel on Slack. All updates, whether small or critical, will be released in this channel on a weekly basis, and community members will not be allowed to talk in it. For critical changes to the allocation process, a detailed proposal shall be drafted, and the governance team and community members will evaluate its feasibility together. Once the proposal is approved by the community, we will discuss it in the governance call and then share the meeting minutes in the #announcement channel.

How long will you allow the community to provide feedback before implementing changes?

Two weeks for small changes and a month for critical ones (holidays not included) for collecting feedback. We will appoint one moderator (like Slack @Young) to collect relevant information and save it in a form named FAQs, then book a meeting with the governance team to discuss how to proceed with the updates. Whether a change is finally accepted or rejected, the reason will be shared publicly. Not all changes will be carried out. Discussion is welcome; new ideas help the ecosystem grow faster and better.

Regarding security, how will you structure and secure the on-chain notary address? If you will utilize a multisig, how will it be structured? Who will have administrative & signatory rights?

A Ledger is used to protect the address, as before. The allocator team is a group of 4. Three members are responsible for due diligence and for writing comments in our own tracker explaining why a specific application is signed. The fourth member does the final verification and signs the application. Only one of the 4 has administrative & signatory rights, to ensure the security of the address. The fewer people who have access, the lower the risk.

Will you deploy smart contracts for program or policy procedures? If so, how will you track and fund them?

As a technical infrastructure provider, we have long been considering writing smart contracts to bring order to the current situation. We have a professional technical team, and we have been deeply engaged in Fil+ since the beginning. If the Fil+ team has a need, we can book a meeting to work out the details of cooperation. Power Voting was built by our team; reviewers can take a look at it for reference.

Monetization

While the Filecoin Foundation and PL will continue to make investments into developing the program and open-sourcing tools, we are also striving to expand and encourage high levels of service and professionalism through these new Notary Allocator pathways. These pathways require increasingly complex tooling and auditing platforms, and we understand that Notaries (and the teams and organizations responsible) are making investments into building effective systems.

It is reasonable for teams building services in this marketplace to include monetization structures. Our primary guiding principles in this regard are transparency and equity. We require these monetization pathways to be clear, consistent, and auditable.

Outline your monetization models for the services you provide as a notary allocator pathway.

We are not going to adopt a staking & slashing system for clients, SPs, or allocators, because fees cannot reduce collusion and instead increase the possibility of bribery and self-dealing. We will prepare an incentive program for allocators only, to be held twice a year. Every allocator will be scored on performance, including the number of applications signed, the amount of DataCap granted, governance call attendance, and the number of disputes, and the top 10 will be awarded a certain amount of FIL or DataCap. The rubrics and prizes are subject to change; this is just an initial scheme.

Describe your organization's structure, such as the legal entity and other business & market ventures.

StorSwift is a Singapore-based technology company that specializes in Web3 infrastructure solutions, with branch offices in Shanghai & Wuhan. It provides technical services globally. StorSwift has extensive experience in developing, deploying, and maintaining large-scale storage and computing systems. We actively bring enterprise storage, DevOps, and security technology to the Web3 industry. Since 2018, StorSwift has contributed to the IPFS & Filecoin ecosystem with hardware & software solutions, security enhancements, patches, and dedicated software modules.

Where will accounting for fees be maintained?

No staking & burning mechanism is included in our plan. The only fees relate to KYC and the allocator incentive program. Those fees are fixed prices, so traditional accounting is sufficient. In the future, if smart contracts are implemented to handle the allocation process, everything will be maintained on chain.

If you've received DataCap allocation privileges before, please link to prior notary applications.

Notary Application: StorSwift · Issue #663 · filecoin-project/notary-governance (github.com)

How are you connected to the Filecoin ecosystem? Describe your (or your organization's) Filecoin relationships, investments, or ownership.

Firstly, StorSwift, as a Web3 technical team, has provided multiple patches to Filecoin projects, including:
• https://github.com/filecoin-project/lotus/pull/6658
• https://github.com/filecoin-project/lotus/pull/8751
• https://github.com/filecoin-project/lotus/pull/8854
• https://github.com/filecoin-project/lotus/pull/8787
• https://github.com/filecoin-project/lotus/pull/8696
• https://github.com/filecoin-project/lotus/pull/8545
• https://github.com/filecoin-project/lotus/pull/7310
• https://github.com/filecoin-project/lotus/pull/7027

Besides, StorSwift has also developed related Filecoin modules and successfully received dev-grants. Please note the following examples and links.

  1. Multi-Sector Memory Pool Support for the Sealing Precommit Phase1. This module seeks to largely reduce memory consumption for the PreCommit phase. Hence, more tasks can run on the same machine, which effectively cuts costs. Code: https://github.com/storswiftlabs/rust-fil-proofs-mpool Documents: https://github.com/storswiftlabs/rust-fil-proofs-mpool/blob/master/storage-proofs-porep/src/stacked/vanilla/create_label/mem_pool_arch.md
  2. Lotus offline signature solution. This module makes Filecoin accounts more secure, because there is no need to keep the owner keys and other keys on the servers. Code: https://github.com/storswift/lotus-offline-sign
  3. Decentralized voting tool: https://github.com/black-domain/power-voting

How are you estimating your client demand and pathway usage? Do you have existing clients and an onboarding funnel?

As far as we know, research institutions have huge demand for storing training data for large language models, such as LAION-5B, which has 800T. AI and machine learning are trending in both Web2 and Web3, and we have years of experience and a large customer base in traditional storage. We expect booming growth in 2024.