NateWebb03 / FilTestRepo

A test repository for allocator application automation

Test 3 #3

Open NateWebb03 opened 9 months ago

NateWebb03 commented 9 months ago

Issue 3

NateWebb03 commented 8 months ago

Pathway Name:

Tech Greedy Datacap Allocator

Organization:

Tech Greedy

Allocator's On-chain address:

f1lubkhti5swayk6v25e47f3dtgwbudewrgncfr5y

Country of Operation:

USA

Region(s) of operation:

Asia minus GCR, Greater China, Europe, Oceania, North America, South America, Japan, Africa

Type of allocator: What is your overall diligence process? Automated (programmatic), Market-based, or Manual (human-in-the-loop at some phase). Initial allocations to these pathways will be capped.

Automatic

Amount of DataCap Requested for allocator for 12 months:

200 PiB

Is your allocator providing a unique, new, or diverse pathway to DataCap? How does this allocator differentiate itself from other applicants, new or existing?

Yes, the Tech Greedy Datacap Allocator offers a unique and innovative pathway in the DataCap allocation landscape, differentiated by our specialized focus on automated compliance and expertise in data preparation for Filecoin. Our primary differentiator is our development of automated tooling and workflows using GitHub Actions in Tech Greedy's public GitHub repository. This approach allows us to streamline the allocation process while maintaining transparency and separation from the current FilPlus flow.

Key aspects of our pathway include:

Specialized Data Preparation Expertise:

We bring a deep understanding of effective data preparation practices for Filecoin, supported by our maintenance of Singularity, a leading data preparation tool. Verification of clients' data preparation tools is mandatory, to ensure IPLD compliance and support for other innovative methodologies that suit specific client needs.

Automated Tooling and Transparent Workflow:

We are developing automated tools and processes within our public GitHub repository to facilitate a transparent and efficient DataCap allocation process. This separation from the current FilPlus flow allows for greater transparency and independent tracking of allocations and compliance.

Rigorous Metadata Verification:

A comprehensive metadata check process ensures that the data prepared for storage aligns accurately with the intended storage deals, maintaining data integrity on the network.

Open Client Eligibility:

Our pathway is open to a diverse range of clients, provided they adhere to our rigorous due diligence process, emphasizing our commitment to supporting the growth and diversity of the Filecoin ecosystem.

These aspects collectively render the Tech Greedy Datacap Allocator pathway distinct and essential for maintaining high standards of data quality and compliance in the Filecoin network. Our focus on automated processes and public accountability sets us apart from other pathways, furthering our commitment to enriching the Filecoin ecosystem with high-quality, useful data.

As a member in the Filecoin Community, I acknowledge that I must adhere to the Community Code of Conduct, as well as other End User License Agreements for accessing various tools and services, such as GitHub and Slack. Additionally, I will adhere to all local & regional laws & regulations that may relate to my role as a business partner, organization, notary, or other operating entity. You can read the Filecoin Code of Conduct here: https://github.com/filecoin-project/community/blob/master/CODE_OF_CONDUCT.md

Acknowledgment: Acknowledge

Client Diligence Section:

This section pertains to client diligence processes.

Who are your target clients?

Small-scale developers or data owners, Other (specified above)

Describe in as much detail as possible how you will perform due diligence on clients.

Our due diligence process for determining client eligibility in the Tech Greedy Datacap Allocator pathway is comprehensive and multi-faceted, combining both human interaction and customized automated verification to ensure compliance and data integrity.

Client Interviews:

We conduct thorough interviews with potential clients to understand their data preparation processes. These interviews aim to ascertain that clients have a clear understanding of how to correctly prepare data for onboarding to the Filecoin network. We evaluate their awareness of best practices in data preparation and their commitment to adhering to these standards.

Metadata Verification:

A critical part of our due diligence involves verifying the metadata generated by the data preparation tool. We ensure that this metadata accurately reflects the data being prepared for storage deals. This step is crucial to confirm that the data prepared aligns with what is being stored, thereby maintaining the integrity of the data on the network.

Custom Automated Bot Reports:

Instead of relying on bots provided by the Filecoin Foundation, we are building our own automated bots tailored to our specific requirements and standards. These custom bots will generate reports on deal distribution, retrievability, and other relevant metrics. These reports provide an objective measure of compliance and are used to verify that the clients are executing their storage deals in line with the commitments made during the application process. The custom bot reports also help us monitor ongoing compliance and the effectiveness of the data preparation process, ensuring continuous adherence to Fil+ standards.

Please specify how many questions you'll ask, and provide a brief overview of the questions.

During our due diligence process, we plan to ask clients a series of targeted questions focusing on their data preparation practices and the compliance of their chosen tools. The questions will be structured to gain a comprehensive understanding of their data preparation methodologies and the compatibility of these methods with Filecoin’s retrieval protocols. Here is an overview of the questions:

Data Preparation Tool Identification:

What data preparation tool are you using for onboarding data onto Filecoin? Please provide a brief description of why you chose this specific tool.

Compliance and Protocol Support:

Is your chosen data preparation tool IPLD compliant? If yes, please explain how. Does your tool support UnixFS? Please elaborate on its implementation. What retrieval protocols does your tool support (e.g., HTTP, GraphSync, Bitswap)? Describe how each protocol is supported and its benefits for data retrieval.

Metadata Examination:

Can you provide detailed methods for us to examine the metadata generated by your data preparation tool? How does the metadata reflect the actual data prepared for storage deals? Please provide examples or case studies if available.

Data Retrieval Efficiency:

How does your chosen tool enhance the efficiency of data retrieval on the Filecoin network? Are there any specific features or capabilities in your tool that optimize retrieval speed or accessibility?

Tool Maintenance and Updates:

How is your tool maintained and updated? Please describe the process for ensuring it stays compliant with evolving standards and protocols in the Filecoin ecosystem.

Backup and Contingency Plans:

In case of any issues with your primary data preparation tool, do you have a backup tool or contingency plan? Please describe your strategy for such scenarios.

These questions are designed to ensure that our clients are not only using tools that are compliant with Filecoin’s standards but also contribute positively to the network’s efficiency and reliability. By requiring detailed responses to these questions, we aim to maintain a high standard of data quality and retrievability in our pathway.

Will you use a 3rd-party "Know your client" (KYC) service?

No, the Tech Greedy Datacap Allocator does not utilize a traditional 3rd-party KYC service. However, we have implemented a robust due diligence process that leverages technological tools and community-based insights to assess the trustworthiness and location of our clients.

IP Location Verification:

We use ipinfo.io, a reliable IP location service, to gain insights into the geographic location of Storage Providers (SPs). This service helps us determine whether the IPs are cloud-based or if VPNs are being used, which is crucial for verifying the actual operational regions of the SPs. Such geographic verification is vital for ensuring compliance with regional data storage requirements and for understanding the distribution of our clients' storage solutions.

Community Engagement and History Check:

To gauge the trustworthiness and reliability of our clients, we conduct thorough checks of their engagement and history within the Filecoin community. This involves reviewing their interactions and contributions on platforms like Filecoin Slack and GitHub. By examining their community involvement, we can assess their commitment to the Filecoin ecosystem and their adherence to its standards and ethics.

Our approach, while not relying on traditional KYC services, is tailored to the unique nature of the Filecoin network. It allows us to effectively evaluate our clients' trustworthiness and operational integrity, ensuring that our Datacap allocations are made to reputable and compliant entities within the Filecoin community.
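The IP location check described above could be reduced to a small summary step once a lookup result is in hand. The sketch below makes no network call: it takes a dictionary shaped like an ipinfo.io JSON response and extracts the fields relevant to an SP review. The field names (`country`, `region`, `org`, and a nested `privacy` object with `vpn` and `hosting` flags) and the sample record are assumptions for illustration; which fields are actually returned depends on the ipinfo.io plan in use. The function names are hypothetical.

```python
from typing import Any


def summarize_ip_record(record: dict[str, Any]) -> dict[str, Any]:
    """Reduce an ipinfo.io-style record (assumed shape) to the fields we review."""
    privacy = record.get("privacy") or {}  # may be absent, depending on the plan
    return {
        "ip": record.get("ip"),
        "country": record.get("country"),
        "region": record.get("region"),
        "org": record.get("org"),
        "is_vpn": bool(privacy.get("vpn", False)),
        "is_cloud_hosted": bool(privacy.get("hosting", False)),
    }


def needs_location_disclosure(record: dict[str, Any]) -> bool:
    """A VPN-flagged IP means the SP must disclose its actual location."""
    return summarize_ip_record(record)["is_vpn"]


# Made-up record using a documentation-only IP address.
sample = {
    "ip": "203.0.113.7",
    "country": "US",
    "region": "Washington",
    "org": "AS64500 Example Hosting",
    "privacy": {"vpn": True, "hosting": True},
}
print(needs_location_disclosure(sample))
```

A record without a `privacy` object is treated as "not flagged", so a missing flag should be read as "unknown" rather than as proof that no VPN is in use.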

Can any client apply to your pathway, or will you be closed to only your own internal clients? (eg: bizdev or self-referral)

Our pathway, Tech Greedy Datacap Allocator, is open to all clients who wish to participate in the Filecoin network, regardless of their background or affiliation. We welcome a diverse range of clients, from individual developers to large enterprises, as long as they are willing to adhere to our rigorous due diligence process.

How do you plan to track the rate at which DataCap is being distributed to your clients?

We will build a tool using GitHub Actions to track the rate of DataCap consumption.
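As a rough sketch of what such a check might compute, the consumption rate can be derived from two snapshots of a client's remaining DataCap. The `Snapshot` type and the function name are hypothetical, and how the snapshots are collected (for example by a scheduled workflow run) is left open.

```python
from dataclasses import dataclass
from datetime import datetime

TIB = 1024 ** 4  # bytes per TiB


@dataclass
class Snapshot:
    taken_at: datetime
    remaining_bytes: int  # DataCap the client still holds at this time


def consumption_rate_tib_per_day(older: Snapshot, newer: Snapshot) -> float:
    """Average DataCap consumed per day between two snapshots, in TiB."""
    elapsed_days = (newer.taken_at - older.taken_at).total_seconds() / 86400
    if elapsed_days <= 0:
        raise ValueError("snapshots must be in chronological order")
    used_bytes = older.remaining_bytes - newer.remaining_bytes
    return used_bytes / TIB / elapsed_days


start = Snapshot(datetime(2024, 1, 1), 100 * TIB)
later = Snapshot(datetime(2024, 1, 5), 80 * TIB)
print(consumption_rate_tib_per_day(start, later))  # 20 TiB over 4 days
```

A negative result would indicate that the client received a new allocation between the two snapshots, so a real tool would need to account for top-ups separately.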

Data Diligence

This section will cover the types of data that you expect to notarize.

As a reminder: The Filecoin Plus program defines quality data as all content that meets local regulatory requirements AND
• the data owner wants to see on the network, including private/encrypted data
• or is open and retrievable
• or demonstrates proof of concept or utility of the network, such as efforts to improve onboarding

As an operating entity in the Filecoin Community, you are required to follow all local & regional regulations relating to any data, digital and otherwise. This may include PII and data deletion requirements, as well as the storing, transmitting, or accessing of data.

Acknowledgement: Acknowledge

What type(s) of data would be applicable for your pathway?

Public Open Dataset (Research/Non-Profit), Public Open Commercial/Enterprise

How will you verify a client's data ownership? Will you use 3rd-party KYB (know your business) service to verify enterprise clients?

At Tech Greedy Datacap Allocator, our method for verifying a client's data ownership varies depending on the type of data they intend to store on the Filecoin network:

For Open Datasets:

We do not require additional verification for clients who wish to store open datasets. Our rationale is that open datasets, by their nature, are publicly available and do not possess the same proprietary or sensitive characteristics as commercial datasets.

For Commercial Datasets:

Clients intending to store commercial datasets are subject to a more stringent verification process. We require these clients to send us an email from their official business email address. This email must be digitally signed using their domain's private key to authenticate its origin. We strictly do not accept free email addresses (such as Gmail, Hotmail, etc.) for this verification process. The use of an official business email address ensures a level of legitimacy and ownership. This process helps us ascertain that the client possesses the rights to the data they are seeking to store, aligning with our commitment to the integrity and legality of data on the Filecoin network.
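The address screening part of this step could look like the sketch below. It only checks that the sender's domain is not a free mail provider and matches the domain the client claims to represent; verifying the digital signature itself is out of scope here. The list of free domains is an illustrative assumption, not a complete list, and the function name is hypothetical.

```python
# Illustrative, incomplete list of free email providers (assumption).
FREE_EMAIL_DOMAINS = frozenset({"gmail.com", "hotmail.com", "outlook.com", "yahoo.com"})


def is_business_email(address: str, claimed_domain: str) -> bool:
    """True if the address belongs to the claimed, non-free business domain."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable address
    if domain in FREE_EMAIL_DOMAINS:
        return False  # free email addresses are not accepted
    return domain == claimed_domain.strip().lower()


print(is_business_email("ops@example.org", "example.org"))
print(is_business_email("someone@gmail.com", "gmail.com"))
```

This screening is only a first filter; the signature check is what ties the message to the domain.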

How will you ensure the data meets local & regional legal requirements?

To ensure data meets local and regional legal requirements, we will require clients to submit a compliance declaration, affirming that their data adheres to relevant laws and regulations in their jurisdiction.

What types of data preparation will you support or require?

We will not provide a data preparation service, and we require the client's data preparation tool to be open source.

What tools or methodology will you use to sample and verify the data aligns with your pathway?

As part of metadata examination, we will verify that clients are preparing the data correctly, including not using excess sector padding.
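One way to quantify "excess padding" is to compare the payload size with the usable capacity of the piece it was packed into. The sketch below assumes that padded piece sizes are powers of two and that Fr32 padding leaves 127 usable bytes out of every 128; reading "excess sector padding" as this ratio is our interpretation, and any acceptance threshold would be a policy choice that is not defined here. Function names are hypothetical.

```python
def min_padded_piece_size(payload_bytes: int) -> int:
    """Smallest power-of-two padded piece size whose usable space fits the payload."""
    if payload_bytes <= 0:
        raise ValueError("payload must be positive")
    size = 128  # starting point for the search
    while size * 127 // 128 < payload_bytes:
        size *= 2
    return size


def padding_fraction(payload_bytes: int, padded_piece_size: int) -> float:
    """Share of the piece's usable space that holds padding instead of payload."""
    usable = padded_piece_size * 127 // 128
    if payload_bytes > usable:
        raise ValueError("payload does not fit in the given piece size")
    return 1 - payload_bytes / usable


GIB = 1024 ** 3
payload = 10 * GIB
print(min_padded_piece_size(payload) // GIB)   # piece size in GiB that fits 10 GiB
print(padding_fraction(payload, 32 * GIB))     # share wasted if packed into 32 GiB
```

Comparing the declared piece size with `min_padded_piece_size` shows whether a smaller piece would have sufficed.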

Data Distribution

This section covers deal-making and data distribution.

As a reminder, the Filecoin Plus program currently defines distributed onboarding as multiple physical locations AND multiple storage provider entities to serve client requirements.

Recommended Minimum: 3 locations, 4 to 5 storage providers, 5 copies

How many replicas will you require to meet programmatic requirements for distribution?

3+

What geographic or regional distribution will you require?

We will require three different physical locations

How many Storage Provider owner/operators will you require to meet programmatic requirements for distribution?

3+

Do you require equal percentage distribution for your clients to their chosen SPs? Will you require preliminary SP distribution plans from the client before allocating any DataCap?

Yes, we require a preliminary SP distribution plan before allocating any DataCap. We will track this using a template.

What tooling will you use to verify client deal-making distribution?

We will build our own GitHub-based tooling to check deal-making distribution. The bot may be a fork of the CID checker bot.
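A distribution check of this kind might be sketched as follows, using the thresholds stated in this section (3+ replicas, 3+ SPs, three physical locations). The `Deal` record and function name are hypothetical. The sketch assumes that a replica is counted as one distinct provider holding a given piece, and that each provider ID stands for one owner/operator, which a real tool would need to verify separately.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deal:
    piece_cid: str  # identifier of the stored piece
    provider: str   # storage provider ID, treated here as one operator
    location: str   # disclosed physical location of the provider


def check_distribution(deals, min_replicas=3, min_providers=3, min_locations=3):
    """Return a list of problems; an empty list means all checks pass."""
    providers_by_piece = defaultdict(set)
    for deal in deals:
        providers_by_piece[deal.piece_cid].add(deal.provider)

    problems = []
    for piece_cid, providers in sorted(providers_by_piece.items()):
        if len(providers) < min_replicas:
            problems.append(f"{piece_cid}: {len(providers)} replicas, need {min_replicas}")

    provider_count = len({deal.provider for deal in deals})
    if provider_count < min_providers:
        problems.append(f"{provider_count} providers, need {min_providers}")

    location_count = len({deal.location for deal in deals})
    if location_count < min_locations:
        problems.append(f"{location_count} locations, need {min_locations}")
    return problems


deals = [Deal("p1", "f01", "US"), Deal("p1", "f02", "DE"), Deal("p1", "f03", "JP")]
print(check_distribution(deals))
```

Returning the list of problems, rather than a single pass/fail flag, lets a bot post the specific shortfalls on the client's application.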

How will clients meet SP distribution requirements?

We will let clients implement their own deal-making solution and use tooling to monitor the deal distribution.

As an allocator, do you support clients that engage in deal-making with SPs utilizing a VPN?

We do not allow the use of a VPN unless the client discloses the actual SP location.

DataCap Allocation Strategy

In this section, you will explain your client DataCap allocation strategy.

Keep in mind the program principle over Limited Trust Over Time. Parties, such as clients, start with a limited amount of trust and power. Additional trust and power need to be earned over time through good-faith execution of their responsibilities and transparency of their actions.

Will you use standardized DataCap allocations to clients?

No, client specific

Allocation Tranche Schedule to clients:

First: Smaller of 100% of the dataset size, or 25% of overall datacap request
Second: Smaller of 100% of the dataset size, or 25% of overall datacap request
Third: Smaller of 100% of the dataset size, or 25% of overall datacap request
Fourth: Smaller of 100% of the dataset size, or 25% of overall datacap request
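The schedule above amounts to taking a minimum for each tranche. A minimal sketch, with a hypothetical function name and sizes expressed in TiB:

```python
def tranche_sizes(dataset_size_tib: float, total_request_tib: float, tranches: int = 4) -> list[float]:
    """Each tranche is the smaller of the dataset size and 25% of the overall request."""
    per_tranche = min(dataset_size_tib, 0.25 * total_request_tib)
    return [per_tranche] * tranches


# A 100 TiB dataset with a 500 TiB overall request: the dataset size is the cap.
print(tranche_sizes(100, 500))
# A 200 TiB dataset with the same request: 25% of the request is the cap.
print(tranche_sizes(200, 500))
```

When the dataset is smaller than a quarter of the request, the four tranches together come to less than the overall request.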

Will you use programmatic or software based allocations?

Yes, standardized and software based

What tooling will you use to construct messages and send allocations to clients?

We will start with a manual tool to construct and publish the message and transition to automated allocation later. We will not be utilizing the existing notary registry tool.

Describe the process for granting additional DataCap to previously verified clients.

We will review the bot report before each subsequent allocation. We will start with manual allocation and manual inspection, and automate the allocation once we have decided on the criteria.

Tooling & Bookkeeping

This program relies on many software tools in order to function. The Filecoin Foundation and PL have invested in many different elements of this end-to-end process, and will continue to make those tools open-sourced. Our goal is to increase adoption, and we will balance customization with efficiency.

This section will cover the various UX/UI tools for your pathway. You should think high-level (GitHub repo architecture) as well as tactical (specific bots and API endpoints).

Describe in as much detail as possible the tools used for:
• client discoverability & applications
• due diligence & investigation
• bookkeeping
• on-chain message construction
• client deal-making behavior
• tracking overall allocator health
• dispute discussion & resolution
• community updates & comms

Will you use open-source tooling from the Fil+ team?

Yes, we will fork some open source tooling from FIL+ team, e.g. CID checker and retrieval bot

Where will you keep your records for bookkeeping? How will you maintain transparency in your allocation decisions?

For maintaining transparency and thorough bookkeeping of our allocation decisions, we will primarily use GitHub. This platform will serve as the central repository for all public records related to our allocator operations, including but not limited to:

• Client application information (as permissible within privacy constraints)
• Due diligence processes and outcomes
• Allocation decisions and rationales
• Dispute resolution discussions and outcomes

GitHub's public nature ensures that our actions are transparent and accountable to the community. It also facilitates easy access for community members and the Fil+ Governance team to review our activities and decisions.

In addition to GitHub, some private communications may occur via direct messages (DMs) on Slack, especially for sensitive matters or initial client communications. However, any conclusions or significant decisions arising from these private conversations will be summarized and published on GitHub. This practice ensures that while we maintain necessary confidentiality, key decisions and actions remain visible and auditable by the community.

What will your DataCap distribution look like?

We will start by distributing DataCap manually and then transition to automated allocation using the GitHub platform.

Risk Mitigation, Auditing, Compliance

This framework ensures the responsible allocation of DataCap by conducting regular audits, enforcing strict compliance checks, and requiring allocators to maintain transparency and engage with the community. This approach safeguards the ecosystem, deters misuse, and upholds the commitment to a fair and accountable storage marketplace.

In addition to setting their own rules, each notary allocator will be responsible for managing compliance within their own pathway. You will need to audit your own clients, manage interventions (such as removing DataCap from clients and keeping records), and respond to disputes.

Describe your proposed compliance check mechanisms for your own clients.

Our compliance check mechanisms for clients at Tech Greedy Datacap Allocator are designed to ensure responsible DataCap distribution and effective utilization by our clients. The process includes:

Regular Check-Ins and Audits:

We will conduct regular check-ins with our clients to review their DataCap usage and storage deal performance. Audits will be carried out periodically to ensure compliance with the allocated DataCap and its intended use.

Tracking DataCap Distribution Metrics:

We will closely monitor metrics related to DataCap distribution, such as the amount allocated, rate of utilization, and pattern of distribution among different clients. Our monitoring system will be automated to flag any unusual activities or discrepancies in DataCap usage.

Understanding Client Demographics and Time Metrics:

We will maintain records of client demographics to understand the diversity of our client base and their specific needs. Time metrics related to DataCap requests, allocations, and utilization will also be tracked to assess the efficiency and responsiveness of our process.

Trust Evaluations:

New clients will undergo thorough trust evaluations, assessing their history and experience in handling data, especially on the Filecoin network. Clients will need to demonstrate a track record of responsible data management and smaller-scale onboarding before being considered for larger DataCap allocations.

Use of Automated Tools:

Tools like CID Checker and Retrievability Bot will be employed to automatically monitor and verify the integrity and retrievability of stored data. These tools will provide objective metrics to assess client compliance and the effectiveness of their data storage practices.

Policy for New Clients:

New clients, especially those requesting large DataCap allocations, will be subject to stricter scrutiny. We will require evidence of their capability to responsibly onboard data to the Filecoin network. This may include proof of previous smaller-scale data onboarding. Clients without a proven track record will be initially allocated smaller amounts of DataCap to establish trust and demonstrate compliance.

Our approach is designed to maintain a high standard of compliance while fostering responsible growth and diversification within the Filecoin network. We believe in empowering clients through education and support, while also upholding stringent measures to prevent misuse and ensure the network's integrity.

Describe your process for handling disputes. Highlight response times, transparency, and accountability mechanisms.

At Tech Greedy Datacap Allocator, we have established a clear and efficient process for handling disputes, which prioritizes transparency, accountability, and timely resolution:

Dispute Tracking on GitHub:

All disputes, whether internal (involving us and our clients) or external (involving other notaries or the Fil+ Governance Team), are systematically tracked and documented on GitHub. This approach ensures transparency, as the dispute details and our responses are accessible to the broader community.

Ownership of Internal Disputes:

We take full responsibility for resolving internal disputes with our clients. Our process involves an initial assessment of the dispute, direct communication with the client to understand their concerns, and a thorough investigation to gather all relevant information.

Response Time:

We aim to acknowledge and begin addressing any dispute within 48 hours of its filing. A complete resolution or a significant update on the progress is provided within a week, depending on the complexity of the issue.

Broadening the Audience for Complex Disputes:

For disputes that are complex or cannot be resolved internally, we are open to escalating the matter to the Fil+ Governance Team. This step ensures that disputes are reviewed and resolved with oversight from a broader and more diverse group of stakeholders within the Filecoin community.

Transparency and Accountability:

Throughout the dispute resolution process, we maintain a high level of transparency by documenting our findings, decisions, and actions on GitHub. We hold ourselves accountable to both our clients and the broader Filecoin community, ensuring our actions align with the community standards and governance protocols.

Detail how you will announce updates to tooling, pathway guidelines, parameters, and process alterations.

All updates to our tooling, guidelines, and processes will be announced on our GitHub repo and public Slack channels, with significant changes potentially leading to a new allocator application.

How long will you allow the community to provide feedback before implementing changes?

At Tech Greedy Datacap Allocator, we understand the importance of community feedback in shaping our operations and processes. To ensure an open, transparent, and responsive approach:

Feedback Period:

For each proposed change, we will provide a minimum feedback period of one week. This timeframe allows community members adequate opportunity to review and respond to the proposed updates. This period may be extended for more significant or complex changes, ensuring that all stakeholders have sufficient time to contribute their thoughts and concerns.

Community Engagement and Moderation:

Feedback will be gathered primarily through GitHub issues or discussions, platforms that foster structured and accessible dialogue. We will actively engage with the community on these platforms, providing clarifications, responding to queries, and facilitating discussions to ensure a productive exchange of ideas. Our team will moderate these discussions to maintain a constructive and respectful environment.

Weighing and Acting on Feedback:

All feedback will be thoroughly reviewed and considered. We aim to strike a balance between being agile in our decision-making and ensuring that community input significantly informs our actions. Decisions on whether to implement feedback will be made based on a combination of factors, including the feedback's relevance, feasibility, impact on security and compliance, and the broader interests of the Filecoin ecosystem.

Transparent Decision-Making:

Once a decision is made, whether to proceed with the proposed changes or to incorporate community feedback, we will communicate this clearly on the same platforms where feedback was solicited. We will provide rationale for our decisions, ensuring transparency in how community feedback influenced the outcome.

By implementing this structured approach, we aim to foster a community-driven process where feedback is not only solicited but also meaningfully incorporated into our operations. This approach reflects our commitment to a decentralized and collaborative ecosystem, where diverse viewpoints are valued and contribute to the continuous improvement of our allocator pathway.

Regarding security, how will you structure and secure the on-chain notary address? If you will utilize a multisig, how will it be structured? Who will have administrative & signatory rights?

Automation of DataCap Allocation:

The process of allocating DataCap will be automated using GitHub Actions. This automation streamlines our operations and minimizes the risk of human error in the allocation process.

Securing Sensitive Information:

Critical secrets, such as private keys or sensitive access tokens, will be securely stored using GitHub Secrets. This feature of GitHub provides a secure and encrypted method to store sensitive information, ensuring that our automated processes are both efficient and secure.

No Multisig Structure:

Currently, we do not plan to utilize a multisig wallet structure for our on-chain notary address. The automation and security measures we have in place are designed to efficiently handle the allocation process while maintaining a high level of security.

Administrative and Signatory Rights:

The administrative and signatory rights within our system are managed through controlled access to our GitHub repository. Only authorized personnel within our organization will have access to these GitHub Secrets, ensuring that the rights to execute transactions or make changes are tightly regulated and monitored.
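In a GitHub Actions workflow, a repository secret is typically passed to a step as an environment variable. A script run by that step might read it as sketched below. The variable name `ALLOCATOR_PRIVATE_KEY` and the function name are hypothetical, and the sketch only covers loading the value, not signing or sending messages.

```python
import os


def load_allocator_key(env_var: str = "ALLOCATOR_PRIVATE_KEY") -> str:
    """Read the signing key from the environment and fail loudly if it is missing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"{env_var} is not set; check the workflow's secret mapping")
    return value  # never print or log this value


if __name__ == "__main__":
    try:
        key = load_allocator_key()
        print("key loaded, length:", len(key))  # report only the length, not the key
    except RuntimeError as err:
        print("configuration error:", err)
```

Failing early when the variable is absent avoids a workflow run that proceeds with an empty key.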

Will you deploy smart contracts for program or policy procedures? If so, how will you track and fund them?

We do not plan to deploy smart contracts for program or policy procedures at this stage. Our current focus is on streamlining our operations through existing tools and processes without the integration of smart contracts.

Monetization

While the Filecoin Foundation and PL will continue to make investments into developing the program and open-sourcing tools, we are also striving to expand and encourage high levels of service and professionalism through these new Notary Allocator pathways. These pathways require increasingly complex tooling and auditing platforms, and we understand that Notaries (and the teams and organizations responsible) are making investments into building effective systems.

It is reasonable for teams building services in this marketplace to include monetization structures. Our primary guiding principles in this regard are transparency and equity. We require these monetization pathways to be clear, consistent, and auditable.

Outline your monetization models for the services you provide as a notary allocator pathway.

At Tech Greedy Datacap Allocator, we have developed a monetization model that aligns with our commitment to transparency, equity, and sustainability. Our model is designed to cover the costs of the sophisticated tooling and auditing platforms we have developed and will maintain. The structure of our monetization is as follows:

Absorption of Initial Development Costs:

We will absorb the initial development costs of our allocation and auditing tools. This investment reflects our commitment to providing a high-quality service to the Filecoin community.

Flat Fee for Application Processing:

To cover ongoing operational costs, we will charge a flat fee for each application we process. This fee is set at 100 USD per application, payable in Filecoin (FIL). This fee covers the administrative costs associated with application review, due diligence, and initial allocation processing.

Variable Fee Based on DataCap Granted:

In addition to the flat application fee, we will charge a variable fee proportional to the amount of DataCap granted. This fee is estimated to be 2 USD per TiB of DataCap allocated. This fee structure is designed to fund additional monitoring solutions, such as retrieval bots, which incur computational costs proportional to the number of deals managed.

Potential Fee Reductions:

We are continually working on software optimization and cost-efficiency measures. As these optimizations are realized, we may reduce the fees accordingly, passing these savings on to our clients.

Transparency and Auditing:

Our fee structure will be clearly communicated to all potential clients upfront. We will maintain transparent records of all fees charged, ensuring that our monetization practices are consistent, clear, and auditable.

Reinvestment in Service Quality:

The revenue generated from these fees will be reinvested into improving and maintaining the quality of our services, including software upgrades, enhanced security measures, and expanded client support.
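The flat and variable fees described above combine into a simple calculation. A minimal sketch, with a hypothetical function name, using the stated 100 USD application fee and the estimated 2 USD per TiB; conversion of the USD amount to FIL is not covered:

```python
def total_fee_usd(datacap_tib: float, flat_fee_usd: float = 100.0, fee_per_tib_usd: float = 2.0) -> float:
    """Flat application fee plus a variable fee proportional to DataCap granted."""
    if datacap_tib < 0:
        raise ValueError("DataCap granted cannot be negative")
    return flat_fee_usd + fee_per_tib_usd * datacap_tib


# A client granted 500 TiB: 100 USD flat plus 2 USD for each of the 500 TiB.
print(total_fee_usd(500))
```

Keeping the two rates as parameters makes it easy to reflect the fee reductions mentioned above.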

Describe your organization's structure, such as the legal entity and other business & market ventures.

We are a C corp established in the state of Washington, USA.

Where will accounting for fees be maintained?

All fee transactions will be on-chain and published on the relevant GitHub issue.

If you've received DataCap allocation privileges before, please link to prior notary applications.

https://github.com/filecoin-project/notary-governance/issues/664

How are you connected to the Filecoin ecosystem? Describe your (or your organization's) Filecoin relationships, investments, or ownership.

I am a Fil+ notary, storage provider, data onboarder and preparer, PL employee, and tooling developer.

How are you estimating your client demand and pathway usage? Do you have existing clients and an onboarding funnel?

As a maintainer of the top data preparation tool, I have clients who actively use the tool to help onboard their own clients.