filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Filecoin Deal Auctions #61

Closed andrewxhill closed 1 year ago

andrewxhill commented 3 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Our project began with our initial ambition to build [Powergate](https://github.com/textileio/powergate) and integrate it into our hosted platform, [the Hub](https://blog.textile.io/the-textile-hub-joins-filecoin-mainnet/), and to help other users integrate it so they could use Filecoin. We found that it was difficult to make a high number of deals on the network at competitive rates with minimal error rates. To solve those challenges and enable higher throughput deal-making on the network, we [announced our work on a deal auction layer for Filecoin](https://blog.textile.io/introducing-storage-auctions-filecoin/).

Auctions are an exciting primitive for the network because they invert much of the deal-making flow. First, clients submit deal proposals as auctions. Next, storage providers bid in real-time for the right to store that deal. Finally, winners are selected according to a simple and open algorithm, and the deals are made. This simple three-step flow dramatically reduces the complexity of deal-making, reduces errors, and increases throughput on the network. Additionally, it's done with minimal impact on storage providers, making it easy for them to tune their bidding to match their infrastructure capabilities.

Since launch, our deal auction system has stored 120TiB on the network across 32 storage providers. We've [opened the system metrics](https://textileio.retool.com/embedded/public/fbf59411-760a-4a1a-b5b8-43f42061685d) for all to review in real-time. 

We've made our deals using datacap from a series of smaller applications you can find in [GitHub history](https://github.com/filecoin-project/filecoin-plus-client-onboarding/issues?q=is%3Aissue+is%3Aclosed+author%3Aandrewxhill).

Our vision is to continue pushing our ability to add throughput and stability to the network through deal auctioning. While they are currently in prototype, we believe we can turn them into a decentralized building block on the Filecoin network.   

What is the primary source of funding for this project?

Textile has received both VC investment and foundation grants in the past to help build this and other projects. 

What other projects/ecosystem stakeholders is this project associated with?

Auctions can be used in whole or in part by other projects. Right now, three clients are at varying stages of onboarding, including collaborations with web3.storage, Opscientia, and the eth.storage bridge. We've recently released an [auction client](https://github.com/textileio/go-auctions-client) that allows clients to sign their own deals and use their own datacap. Most if not all of the deals made by the listed clients will migrate to using their own datacap over time. In the interim, and whenever we onboard new auction users, we aim to have datacap available for rapid onboarding while they learn how to use the system.

Use-case details

Describe the data being stored onto Filecoin

The data stored through auctions varies; it consists primarily of NFT assets, public datasets, and research datasets.

Where was the data in this dataset sourced from?

We don't source the data directly; the auction clients source it through their own collaborations and APIs. So far, the largest user of auctions has been the web3.storage team. Others are in the early phases of onboarding to the system.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

You can use our public metrics dashboard to explore all data being stored. 

* https://textileio.retool.com/embedded/public/fbf59411-760a-4a1a-b5b8-43f42061685d
* Use the provider search at the bottom to view all storage records with any provider. e.g. https://textileio.retool.com/embedded/public/46e74cd2-c47c-42ac-b542-189925795c41#provider=f020378

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

We will only onboard clients to the auctioneer (using our datacap) that have public datasets. In cases where private data are being stored, we will require them to use their own wallet and therefore apply for their own datacap.

What is the expected retrieval frequency for this data?

Varying. Some projects such as Opscientia are actively working to understand the data architecture requirements on Filecoin and begin experimentation with retrieval. 

For how long do you plan to keep this dataset stored on Filecoin?

Varying. Again, this is something that each client of the auctions system will address in their own plans. We will work with them to enable future storage renewals and deal monitoring.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We don't favor one geography over another. 

https://textileio.retool.com/embedded/public/fbf59411-760a-4a1a-b5b8-43f42061685d

How will you be distributing your data to storage providers? Is there an offline data transfer process?

There is an offline data transfer, but only because it allows data transfer to be "pull" based from the storage provider's perspective. This means that when they win an auction, they can fetch data on demand. As the online deal flow becomes more flexible, we'll migrate to that API.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

Provider selection:

Providers are selected by a simple algorithm per replica. For example, if you store a file with replication = 3:

1. Select all cheapest bids that match deal requirements (e.g. fast-retrieval).
2. For the first replica, choose the provider from the pool (step 1) with the highest reputation. Reputation is a time-decay function over a week of failures, so the provider with the fewest recent deal failures will win.
3. For the second replica, choose the provider that has won an auction the least in a rolling window of one week. This ensures newly active or unlucky providers make it into the winning pool.
4. For the third replica, choose a provider at random.
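
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the auctioneer's actual code: the `Bid` fields, the `select_winners` function, and the numeric scores are hypothetical names chosen for the example. It also assumes that "cheapest bids" means bids tied at the lowest price, that a provider wins at most one replica per auction, and that any replica beyond the third is picked at random.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    provider: str
    price: int               # asking price, arbitrary units (hypothetical field)
    fast_retrieval: bool     # whether the bid meets the fast-retrieval requirement
    recent_failures: float   # time-decayed failure score over the past week (lower is better)
    recent_wins: int         # auctions won in the rolling one-week window


def select_winners(bids, replication=3, require_fast_retrieval=True, rng=None):
    """Pick one winning bid per replica, following the four steps in the text."""
    rng = rng or random.Random()

    # Step 1: keep bids that match the deal requirements, then the cheapest of those.
    eligible = [b for b in bids if b.fast_retrieval or not require_fast_retrieval]
    if not eligible:
        return []
    lowest = min(b.price for b in eligible)
    pool = [b for b in eligible if b.price == lowest]

    # Steps 2-4: one rule per replica, applied in order.
    rules = [
        lambda p: min(p, key=lambda b: b.recent_failures),  # highest reputation
        lambda p: min(p, key=lambda b: b.recent_wins),      # fewest recent wins
        lambda p: rng.choice(p),                            # random pick
    ]

    winners = []
    for replica in range(replication):
        if not pool:
            break
        rule = rules[min(replica, len(rules) - 1)]
        winner = rule(pool)
        winners.append(winner)
        # Assumption: a provider stores at most one replica of the same data.
        pool = [b for b in pool if b.provider != winner.provider]
    return winners
```

Calling `select_winners(bids, replication=3)` returns up to three `Bid` objects, in replica order. If fewer providers are in the cheapest pool than replicas requested, this sketch simply returns fewer winners; how the real system handles that case is not described here.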

Any provider can join the bidding pool by running [bidbot](https://github.com/textileio/bidbot).

Data security and retrieval:
* All data is stored in replica (we recommend 5 to clients).
* By default, all deals include fast-retrieval.
* We will explore adding retrieval reputation to future selection algorithms.

How will you be distributing deals across storage providers?

See above. Bidbot plus winner algorithm. 
I believe we are among the most decentralized (in terms of provider choice) clients on the network today. See the first pie-chart here https://textileio.retool.com/embedded/public/fbf59411-760a-4a1a-b5b8-43f42061685d for a sense of our distribution across connected providers.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes.

large-datacap-requests[bot] commented 3 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

MegTei commented 3 years ago

Happy to support

cryptowhizzard commented 3 years ago

Same here. Happy to support

dannyob commented 3 years ago

Happy to support, if it's needed.

Reiers commented 3 years ago

Awesome project - Count me in ! +1

galen-mcandrew commented 3 years ago

Multisig Notary requested

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

45TiB

large-datacap-requests[bot] commented 3 years ago

Multisig created and sent to RKH f01381730

cryptowhizzard commented 3 years ago

I tried to approve, but it is not visible in my list to approve. Please check @galen-mcandrew

MegTei commented 3 years ago

Same issue here @cryptowhizzard

XnMatrixSV commented 3 years ago

I'll support, if it's needed.

Reiers commented 3 years ago

cc: @galen-mcandrew

dkkapur commented 3 years ago

The multisig still needs to be granted Notary status - will follow up with them.

andrewxhill commented 3 years ago

Any update here @dkkapur? We're about to hit the bottom again and it would be amazing to use this instead of an interim small app again.

large-datacap-requests[bot] commented 3 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

large-datacap-requests[bot] commented 3 years ago

DataCap Allocation requested

Multisig Notary address

f01381730

Client address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

DataCap allocation requested

22.5TiB

dannyob commented 3 years ago

Hey @dkkapur , @galen-mcandrew, we can't sign this, because the bot's message has single backticks around the Client address, which is breaking the file.app feature. If someone can edit the bot comment to get rid of that (or fix the root cause that created it), we can go ahead and approve.

dannyob commented 3 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacediwo5rmsigow34zi4mdzapdzgcdwvxrwdepwinpvxbfs3swt4pg4

Address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

Datacap Allocated

22.5TiB

Signer Address

f1k6wwevxvp466ybil7y2scqlhtnrz5atjkkyvm4a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacediwo5rmsigow34zi4mdzapdzgcdwvxrwdepwinpvxbfs3swt4pg4

dannyob commented 3 years ago

Yeah, that's the problem -- I went in with a debugger on my client and removed the backticks, and it worked fine.

s0nik42 commented 3 years ago

so what should we do ?

large-datacap-requests[bot] commented 3 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

galen-mcandrew commented 3 years ago

I edited the parent comment and the bot comment, seeing it correctly in the app now. Should be good to sign @s0nik42

Thanks @dannyob and @MegTei for troubleshooting!

andrewxhill commented 3 years ago

Thank you all for the quick action!

s0nik42 commented 3 years ago

Got an error after signing with the wallet, but it disappeared too fast to read. Something like "blockstore not found". Then I tried a second time and got: image

cryptowhizzard commented 3 years ago

Same here as @s0nik42 @galen-mcandrew

cryptowhizzard commented 3 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedmgmmdxgux7gw276v2jkvwx3jd6wwq6t3yebkapdurka764y6fu6

Address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

Datacap Allocated

22.5TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedmgmmdxgux7gw276v2jkvwx3jd6wwq6t3yebkapdurka764y6fu6

galen-mcandrew commented 3 years ago

Looks like we hit 2 signatures; the errors may have been caused by people trying to sign at the same time...

andrewxhill commented 3 years ago

Hey all, we are about 1.5 days away from maxing this out. I'd like to request the next allotment (it looks like 200% is our limit on the 2nd, so 90TiB ideally).

As always, you can find the full history of our deal making here:

https://textileio.retool.com/embedded/public/3d22aabf-1e60-455f-8fd8-49994638e893

https://textileio.retool.com/embedded/public/fbf59411-760a-4a1a-b5b8-43f42061685d

We are also aiming to onboard even more storage providers via https://blog.textile.io/win-fil-250-with-filecoin-auctions/ to spread the storage even more horizontally.
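
For readers following the numbers, here is a small sketch of the arithmetic behind these tranche sizes. The 45TiB weekly rate and the 200% figure come from this thread; the 50% figure for the first tranche is only inferred from 22.5TiB being half of 45TiB, and the function names are made up for illustration.

```python
WEEKLY_USAGE_TIB = 45.0  # "Expected weekly DataCap usage rate" from the multisig request


def tranche_tib(weekly_tib, percent_of_weekly):
    """Size of an allocation expressed as a percentage of the weekly usage rate."""
    return weekly_tib * percent_of_weekly / 100


def days_of_runway(remaining_tib, weekly_tib):
    """Days until the remaining DataCap is used up at a steady weekly rate."""
    return remaining_tib * 7 / weekly_tib


first_tranche = tranche_tib(WEEKLY_USAGE_TIB, 50)    # inferred: 50% of weekly
second_tranche = tranche_tib(WEEKLY_USAGE_TIB, 200)  # stated above: 200% of weekly

print(f"first tranche:  {first_tranche} TiB")
print(f"second tranche: {second_tranche} TiB")
print(f"runway of second tranche: {days_of_runway(second_tranche, WEEKLY_USAGE_TIB)} days")
```

At the stated rate, a 90TiB tranche would last roughly two weeks, which is why the follow-up requests in this thread arrive on that kind of cadence.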

cryptowhizzard commented 3 years ago

I don't know if you need support for this, but if you do, i am happy to support you!

andrewxhill commented 3 years ago

thanks @cryptowhizzard. bumping for others. we've got ~24hrs left now.

Reiers commented 3 years ago

Yeah, I'm in +1. Ready to sign

andrewxhill commented 3 years ago

@galen-mcandrew in the documentation here (https://github.com/filecoin-project/filecoin-plus-large-datasets#granting-datacap-to-the-client) it says:

Specifically, once the client asks for additional DataCap to be granted - in the form of a comment on their application issue, the lead Notary is responsible for pasting in a message of the following format:

## DataCap Allocation requested
#### Multisig Notary address
> <addr1>
#### Client address
> <addr2>
#### DataCap allocation requested
> XTiB

i can't tell if that is technically true. if it is, i can't tell who is the lead notary to know who i need to bug :D
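
As an illustration of why the exact format matters, below is a minimal sketch that renders a message in the template quoted above and parses it back. It is not the notary app's real parser (that code is not shown in this thread); `render_request` and `parse_request` are hypothetical helpers. The parser strips stray backticks from values, which is the kind of tolerance that would have avoided the signing problem reported earlier in this issue.

```python
TEMPLATE = """## DataCap Allocation requested
#### Multisig Notary address
> {notary}
#### Client address
> {client}
#### DataCap allocation requested
> {amount}
"""


def render_request(notary, client, amount):
    """Fill in the allocation-request template quoted from the repository docs."""
    return TEMPLATE.format(notary=notary, client=client, amount=amount)


def parse_request(comment):
    """Map each '####' heading to the '>' value that follows it.

    Values are stripped of surrounding whitespace and backticks, so an
    address pasted as inline code still parses to the bare address.
    """
    fields = {}
    heading = None
    for raw in comment.splitlines():
        line = raw.strip()
        if line.startswith("####"):
            heading = line.lstrip("#").strip()
        elif line.startswith(">") and heading is not None:
            fields[heading] = line.lstrip(">").strip().strip("`").strip()
            heading = None
    return fields
```

For example, `parse_request(render_request("f01381730", "`f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi`", "90TiB"))` yields a dictionary whose `"Client address"` entry is the bare address without backticks.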

andrewxhill commented 3 years ago

Live tracker: about 3 hours left until our project runs out of the first allotment.

galen-mcandrew commented 3 years ago

DataCap Allocation requested

Multisig Notary address

f01381730

Client address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

DataCap allocation requested

90TiB

Reiers commented 3 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebov6wr3rot5n5puwrtbgss7pfobu2dydpjqykghkdvkkf2c3yaqc

Address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

Datacap Allocated

90TiB

Signer Address

f1oz43ckvmtxmmsfzqm6bpnemqlavz4ifyl524chq

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebov6wr3rot5n5puwrtbgss7pfobu2dydpjqykghkdvkkf2c3yaqc

galen-mcandrew commented 3 years ago

Despite the above comment reading as 90TiB, that message actually cleared an old pending transaction (TxID 2) which was for 25 TiB. This was the result of three people attempting to sign the previous allocation, which cleared after two and then started a new transaction. At this time, f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi has 25TiB.

Going to leave the label and posted comments, hoping to get two other notaries to sign this new 90TiB allocation request.
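
To make the sequence described above easier to follow, here is a toy model of a 2-signature multisig. It is a simplified sketch, not the Filecoin multisig actor: `ToyMultisig` and its methods are invented for this example, and it assumes that a signer approves the oldest pending transaction if one exists and otherwise proposes a new one. Numbering transactions from 1 is an arbitrary choice.

```python
from dataclasses import dataclass, field


@dataclass
class PendingTx:
    tx_id: int
    amount_tib: float
    approvers: list = field(default_factory=list)


class ToyMultisig:
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.pending = []    # proposed transactions still short of the threshold
        self.executed = []   # transactions that reached the threshold
        self._next_id = 1

    def sign(self, signer, amount_tib):
        """Approve the oldest pending transaction, or propose a new one."""
        if self.pending:
            # The amount the signer had in mind is ignored here; the
            # pending transaction keeps the amount it was proposed with.
            tx = self.pending[0]
        else:
            tx = PendingTx(self._next_id, amount_tib)
            self._next_id += 1
            self.pending.append(tx)
        tx.approvers.append(signer)
        if len(tx.approvers) >= self.threshold:
            self.pending.remove(tx)
            self.executed.append(tx)
        return tx


ms = ToyMultisig(threshold=2)
for notary in ["notary-1", "notary-2", "notary-3"]:
    ms.sign(notary, 22.5)            # third signature opens a stray transaction
stale = ms.sign("notary-4", 90.0)    # intended for 90TiB, clears the stray one
print(stale.tx_id, stale.amount_tib)
```

In this model the fourth signature executes the leftover transaction at its original amount rather than the new 90TiB request, so the new request still needs two fresh signatures.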

cryptowhizzard commented 3 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedavhl4hkjlrn6lf5keugkwxd2uhnuhnsxqb3eytrmhlgdzjyfogg

Address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

Datacap Allocated

90TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedavhl4hkjlrn6lf5keugkwxd2uhnuhnsxqb3eytrmhlgdzjyfogg

MegTei commented 3 years ago

I know I have been an approver previously. I jumped on to attempt to verify the request, but it seemed to fail on my side; it may have gone through on the back end. Let me know.

MegTei commented 3 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceajt7vfte3v5iyikspqvkf7yfsan5wqvssp5d2lnukvjzqfbwveom

Address

f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi

Datacap Allocated

90TiB

Signer Address

f1ystxl2ootvpirpa7ebgwl7vlhwkbx2r4zjxwe5i

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceajt7vfte3v5iyikspqvkf7yfsan5wqvssp5d2lnukvjzqfbwveom

andrewxhill commented 2 years ago

Hey all! We're still trucking along. We are ready for the next allocation, 90TiB. We have about 2-3 days left on our existing datacap. I'd love your help, notaries!!!

cryptowhizzard commented 2 years ago

@galen-mcandrew can you provide the new tranche for us to sign?

andrewxhill commented 2 years ago

thanks @cryptowhizzard! i think we've got 12-18hrs left on datacap for the project

andrewxhill commented 2 years ago

oop, actually more like 8-12hrs

cryptowhizzard commented 2 years ago

@dkkapur

andrewxhill commented 2 years ago

Pausing the project...

s0nik42 commented 2 years ago

I will sign it in 5min

s0nik42 commented 2 years ago

Actually, I would love to sign it but I don't see it on plus.fil.org

andrewxhill commented 2 years ago

maybe i needed the specific format?

s0nik42 commented 2 years ago

No clue, some action from @galen-mcandrew to create the "DataCap Allocation requested" maybe