filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Solana Lab. - store snapshots in Filecoin #923

Closed pieceofr closed 11 months ago

pieceofr commented 2 years ago

name: Large Dataset Notary application
about: Clients should use this application form to request a DataCap allocation via a LDN for a dataset
title: "[DataCap Application] Solana Lab. - store snapshots in Filecoin"
labels: 'application, Phase: Diligence'
assignees: ''


Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Solana is a public L1 blockchain.
Here is the official history page: https://docs.solana.com/history
The project aims to store the Solana snapshots already produced, as well as the snapshots produced in the future, on the Filecoin network.

What is the primary source of funding for this project?

Solana Labs.

What other projects/ecosystem stakeholders is this project associated with?

Solana ecosystem: https://solana.com/ecosystem

Use-case details

Describe the data being stored onto Filecoin

Snapshots are compressed tar archives that store a copy of the Solana accounts at a given slot.
A Solana warehouse node generates snapshots hourly, and at the end of each epoch it produces a folder containing all snapshots from that epoch.

Where was the data in this dataset sourced from?

A special kind of Solana validator, called a warehouse node, produces the snapshots.
The warehouse node generates snapshots and uploads them to storage once every Solana epoch.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

If you have an AWS account, you can browse the bucket here:
https://s3.console.aws.amazon.com/s3/buckets/filecoin-snapshot-test?region=ap-southeast-1&tab=objects
Download address:
s3://filecoin-snapshot-test

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes. This is a public dataset.

What is the expected retrieval frequency for this data?

Fewer than 5 times per month.

For how long do you plan to keep this dataset stored on Filecoin?

We plan to keep the dataset stored for as long as possible, barring unavoidable circumstances such as storage costs becoming unaffordable.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Diverse regions and countries are preferred.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We prefer an online process.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We plan to have 5 replicas and to find 5 providers to store them.

How will you be distributing deals across storage providers?

We are currently testing possible methods with 2 providers and will look for more providers.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Solana Labs will fund the project to store Solana snapshots on the Filecoin network.
We have received onboarding documents from Filecoin experts.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

flyworker commented 2 years ago

As a notary from North America, we would like to know whether this issue was created by the Solana core team or by the ecosystem. Thanks.

newwebgroup commented 2 years ago

If in the future you support offline transmission, we are willing to save a replica for you in Hong Kong or Guangdong, China.

pieceofr commented 2 years ago

As a notary from North America, we would like to know whether this issue was created by the Solana core team or by the ecosystem. Thanks.

Forgot to quote. This application is from Solana Labs, not the ecosystem.

flyworker commented 2 years ago

If in the future you support offline transmission, we are willing to save a replica for you in Hong Kong or Guangdong, China.

Thanks, FilSwan is going to support it.

llifezou commented 2 years ago

We have had technical exchanges with the Solana team about storing data on Filecoin. We can attest to the authenticity of this application, and we support it! Origin Storage

Chris00618 commented 2 years ago

It's meaningless to store Solana node snapshot data and back it up to Filecoin. For any blockchain system, thousands of backups already exist. For example, can each Filecoin SP use its node snapshots to apply for DataCap? This is obviously inappropriate.

In addition, as SPs who stand to benefit, please do not use your notary powers to lend support here. This is a rule you must follow.

flyworker commented 2 years ago
  1. This request is from Solana Labs, not from an SP. I have cross-checked on the Solana side, and they also have an issue on their side. Snapshots are high-value data, and Solana generates a lot of data per day, which is exactly why they need Filecoin storage. I don't think your comments are valid.
  2. It is part of a notary's work to support useful data onboarding: https://github.com/filecoin-project/notary-governance/tree/main/notaries

    The base responsibilities of the Notaries are as follows:
    
    Allocate DataCap to clients in order to subsidize reliable and useful storage on the network.
    Verify that clients receive a DataCap commensurate with the level of trust that is warranted based on information provided.
    Ensure that in the allocation of the DataCap no party is given excessive trust in any form that might jeopardize the network.
    Follow operational guidelines, keep record of decision flow, and respond to any requests for audits of their allocation decisions.

pieceofr commented 2 years ago

"For any blockchain system, thousands of backups already exist": Solana validators/nodes keep only the most recent snapshots locally unless they choose to upload them somewhere (for example, a centralized cloud or Filecoin). I think few validators have the motivation to do that. Solana Labs does it for our ecosystem and users.

UnionLabs2020 commented 2 years ago

Hi @pieceofr, how many replicas of this dataset currently exist, and how do you store them?

pieceofr commented 2 years ago

Hi @UnionLabs2020, we have about 6-7 full-record buckets in different centralized cloud services.
Those buckets are not replicas as such, though I think each bucket has several replicas behind it, hidden by the cloud service.

flyworker commented 2 years ago

Please let us know if you need any help preparing CAR files or sending out deals. FilSwan has a full set of tools to support blockchain partners.

raghavrmadya commented 2 years ago

Datacap Request Trigger

Total DataCap requested

3.4PiB

Expected weekly DataCap usage rate

50TiB

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

DataCap allocation requested

25TiB

kernelogic commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceaoix32jkouizsnflmhsdidyafayweet2wpw5gefal32e3cephptc

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

25.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceaoix32jkouizsnflmhsdidyafayweet2wpw5gefal32e3cephptc

cryptowhizzard commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebq5fkctqvgjngedwsqiv3md6arru6tuhityh5p65dbnjtr7yjddg

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

25.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebq5fkctqvgjngedwsqiv3md6arru6tuhityh5p65dbnjtr7yjddg

BDE-io commented 2 years ago

@pieceofr Hi! Great to see you have gotten approval for DataCap. If you are looking for more storage providers to store these data or have any questions, please visit #bigdata-exchange on Filecoin Slack or reply here.

We have strong demand from a diverse group of SPs, who are actively looking to onboard more data.

pieceofr commented 2 years ago

Thank you all!

Sunnyiscoming commented 1 year ago

Could you send an email to filplus-app-review@fil.org from your official domain in order to confirm your identity? The email subject should include the issue ID #923.

Sunnyiscoming commented 1 year ago

Any update here?

pieceofr commented 1 year ago

Any update here?

I have sent you the email to confirm my identity.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f01858410

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

DataCap allocation requested

50TiB

Id

02dae57a-10c7-4dcf-b3b1-31feee6d125d

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Last two approvers

cryptowhizzard & kernelogic

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

50TiB

Total DataCap granted for client so far

25TiB

Datacap to be granted to reach the total amount requested by the client (3.4 PiB)

3.37PiB
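
The figures in this stats comment follow from simple arithmetic on the values requested earlier in the thread (3.4PiB total, 50TiB weekly). A minimal Python sketch, assuming binary units (1 PiB = 1024 TiB) and assuming the rule percentage is applied to the weekly usage rate; the function names are illustrative and are not taken from the bot:

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB

def tranche_tib(weekly_tib: float, rule_percent: float) -> float:
    """Size of the next allocation as a percentage of the weekly DataCap rate."""
    return weekly_tib * rule_percent / 100

def remaining_pib(total_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted, in PiB, to reach the total requested."""
    return (total_pib * TIB_PER_PIB - granted_tib) / TIB_PER_PIB

# Values quoted in this thread: 50TiB weekly, 100% rule, 3.4PiB total, 25TiB granted.
next_tranche = tranche_tib(weekly_tib=50, rule_percent=100)
left = remaining_pib(total_pib=3.4, granted_tib=25)
print(f"next tranche: {next_tranche:g} TiB")
print(f"remaining: {left:.4f} PiB")
```

Under these assumptions the remaining amount comes to roughly 3.3756 PiB, which is consistent with the 3.37PiB shown above if the bot truncates to two decimals.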

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 477 | 3 | 25TiB | 33.33 | 768GiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01852325 | Hong Kong, Central and Western, HK<br>BIH-Global Internet Harbor | 4.97 TiB | 33.62% | 4.94 TiB | 0.63% |
| f01938223 | Montréal, Quebec, CA<br>eStruxture Data Centers Inc. | 4.84 TiB | 32.77% | 4.81 TiB | 0.65% |
| f01852677 | Morrisville, North Carolina, US<br>TierPoint, LLC | 4.97 TiB | 33.62% | 4.94 TiB | 0.63% |

Provider Distribution

Deal Data Replication

The table below shows how each unique piece of data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 128.00 GiB | 256.00 GiB | 2 | 1.69% |
| 4.81 TiB | 14.53 TiB | 3 | 98.31% |
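
The Deal Percentage column appears to be each row's share of the total deal volume. A small Python sketch that recomputes it from the rounded figures in the table above; this is an interpretation of the report rather than the checker's actual code, and the function name is hypothetical:

```python
GIB_PER_TIB = 1024  # assumption: binary units

# (total deals made in TiB, number of providers), copied from the table above
rows = [
    (256.00 / GIB_PER_TIB, 2),
    (14.53, 3),
]

def deal_percentages(rows):
    """Return each row's share of the total deal volume, in percent."""
    total = sum(size for size, _ in rows)
    return [100 * size / total for size, _ in rows]

for (size, providers), pct in zip(rows, deal_percentages(rows)):
    print(f"{providers} providers: {pct:.2f}%")
```

With these inputs the two shares round to 1.69% and 98.31%, matching the report.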

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually different applications own different data, which should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

cryptowhizzard commented 1 year ago

Hi @pieceofr

It seems f01985745 is not reachable on the internet and does not have an IP address set. The first retrieval attempt failed. Let us know when your SPs are ready to support retrieval.

cryptowhizzard commented 1 year ago

@pieceofr

Again, when will your SPs support retrieval?

frank-ang commented 1 year ago

Hi @cryptowhizzard , the SP reported they have fixed their retrieval issue with boost. Could you please try again?

cryptowhizzard commented 1 year ago

proposals dealscanner-f01930008-f01985745-24265306: D3KooWGJpT1vdLktshKZRWrA3Apw1eKBxwjPfoJF64AtEwdKrq-12D3KooWNLmQsTamPvVVLNFwJK5yfA1djs498ZyETEAMNE1KS4Kt-1676466301701219439 failed to transfer data: channel 12D3KooWGJpT1vdLktshKZRWrA3Apw1eKBxwjPfoJF64AtEwdKrq-12D3KooWNLmQsTamPvVVLNFwJK5yfA1djs498ZyETEAMNE1KS4Kt-1676466301701219439: graphsync request failed to complete: remote peer is missing block (QmSBtRFGT51Q6Ra5G4EXnPgJ9w7Vw1yvS6WcBmsEZTrWbZ) at path Links/0/Hash/Links/11/Hash/Links/544/Hash
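
The error above names the missing block and the DAG path at which traversal stopped. A hedged Python sketch for pulling those two fields out of such a line; the regular expression is based only on the wording quoted in this comment, other graphsync errors may be phrased differently, and the function name is illustrative:

```python
import re

# Pattern derived solely from the error text quoted in this thread.
MISSING_BLOCK = re.compile(
    r"remote peer is missing block \((?P<cid>[^)]+)\) at path (?P<path>\S+)"
)

def parse_missing_block(log_line: str):
    """Return (cid, path) for a 'missing block' error, or None if absent."""
    match = MISSING_BLOCK.search(log_line)
    if match is None:
        return None
    return match.group("cid"), match.group("path")

line = (
    "graphsync request failed to complete: remote peer is missing block "
    "(QmSBtRFGT51Q6Ra5G4EXnPgJ9w7Vw1yvS6WcBmsEZTrWbZ) "
    "at path Links/0/Hash/Links/11/Hash/Links/544/Hash"
)
print(parse_missing_block(line))
```

Knowing the CID and path makes it easier to tell the SP exactly which block could not be served.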

kernelogic commented 1 year ago

I would like to sign again but I signed last round. I'll leave the opportunity to another notary.

cryptowhizzard commented 1 year ago

Yup, retrieval is working now. Data looks ok. Thanks!

cryptowhizzard commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecqfxeqbtehpex5wfataipk32b4py2k2hnowiiefcjdgqvq4klpye

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

50.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

02dae57a-10c7-4dcf-b3b1-31feee6d125d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecqfxeqbtehpex5wfataipk32b4py2k2hnowiiefcjdgqvq4klpye

liyunzhi-666 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
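
The footnotes above describe a comment command with optional extra addresses. A minimal Python sketch of how such a comment could be parsed, assuming the command must be the first token of the comment; how the real checker bot matches comments is not stated in this thread, and `parse_trigger` is a hypothetical name:

```python
TRIGGER = "checker:manualTrigger"

def parse_trigger(comment: str):
    """Return the list of extra addresses if the comment is a trigger, else None."""
    tokens = comment.split()
    # Assumption: the trigger keyword has to come first in the comment.
    if not tokens or tokens[0] != TRIGGER:
        return None
    return tokens[1:]

print(parse_trigger("checker:manualTrigger"))
print(parse_trigger("checker:manualTrigger f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy"))
print(parse_trigger("Any update here?"))
```

An empty list means the report covers only the client address of the issue; a non-empty list corresponds to the "other related addresses" case.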

Full report

Click here to view the full report.

liyunzhi-666 commented 1 year ago

Hey @pieceofr, keep going. Good report from the CID checker; I can support this round.

liyunzhi-666 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebdbpezhnuw54ochdctmpi3s5cwhdeflj5vuw5ie3rxjpydkkrin4

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

50.00TiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

02dae57a-10c7-4dcf-b3b1-31feee6d125d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebdbpezhnuw54ochdctmpi3s5cwhdeflj5vuw5ie3rxjpydkkrin4

Tom-OriginStorage commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

DataCap allocation requested

100TiB

Id

3cebc91d-cbdc-453e-b2c0-36fe0703b2f9

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

4547.5YiB

Datacap to be granted to reach the total amount requested by the client (3.4 PiB)

-5.49B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 942 | 6 | 50TiB | 33.86 | 15.40TiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 62.63% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

SuperChaiChai commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedvlxajsktk6zreqnilv5zhbpw6ykraqoukhzhcgoasj64nr4jsia

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

100.00TiB

Signer Address

f12mckci3omexgzoeosjvstcfxfe4vqw7owdia3da

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedvlxajsktk6zreqnilv5zhbpw6ykraqoukhzhcgoasj64nr4jsia

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

DataCap allocation requested

100TiB

Id

0214f318-10cb-466e-bf94-7e8ebe6b8938

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

4547.5YiB

Datacap to be granted to reach the total amount requested by the client (3.4 PiB)

-5.49B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 942 | 6 | 50TiB | 33.86 | 1.43TiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

luobin544 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceaamelskhsdrg5iqswb4lxjokz3phcegy2pmdz765wpjxlkkquktg

Address

f1xw7jxdldohcdu4ec4a5mcnbtvutyjptawojmuzy

Datacap Allocated

100.00TiB

Signer Address

f1tbd632f6w62glfaf7wjpimacbnjiz26poyoes2q

Id

0214f318-10cb-466e-bf94-7e8ebe6b8938

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceaamelskhsdrg5iqswb4lxjokz3phcegy2pmdz765wpjxlkkquktg