joshua-ne / FIL_DC_Allocator_1022

This repo serves as a bookkeeping tool for DC allocation from Allocator_1022

[DataCap Application] <Xenogic> - <Xenogic027> #34

Open andy198811 opened 1 month ago

andy198811 commented 1 month ago

Version

1

DataCap Applicant

Xenogic

Project ID

Xenogic027

Data Owner Name

Google DeepMind

Data Owner Country/Region

United States

Data Owner Industry

IT & Technology Services

Website

https://deepmind.google/

Social Media Handle

https://x.com/GoogleDeepMind

Social Media Type

Twitter

What is your role related to the dataset

Data Preparer

Total amount of DataCap being requested

10 PiB

Expected size of single dataset (one copy)

2 PiB

Number of replicas to store

4

Weekly allocation of DataCap requested

700 TiB
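
As a quick sanity check on the figures in this form, the sketch below works through the arithmetic: single-copy size times replica count, and how long the full request would last at the stated weekly rate. The variable names are illustrative only, and the conversion assumes binary units (1 PiB = 1024 TiB).

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB

total_requested_tib = 10 * TIB_PER_PIB  # "Total amount of DataCap being requested": 10 PiB
single_copy_tib = 2 * TIB_PER_PIB       # "Expected size of single dataset (one copy)": 2 PiB
replicas = 4                            # "Number of replicas to store"
weekly_rate_tib = 700                   # "Weekly allocation of DataCap requested": 700 TiB

# Total footprint if every replica is a full copy of the dataset.
replicated_tib = single_copy_tib * replicas

# Weeks needed to consume the full request at the stated weekly rate.
weeks_to_consume = total_requested_tib / weekly_rate_tib

print(f"replicated size: {replicated_tib / TIB_PER_PIB:.0f} PiB")
print(f"weeks at stated weekly rate: {weeks_to_consume:.1f}")
```

By this arithmetic, four full copies come to 8 PiB against the 10 PiB requested; the application does not explain the difference.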

On-chain address for first allocation

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Xenogic is a team specializing in big data, active since 2022. We focus on exploring innovative commercial models in data collection, storage, processing, transfer, and application, aiming to harness the power of data in an era defined by AI and rapid technological advancements. With a commitment to redefining data-driven solutions, Xenogic strives to create impactful strategies that meet the evolving needs of modern businesses and industries, pushing the boundaries of how data can be utilized in a world of information and AI-driven transformation.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The Kinetics datasets, developed by DeepMind, are large-scale collections of video clips designed for human action recognition research. They include various versions (e.g., Kinetics-400, Kinetics-600, and Kinetics-700) and cover hundreds of human action categories, making them invaluable for training and evaluating AI models in video analysis and behavior understanding. Secure storage is essential to maintain data integrity, enable reproducibility in research, and protect the availability of this critical resource for future advancements in machine learning and computer vision.

Where was the data currently stored in this dataset sourced from

Google Cloud

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer. What is your location (Country/Region)

Hong Kong

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

The data is collected and packed into CAR files by our in-house tool stack, which has proven to be accurate and efficient.

If you are not preparing the data, who will prepare the data? (Provide name and business)

No response

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

AFAIK, it has not been stored yet.

Please share a sample of the data

https://s3.amazonaws.com/kinetics/400/train/part_0.tar.gz

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

Permanently

In which geographies do you plan on making storage deals

Asia other than Greater China

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives

How did you find your storage providers

Partners, Others

If you answered "Others" in the previous question, what is the tool or platform you used

No response

Please list the provider IDs and location of the storage providers you will be working with.

f01422327, Japan
f02252024, Japan
f02252023, Japan
f03230392, Vietnam
f01111110, Vietnam
f01909705, Vietnam
f03230423, Malaysia
f03232064, Malaysia
f03232134, Malaysia

How do you plan to make deals to your storage providers

Boost client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

datacap-bot[bot] commented 1 month ago

Application is waiting for allocator review

joshua-ne commented 4 weeks ago

Hi @andy198811, we have done the SP localization verification and sample checks offline, and I have also checked for online copies of your dataset. It seems to be a new set.

So I think we are good to go for now. But before we officially start a test run with 50TiB of DataCap, can you confirm that all the SPs you are working with will store the unsealed files of the sectors and support public retrieval, e.g. via SPARK?

andy198811 commented 4 weeks ago

We confirm all the SPs will store the hot copies of the sectors and support SPARK retrieval. Thanks!

datacap-bot[bot] commented 4 weeks ago

Datacap Request Trigger

Total DataCap requested

10 PiB

Expected weekly DataCap usage rate

700 TiB

DataCap Amount - First Tranche

50TiB

Client address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

datacap-bot[bot] commented 4 weeks ago

DataCap Allocation requested

Multisig Notary address

Client address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

DataCap allocation requested

50TiB

Id

61f7840a-e7ea-4f1e-bdcc-1a3beec290bb

datacap-bot[bot] commented 4 weeks ago

Application is ready to sign

datacap-bot[bot] commented 4 weeks ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedifhh4n5544djejhh6b4xuv4r6lxvxxaarziuh5clfomuewu5e74

Address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

Datacap Allocated

50TiB

Signer Address

f1sfffys4o2w64rdpd3alpmvpvj4ik6x2iyjsjmry

Id

61f7840a-e7ea-4f1e-bdcc-1a3beec290bb

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedifhh4n5544djejhh6b4xuv4r6lxvxxaarziuh5clfomuewu5e74

datacap-bot[bot] commented 4 weeks ago

Application is Granted

datacap-bot[bot] commented 4 weeks ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.
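
The bot's notice above is a usage-threshold check. The thread does not show how the bot computes it, so the following is only a minimal sketch under stated assumptions: the function name, its parameters, and the idea of deriving usage from the remaining balance are all hypothetical.

```python
def should_consider_next_tranche(
    allocated_tib: float,
    remaining_tib: float,
    threshold: float = 0.75,  # the 75% figure quoted in the bot's notice
) -> bool:
    """Return True once the used fraction of a tranche reaches the threshold.

    Hypothetical helper: the real bot's inputs and logic are not shown here.
    """
    if allocated_tib <= 0:
        raise ValueError("allocated_tib must be positive")
    used_fraction = (allocated_tib - remaining_tib) / allocated_tib
    return used_fraction >= threshold


# Example with the 50 TiB first tranche: 12.5 TiB left means 75% used.
print(should_consider_next_tranche(50, 12.5))
print(should_consider_next_tranche(50, 20))
```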

andy198811 commented 4 weeks ago

Hi, thank you for your support. We have onboarded the 50TiB of DataCap with the following 4 SPs. Could you please proceed with another round so that the SPs can continue sealing without interruption? Thanks!

joshua-ne commented 4 weeks ago

checker:manualTrigger

datacap-bot[bot] commented 4 weeks ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

joshua-ne commented 4 weeks ago

trigger:run_retrieval_test method=lassie sp_list=f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10
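
The trigger comment above follows a simple `key=value` layout. The parser used by the allocator's own bot is not shown in this thread, so the sketch below is only an illustration of how such a comment could be split into fields; the function name and the returned dictionary shape are assumptions.

```python
def parse_retrieval_trigger(comment: str) -> dict:
    """Split a retrieval-test trigger comment into its fields.

    Hypothetical helper: the grammar is inferred from the comments in this
    thread, not taken from the bot's source.
    """
    prefix = "trigger:run_retrieval_test"
    text = comment.strip()
    if not text.startswith(prefix):
        raise ValueError("not a retrieval test trigger")

    params = {}
    for token in text[len(prefix):].split():
        key, sep, value = token.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed parameter: {token!r}")
        params[key] = value

    return {
        "method": params["method"],
        "sp_list": params["sp_list"].split(","),  # comma-separated miner IDs
        "client": params["client"],
        "limit": int(params["limit"]),            # number of retrieval attempts per SP
    }


parsed = parse_retrieval_trigger(
    "trigger:run_retrieval_test method=lassie "
    "sp_list=f03230392,f03230423,f03232064,f03232134 "
    "client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10"
)
print(parsed["method"], len(parsed["sp_list"]), parsed["limit"])
```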

myfil512 commented 4 weeks ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f03230392 | 0% | 0 | 10 |
| f03230423 | 0% | 0 | 10 |
| f03232064 | 0% | 0 | 10 |
| f03232134 | 0% | 0 | 10 |

joshua-ne commented 4 weeks ago

Hi @andy198811, glad to see that you are onboarding data quickly. Although I can see that you have consumed all of the 50TiB DataCap, I cannot see any of the deals as active, either with the official bot or with our own bot. Maybe the sectors are not completely sealed or submitted yet?

I won't be able to sign the next round until I see decent retrieval from at least some of your SPs. Let's wait and see; I will trigger the checker bot every now and then.

joshua-ne commented 3 weeks ago

checker:manualTrigger

datacap-bot[bot] commented 3 weeks ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.

joshua-ne commented 3 weeks ago

trigger:run_retrieval_test method=lassie sp_list=f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10

myfil512 commented 3 weeks ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f03230392 | 80% | 8 | 2 |
| f03230423 | 100% | 10 | 0 |
| f03232064 | 100% | 10 | 0 |
| f03232134 | 100% | 10 | 0 |

joshua-ne commented 3 weeks ago

Everything looks good to me. I will continue to sign another 256TiB for you since, from our communication, the SPs working with this client are very DC-hungry now and have a very TIGHT timeline for sealing, as they are renting the sealing workers at a pretty high expense.

Since we are making some exceptions here, we will keep an even closer watch on the data/retrieval behavior of your SPs. Please confirm this with your SPs.

joshua-ne commented 3 weeks ago

@andy198811 Although I am ready to sign, it seems that the data is not synchronized to the bot yet, so it does not allow me to sign the next round. We may need to wait a few days. Sorry for any inconvenience caused.

andy198811 commented 3 weeks ago

Thank you for your update! Hope it will be ready soon. Please let us know if you have ANY update.

joshua-ne commented 3 weeks ago

@andy198811 I just got an update from the fil-plus team. This is a known bug that should be fixed tomorrow; hopefully I will be able to sign then. Stay tuned.

datacap-bot[bot] commented 3 weeks ago

Application is in Refill

joshua-ne commented 3 weeks ago

Signed. Happy sealing.

joshua-ne commented 3 weeks ago

trigger:run_retrieval_test method=lassie sp_list=f01422327,f02252024,f02252023,f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10

myfil512 commented 3 weeks ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f01422327 | 80% | 8 | 2 |
| f02252024 | 80% | 8 | 2 |
| f02252023 | 80% | 8 | 2 |
| f03230392 | 90% | 9 | 1 |
| f03230423 | 100% | 10 | 0 |
| f03232064 | 100% | 10 | 0 |
| f03232134 | 100% | 10 | 0 |

joshua-ne commented 3 weeks ago

Everything looks good to me. I will continue to sign another 512TiB of DataCap for you. Keep it up.

datacap-bot[bot] commented 3 weeks ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceanm364cdbfymw67x5nge2v427s5wv34vyrhli5bqdw2pwxzmsjjg

Address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

Datacap Allocated

512TiB

Signer Address

f1sfffys4o2w64rdpd3alpmvpvj4ik6x2iyjsjmry

Id

d92f9f64-6dcb-43d7-97a9-fb953574e4cf

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceanm364cdbfymw67x5nge2v427s5wv34vyrhli5bqdw2pwxzmsjjg

datacap-bot[bot] commented 3 weeks ago

Application is Granted

andy198811 commented 3 weeks ago

Thank you for the timely support; this really helps us a lot! We will keep working with the SPs to ensure retrievability.

And since we are reaching out to more SPs, we would like to add two more SPs to the list. Please approve. Thanks!

f01111110, Vietnam
f01909705, Vietnam

datacap-bot[bot] commented 3 weeks ago

Issue has been modified. Changes below:

(OLD vs NEW)

Please list the provider IDs and location of the storage providers you will be working with:

f01422327, Japan
f02252024, Japan
f02252023, Japan
f03230392, Vietnam
f01111110, Vietnam
f01909705, Vietnam
f03230423, Malaysia
f03232064, Malaysia
f03232134, Malaysia

vs

f01422327, Japan
f02252024, Japan
f02252023, Japan
f03230392, Vietnam
f03230423, Malaysia
f03232064, Malaysia
f03232134, Malaysia

State: ChangesRequested vs Granted
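
The two provider lists in the bot's change notice can be compared with a set difference. This is only an illustrative sketch; the variable names are hypothetical, and the lists are named by length rather than as old and new, since the notice does not make the ordering explicit.

```python
# Provider IDs copied from the two lists in the change notice.
longer_list = [
    "f01422327", "f02252024", "f02252023",
    "f03230392", "f01111110", "f01909705",
    "f03230423", "f03232064", "f03232134",
]
shorter_list = [
    "f01422327", "f02252024", "f02252023",
    "f03230392",
    "f03230423", "f03232064", "f03232134",
]

# IDs present in one list but not the other.
only_in_longer = sorted(set(longer_list) - set(shorter_list))
only_in_shorter = sorted(set(shorter_list) - set(longer_list))

print("only in longer list:", only_in_longer)
print("only in shorter list:", only_in_shorter)
```

The difference is the two Vietnam providers the applicant asked to add.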

datacap-bot[bot] commented 2 weeks ago

Issue information change request has been approved.

datacap-bot[bot] commented 2 weeks ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.

joshua-ne commented 2 weeks ago

trigger:run_retrieval_test method=lassie sp_list=f01422327,f02252024,f02252023,f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10

myfil512 commented 2 weeks ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f01422327 | 80% | 8 | 2 |
| f02252024 | 80% | 8 | 2 |
| f02252023 | 80% | 8 | 2 |
| f03230392 | 100% | 10 | 0 |
| f03230423 | 100% | 10 | 0 |
| f03232064 | 60% | 6 | 4 |
| f03232134 | 90% | 9 | 1 |

joshua-ne commented 2 weeks ago

The retrieval looks okay overall, but f03232064 may need some optimization. Will support for another round.

datacap-bot[bot] commented 2 weeks ago

Application is in Refill

datacap-bot[bot] commented 2 weeks ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceduruwcgcj5nkl35sr5druh47vxtzojyizdcocv76adm6mh2faurk

Address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

Datacap Allocated

1PiB

Signer Address

f1sfffys4o2w64rdpd3alpmvpvj4ik6x2iyjsjmry

Id

a7bcd0b5-03de-4d37-b040-311765f3a454

You can check the status here https://filfox.info/en/message/bafy2bzaceduruwcgcj5nkl35sr5druh47vxtzojyizdcocv76adm6mh2faurk

datacap-bot[bot] commented 2 weeks ago

Application is Granted

datacap-bot[bot] commented 2 weeks ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.

joshua-ne commented 2 weeks ago

trigger:run_retrieval_test method=lassie sp_list=f01422327,f02252024,f02252023,f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10

myfil512 commented 2 weeks ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f01422327 | 30% | 3 | 7 |
| f02252024 | 40% | 4 | 6 |
| f02252023 | 50% | 5 | 5 |
| f03230392 | 100% | 10 | 0 |
| f03230423 | 100% | 10 | 0 |
| f03232064 | 70% | 7 | 3 |
| f03232134 | 80% | 8 | 2 |

joshua-ne commented 2 weeks ago

@andy198811 It seems the retrieval rates of f01422327 and f02252024 are not good. Are you still in the process of sealing? We may wait a bit longer to see whether they improve before the next round of signing.
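
The retrieval rate in these reports is the success count divided by the total attempts. The sketch below recomputes the rates from the most recent table and flags providers under a cutoff. The 50% cutoff is an assumption chosen purely for illustration; the thread does not state a numeric threshold, and the helper name is hypothetical.

```python
def retrieval_rate(success: int, fail: int) -> float:
    """Fraction of successful retrievals; 0.0 when nothing was attempted."""
    total = success + fail
    return success / total if total else 0.0


# (success_count, fail_count) per miner, copied from the report above.
results = {
    "f01422327": (3, 7),
    "f02252024": (4, 6),
    "f02252023": (5, 5),
    "f03230392": (10, 0),
    "f03230423": (10, 0),
    "f03232064": (7, 3),
    "f03232134": (8, 2),
}

MIN_RATE = 0.5  # assumed cutoff, not taken from the thread

flagged = [
    miner_id
    for miner_id, (success, fail) in results.items()
    if retrieval_rate(success, fail) < MIN_RATE
]
print(flagged)
```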

joshua-ne commented 1 week ago

trigger:run_retrieval_test method=lassie sp_list=f01422327,f02252024,f02252023,f03230392,f03230423,f03232064,f03232134 client=f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq limit=10

myfil512 commented 1 week ago

| miner_id | retrieval_rate | retrieval_success_counts | retrieval_fail_counts |
| --- | --- | --- | --- |
| f01422327 | 100% | 10 | 0 |
| f02252024 | 100% | 10 | 0 |
| f02252023 | 100% | 10 | 0 |
| f03230392 | 100% | 10 | 0 |
| f03230423 | 100% | 10 | 0 |
| f03232064 | 100% | 10 | 0 |
| f03232134 | 100% | 10 | 0 |

joshua-ne commented 1 week ago

All retrievals look perfect this round; I will support another round. Please keep this up!

joshua-ne commented 1 week ago

Hi, can you please submit your dataset card by the end of next week? This will be mandatory for subsequent DataCap allocations. Thank you.

https://github.com/joshua-ne/FIL_DC_Allocator_1022_Dataset_Card/issues/new/choose

datacap-bot[bot] commented 1 week ago

Application is in Refill

datacap-bot[bot] commented 1 week ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacec3ditbw7sq6vy4mmlv5asv4aaa7mayuv47nfqkeuvdp7njj6jnl2

Address

f1bbppztpteepgaz46dyzvpwr7ajnqq7gvhlba6rq

Datacap Allocated

1PiB

Signer Address

f1sfffys4o2w64rdpd3alpmvpvj4ik6x2iyjsjmry

Id

757afb0d-fd65-478d-a169-3bca1cdcf6b0

You can check the status here https://filfox.info/en/message/bafy2bzacec3ditbw7sq6vy4mmlv5asv4aaa7mayuv47nfqkeuvdp7njj6jnl2
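
To put the tranches in context, the following sketch totals the allocations that have a "Request Approved" record in this thread and compares the sum with the 10 PiB requested. It assumes binary units (1 PiB = 1024 TiB). The 256TiB round mentioned in discussion has no approval record in this thread and is deliberately left out.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB

# "Datacap Allocated" values from the approval records in this thread, in TiB.
approved_tranches_tib = [
    50,               # first tranche
    512,
    1 * TIB_PER_PIB,
    1 * TIB_PER_PIB,
]

total_requested_tib = 10 * TIB_PER_PIB
total_allocated_tib = sum(approved_tranches_tib)
remaining_tib = total_requested_tib - total_allocated_tib

print(f"allocated so far: {total_allocated_tib} TiB")
print(f"share of request: {total_allocated_tib / total_requested_tib:.1%}")
print(f"remaining: {remaining_tib} TiB")
```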