filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <FogMeta Lab> - <Open Datasets on AWS - bioinformatics>[1/2] #1648

Open hengdingy opened 1 year ago

hengdingy commented 1 year ago

Data Owner Name

FogMeta Lab

Data Owner Country/Region

China

Data Owner Industry

Web3 / Crypto

Website

https://fogmeta.com

Social Media

Twitter: https://twitter.com/FogMeta
GitHub: https://github.com/FogMeta

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1gmkpkvrsxvwveyfe3c3y3xejw6flbdrowc5jv6i

Custom multisig

Identifier

No response

Share a brief history of your project and organization

FogMeta Lab's research spans multiple levels, from system technology, infrastructure, and middleware to services and solutions. It covers future systems, network technology and business, distributed systems and management, information management, and interactive and innovative services. Drawing on industry perspectives and practice, FogMeta also addresses business complexity through operations optimization and other technologies.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

These are all open datasets from the same subset (the "bioinformatics" category) on AWS. See the listing here: https://registry.opendata.aws/tag/bioinformatics/.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, graphsplit, others/custom tool

If you answered "other/custom tool" in the previous question, enter the details here

We'd also like to use the Swan Client tool (https://github.com/filswan/go-swan-client#Graphsplit) to prepare the dataset.

Please share a sample of the data

1. 4D Nucleome (4DN)
s3://4dn-open-data-public/

2. Genome Aggregation Database (gnomAD)
s3://gnomad-public-us-east-1/

3. The Singapore Nanopore Expression Data Set
s3://sg-nex-data/

4. PubSeq - Public Sequence Resource
s3://pubseq-datasets/

5. Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)
s3://targetepigenomics/

6. Open Bioinformatics Reference Data for Galaxy
s3://biorefdata/

7. Basic Local Alignment Sequences Tool (BLAST) Databases
s3://ncbi-blast-databases/

8. Broad Genome References
s3://broad-references/

9. DNAStack COVID19 SRA Data
s3://dnastack-covid-19-sra-data/
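
The bucket URIs above can be turned into listing commands with a short script. This is a minimal sketch and not part of the application: the helper names `bucket_name` and `list_command` are hypothetical, and it assumes the buckets permit anonymous listing through the AWS CLI's `--no-sign-request` option, which this thread does not confirm.

```python
from urllib.parse import urlparse

# Sample buckets copied from the application above.
SAMPLE_URIS = [
    "s3://4dn-open-data-public/",
    "s3://gnomad-public-us-east-1/",
    "s3://sg-nex-data/",
    "s3://pubseq-datasets/",
    "s3://targetepigenomics/",
    "s3://biorefdata/",
    "s3://ncbi-blast-databases/",
    "s3://broad-references/",
    "s3://dnastack-covid-19-sra-data/",
]


def bucket_name(uri: str) -> str:
    """Return the bucket part of an s3:// URI (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc


def list_command(uri: str) -> str:
    """Build an AWS CLI listing command.

    Assumption: the bucket allows unsigned (anonymous) requests.
    """
    bucket_name(uri)  # validate the URI before building the command
    return f"aws s3 ls {uri} --no-sign-request"


for uri in SAMPLE_URIS:
    print(list_command(uri))
```

The script only prints the commands; it makes no network requests, so whether each bucket is actually listable still has to be checked by running them.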

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Monthly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent), Antarctica

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives, Others

How do you plan to choose storage providers

Slack, Partners, Others

If you answered "Others" in the previous question, what is the tool or platform you plan to use

We'd also like to use the FilSwan platform (https://filswan.com/) to choose storage providers who meet our requirements.

If you already have a list of storage providers to work with, fill out their names and provider IDs below

The storage providers we'd like to work with are presented below. Some of them are from the FilSwan platform.
f01955033
f02029115
f03624
f010088
f02301
f08399
f02401
f01955030
f0187709
f01163272
f01402814
f01390330
f01225882
f0717969
f03223
f01395673
f01072221
f0143858
f01786736
f0836160
f032824
f01443744
f01871352
f01907556
f01955028
f01947280
f01946551
f02012951
f01970630
f0240185

How do you plan to make deals to your storage providers

Boost client, Lotus client, Others/custom tool

If you answered "Others/custom tool" in the previous question, enter the details here

Swan Client tool
https://github.com/filswan/go-swan-client

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

herrehesse commented 1 year ago

Dear Filecoin+ GitHub applicant,

We have noticed that some of you are submitting merged DataCap requests for datasets that are already (partly) on chain. While we appreciate your enthusiasm to contribute to the Filecoin network, we want to remind you that this behaviour may not be beneficial to the network in the long run. In fact, this behaviour has been questioned and discussed in issue #832 on the Filecoin notary-governance GitHub repository.

We encourage you to review the discussions in issue #832. It's important to ensure that your datacap requests are valid, necessary, and add value to the network. By doing so, you can help to maintain the integrity and sustainability of the Filecoin network.

You can find the link to issue #832 here: filecoin-project/notary-governance#832

Thank you for your understanding and cooperation.

Sunnyiscoming commented 1 year ago

I advise you to open an application for a single dataset.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

1PiB

Client address

f1gmkpkvrsxvwveyfe3c3y3xejw6flbdrowc5jv6i

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1gmkpkvrsxvwveyfe3c3y3xejw6flbdrowc5jv6i

DataCap allocation requested

256TiB

Id

7e41b64c-4bab-4783-b191-97f935df8a52
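
The 256TiB figure in this request is exactly 5% of the 5PiB total (5PiB = 5120TiB). The sketch below reproduces that arithmetic under an assumed first-tranche rule, the lesser of 5% of the total request and 50% of the weekly rate. That rule is not stated anywhere in this thread, and the function name is hypothetical.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def first_allocation_tib(total_pib: int, weekly_pib: int) -> float:
    """First tranche in TiB under the ASSUMED rule:
    min(5% of total request, 50% of weekly rate)."""
    total_tib = total_pib * TIB_PER_PIB
    weekly_tib = weekly_pib * TIB_PER_PIB
    five_percent_of_total = total_tib / 20   # 5% of the total
    half_of_weekly = weekly_tib / 2          # 50% of the weekly rate
    return min(five_percent_of_total, half_of_weekly)


# Values from this application: 5PiB total, 1PiB weekly.
print(first_allocation_tib(5, 1))
```

With the values from this application, the 5% term (256TiB) is smaller than the 50% term (512TiB), which matches the amount the bot requested.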

Sunnyiscoming commented 1 year ago

Related proposal: https://github.com/filecoin-project/notary-governance/issues/832. I hope more notaries will review this application and comment on this proposal.

cryptowhizzard commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedhmbatuil7daofbkr5c4sf5jbhymldymlg7oky2usd5kxi2wktiq

Address

f1gmkpkvrsxvwveyfe3c3y3xejw6flbdrowc5jv6i

Datacap Allocated

256.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

7e41b64c-4bab-4783-b191-97f935df8a52

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedhmbatuil7daofbkr5c4sf5jbhymldymlg7oky2usd5kxi2wktiq

nj-steve commented 1 year ago

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1649#issuecomment-1468304580

nj-steve commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedxyy5ageim3duhr5wp2y2h32eeqne3g7a7ebrrbpuajxkclwefwe

Address

f1gmkpkvrsxvwveyfe3c3y3xejw6flbdrowc5jv6i

Datacap Allocated

256.00TiB

Signer Address

f1xx6555qijma7igpnjspyvdunc4vfxkawnpqy5ii

Id

7e41b64c-4bab-4783-b191-97f935df8a52

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedxyy5ageim3duhr5wp2y2h32eeqne3g7a7ebrrbpuajxkclwefwe

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

Joss-Hua commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.
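
The distribution and replication warnings in the report can be illustrated with a small calculation. This is a sketch under stated assumptions, not the checker's actual implementation: the deal records, piece identifiers, and function names are hypothetical, and weighting by deal size is an assumption, since the thread does not say how the bot computes its percentages.

```python
from collections import defaultdict

# Hypothetical deals: (provider_id, piece_id, size_in_GiB).
# All deals go to one provider, mirroring the report above.
DEALS = [
    ("f01146045", "piece-a", 32),
    ("f01146045", "piece-b", 32),
    ("f01146045", "piece-c", 32),
]


def top_provider_share(deals):
    """Return (provider, percent of total deal size held by that provider)."""
    if not deals:
        raise ValueError("no deals to analyse")
    size_by_provider = defaultdict(int)
    for provider, _piece, size in deals:
        size_by_provider[provider] += size
    total = sum(size_by_provider.values())
    provider, size = max(size_by_provider.items(), key=lambda kv: kv[1])
    return provider, 100.0 * size / total


def under_replicated_share(deals, min_providers=2):
    """Percent of deal size whose piece is stored with fewer than
    `min_providers` distinct providers."""
    if not deals:
        raise ValueError("no deals to analyse")
    providers_by_piece = defaultdict(set)
    for provider, piece, _size in deals:
        providers_by_piece[piece].add(provider)
    total = sum(size for _p, _c, size in deals)
    under = sum(
        size
        for _p, piece, size in deals
        if len(providers_by_piece[piece]) < min_providers
    )
    return 100.0 * under / total


provider, share = top_provider_share(DEALS)
print(f"top provider {provider}: {share:.2f}%")
print(f"under-replicated: {under_replicated_share(DEALS):.2f}%")
```

Spreading the same pieces across additional providers lowers both numbers at once, which is why the two warnings tend to appear together.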

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

keep alive

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

keep alive

github-actions[bot] commented 12 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 12 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 12 months ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 11 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01146045: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard.

aggregation-and-compliance-bot[bot] commented 11 months ago

Client f02085938 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Percent of used DataCap stored with top provider | < 75 | The percent of data from the client that is stored with their top provider is 100%. This should be less than 75%. |

aggregation-and-compliance-bot[bot] commented 11 months ago

Client f02085938 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 14 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Percent of used DataCap stored with top provider | < 75 | The percent of data from the client that is stored with their top provider is 100%. This should be less than 75%. |

hengdingy commented 11 months ago

checker:manualTrigger