filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale
[DataCap Application] <FogMeta Lab> - <Open Datasets on AWS - bioinformatics>[2/2] #1649

Closed: hengdingy closed this issue 10 months ago

hengdingy commented 1 year ago

Data Owner Name

FogMeta Lab

Data Owner Country/Region

China

Data Owner Industry

Web3 / Crypto

Website

https://fogmeta.com

Social Media

Twitter: https://twitter.com/FogMeta
GitHub: https://github.com/FogMeta

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

Custom multisig

Identifier

No response

Share a brief history of your project and organization

FogMeta Lab's research spans multiple levels, from system technology, infrastructure, and middleware to services and solutions. It covers future systems, network technology and business, distributed systems and management, information management, and interactive and innovative services. Drawing on industry views and practices, FogMeta also addresses business complexity through operations optimization and other technologies.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

These are all open datasets from the same category ("bioinformatics") of the Registry of Open Data on AWS. Please refer to this link: https://registry.opendata.aws/tag/bioinformatics/.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, graphsplit, others/custom tool

If you answered "other/custom tool" in the previous question, enter the details here

We'd also like to use the Swan Client tool (https://github.com/filswan/go-swan-client#Graphsplit) to prepare the dataset.

Please share a sample of the data

1. 4D Nucleome (4DN)
s3://4dn-open-data-public/

2. Genome Aggregation Database (gnomAD)
s3://gnomad-public-us-east-1/

3. The Singapore Nanopore Expression Data Set
s3://sg-nex-data/

4. PubSeq - Public Sequence Resource
s3://pubseq-datasets/

5. Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)
s3://targetepigenomics/

6. Open Bioinformatics Reference Data for Galaxy
s3://biorefdata/

7. Basic Local Alignment Sequences Tool (BLAST) Databases
s3://ncbi-blast-databases/

8. Broad Genome References
s3://broad-references/

9. DNAStack COVID19 SRA Data
s3://dnastack-covid-19-sra-data/
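
For readers who want to inspect the sample buckets listed above, here is a minimal sketch that splits each `s3://` URI into bucket and prefix and builds an AWS CLI listing command. It assumes the buckets permit anonymous listing (hence `--no-sign-request`), which this application does not confirm for every bucket. The helper names `split_s3_uri` and `anonymous_ls_command` are hypothetical, and only three of the nine URIs are shown for brevity.

```python
from typing import List, Tuple
from urllib.parse import urlparse

# A few of the sample URIs from the list above.
SAMPLE_URIS = [
    "s3://4dn-open-data-public/",
    "s3://gnomad-public-us-east-1/",
    "s3://sg-nex-data/",
]


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def anonymous_ls_command(uri: str) -> List[str]:
    """Build an `aws s3 ls` command that skips request signing.

    Assumption: the bucket allows unauthenticated listing.
    """
    bucket, prefix = split_s3_uri(uri)
    return ["aws", "s3", "ls", f"s3://{bucket}/{prefix}", "--no-sign-request"]


if __name__ == "__main__":
    for uri in SAMPLE_URIS:
        print(" ".join(anonymous_ls_command(uri)))
```

The script only prints the commands; running them requires the AWS CLI to be installed.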

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Monthly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent), Antarctica

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives, Others

How do you plan to choose storage providers

Slack, Partners, Others

If you answered "Others" in the previous question, what is the tool or platform you plan to use

We'd also like to use the FilSwan platform (https://filswan.com/) to choose storage providers that meet our requirements.

If you already have a list of storage providers to work with, fill out their names and provider IDs below

The storage providers we'd like to work with are presented below. Some of them are from the FilSwan platform.
f01955033
f02029115
f03624
f010088
f02301
f08399
f02401
f01955030
f0187709
f01163272
f01402814
f01390330
f01225882
f0717969
f03223
f01395673
f01072221
f0143858
f01786736
f0836160
f032824
f01443744
f01871352
f01907556
f01955028
f01947280
f01946551
f02012951
f01970630
f0240185

How do you plan to make deals to your storage providers

Boost client, Lotus client, Others/custom tool

If you answered "Others/custom tool" in the previous question, enter the details here

Swan Client tool
https://github.com/filswan/go-swan-client

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and get back to you soon.

herrehesse commented 1 year ago

Dear Filecoin+ Github applicant,

We have noticed that some of you are submitting merged datacap requests for datasets that are already (partly) on the chain. While we appreciate your enthusiasm to contribute to the Filecoin network, we want to remind you that this behaviour may not be beneficial to the network in the long run. In fact, this behaviour has been questioned and discussed in issue #832 on the Filecoin notary-governance Github repository.

We encourage you to review the discussions in issue #832. It's important to ensure that your datacap requests are valid, necessary, and add value to the network. By doing so, you can help to maintain the integrity and sustainability of the Filecoin network.

You can find the link to issue #832 here: filecoin-project/notary-governance#832

Thank you for your understanding and cooperation.

Sunnyiscoming commented 1 year ago

I advise you to open an application for a single dataset.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

1PiB

Client address

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

DataCap allocation requested

256TiB

Id

3d793595-336c-4618-8dda-e411db827916

Sunnyiscoming commented 1 year ago

Related proposal: https://github.com/filecoin-project/notary-governance/issues/832. I hope more notaries will review this application and comment on the proposal.

cryptowhizzard commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceawpzj3t7cdq453wccthqkwgii5cgibzhmtngylk6mnuppw6hu4gm

Address

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

Datacap Allocated

256.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

3d793595-336c-4618-8dda-e411db827916

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceawpzj3t7cdq453wccthqkwgii5cgibzhmtngylk6mnuppw6hu4gm

cryptowhizzard commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceap7hyqca7ccace4cbofrdy7xr5634sep4ic6d2wjrgzno6dey2vc

Address

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

Datacap Allocated

256.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

3d793595-336c-4618-8dda-e411db827916

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceap7hyqca7ccace4cbofrdy7xr5634sep4ic6d2wjrgzno6dey2vc

nj-steve commented 1 year ago

@hengdingy Hello, please tell us how much data you have downloaded?

hengdingy commented 1 year ago

> @hengdingy Hello, please tell us how much data you have downloaded?

@nj-steve We are downloading the following datasets. 1 Gbps of bandwidth is currently available, and we have been downloading the data and processing it into CAR files:

1. 4D Nucleome (4DN)
s3://4dn-open-data-public/

2. Genome Aggregation Database (gnomAD)
s3://gnomad-public-us-east-1/

3. The Singapore Nanopore Expression Data Set
s3://sg-nex-data/

4. PubSeq - Public Sequence Resource
s3://pubseq-datasets/

5. Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)
s3://targetepigenomics/

6. Open Bioinformatics Reference Data for Galaxy
s3://biorefdata/

7. Basic Local Alignment Sequences Tool (BLAST) Databases
s3://ncbi-blast-databases/

8. Broad Genome References
s3://broad-references/

9. DNAStack COVID19 SRA Data
s3://dnastack-covid-19-sra-data/
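
To put the 1 Gbps figure in context, here is a back-of-the-envelope sketch relating link speed to the DataCap amounts in this thread (256TiB for the first allocation, 1PiB requested per week). It assumes the link is fully saturated around the clock and ignores protocol overhead and CAR-processing time, so real transfers would be slower. The function names are hypothetical.

```python
TIB = 2**40  # bytes in one tebibyte
PIB = 2**50  # bytes in one pebibyte
SECONDS_PER_DAY = 86_400


def transfer_days(num_bytes: int, link_bps: float) -> float:
    """Days needed to move num_bytes over a link of link_bps bits per second."""
    return num_bytes * 8 / link_bps / SECONDS_PER_DAY


def required_gbps(num_bytes: int, days: float) -> float:
    """Sustained rate in Gbps needed to move num_bytes within the given days."""
    return num_bytes * 8 / (days * SECONDS_PER_DAY) / 1e9


if __name__ == "__main__":
    # Time to download the first 256TiB allocation at 1 Gbps.
    print(f"256 TiB at 1 Gbps: {transfer_days(256 * TIB, 1e9):.1f} days")
    # Sustained rate needed to keep up with 1PiB per week.
    print(f"1 PiB per week needs: {required_gbps(PIB, 7):.1f} Gbps")
```

Under these assumptions, 256TiB takes roughly 26 days at 1 Gbps, and sustaining 1PiB per week would need close to 15 Gbps.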

nj-steve commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacearicsi4fejygtsvgx2zwj6p4w54ycil6l7nxyqqpicmkdrd2euss

Address

f1plxjm7w6xiu5x7xqb6k4aysuvwj5qxyogzsy3gy

Datacap Allocated

256.00TiB

Signer Address

f1xx6555qijma7igpnjspyvdunc4vfxkawnpqy5ii

Id

3d793595-336c-4618-8dda-e411db827916

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacearicsi4fejygtsvgx2zwj6p4w54ycil6l7nxyqqpicmkdrd2euss

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

Normalnoise commented 1 year ago

checker:manualTrigger

hengdingy commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hengdingy commented 1 year ago

checker:manualTrigger

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hengdingy commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

keep alive

Normalnoise commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 1 year ago

checker:manualTrigger

github-actions[bot] commented 12 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 11 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

hengdingy commented 11 months ago

keep alive

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Normalnoise commented 11 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

-- Commented by Stale Bot.