filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Kernelogic - Open datasets onboarding initiative phase 1 (3/4) #1639

Closed. kernelogic closed this issue 8 months ago

kernelogic commented 1 year ago

Data Owner Name

Kernelogic

Data Owner Country/Region

Canada

Data Owner Industry

Life Science / Healthcare

Website

https://singularity-browser.kernelogic.ca

Social Media

N/A

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

Custom multisig

Identifier

No response

Share a brief history of your project and organization

I have participated in every Slingshot phase and am probably the best-performing "small individual client".

Even though Slingshot v2 has ended, there is still strong demand from SPs to onboard useful data. This application is to onboard open datasets from AWS.

I have a web UI (https://singularity-browser.kernelogic.ca/) that indexes all onboarded files and provides ways to retrieve them.

I have successfully completed a few LDNs on other datasets, and I have a record showing that I have followed the rules of decentralization and have had zero self-dealing.

Some of the recent LDNs I completed:
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1108
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1107
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1106
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1104
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/983

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

Storage working groups, BigD exchange, singularity deal making tool.

Describe the data being stored onto Filecoin

Because each LDN requires a separate client address for the bot to work properly, I am kicking off a series of open dataset onboarding LDNs so that I can onboard more data smoothly. The series covers new AWS open datasets that I have not onboarded before, including but not limited to:

Allen Mouse Brain Atlas
Community Earth System Model Large Ensemble (CESM LENS)
Community Earth System Model v2 Large Ensemble (CESM2 LENS)
Epoch of Reionization Dataset
HIRLAM Weather Model
NIH NCBI Sequence Read Archive (SRA) on AWS
NOAA Global Ensemble Forecast System (GEFS)
NOAA Fundamental Climate Data Records (FCDR)
NOAA Joint Polar Satellite System (JPSS)

All these datasets will be indexed for easy lookup through my website https://singularity-browser.kernelogic.ca

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/allen-mouse-brain-atlas/
https://registry.opendata.aws/ncar-cesm-lens/
https://registry.opendata.aws/epoch-of-reionization/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Sporadic

For how long do you plan to keep this dataset stored on Filecoin

1 to 1.5 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, Europe

How will you be distributing your data to storage providers

HTTP or FTP server

How do you plan to choose storage providers

Slack, Big data exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

No response

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

github-actions[bot] commented 12 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.
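
The timing in the Stale Bot message above (10 days without a response, then closure 4 days later) can be sketched as a small check. This is only an illustration under stated assumptions: the function name `stale_status` and the idea of deriving the state from a single last-activity timestamp are hypothetical, and the real bot's configuration is not shown in this thread.

```python
from datetime import datetime, timedelta

# Both thresholds come from the Stale Bot message; everything else is hypothetical.
STALE_AFTER = timedelta(days=10)
CLOSE_AFTER_STALE = timedelta(days=4)


def stale_status(last_activity: datetime, now: datetime) -> str:
    """Classify an issue as 'active', 'stale', or 'closed' from its last activity time."""
    idle = now - last_activity
    if idle < STALE_AFTER:
        return "active"
    if idle < STALE_AFTER + CLOSE_AFTER_STALE:
        return "stale"
    return "closed"


if __name__ == "__main__":
    last_comment = datetime(2023, 1, 1)
    print(stale_status(last_comment, datetime(2023, 1, 12)))
```

Under this reading, any comment resets the clock, which is why short replies such as "Keep it open" are enough to keep the application from closing.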

kernelogic commented 12 months ago

Actively onboarding deals - anticipate renewal this week.

large-datacap-requests[bot] commented 11 months ago

DataCap Allocation requested

Request number 5

Multisig Notary address

f02049625

Client address

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

DataCap allocation requested

1.25PiB

Id

0834346b-e996-4e27-8b79-c293dda180d7

cryptowhizzard commented 11 months ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceblh2rbe3ajbwhrqm7j4vbozincfe4b2sfjowicrkqww4wswqy3hi

Address

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

Datacap Allocated

1.25PiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

0834346b-e996-4e27-8b79-c293dda180d7

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceblh2rbe3ajbwhrqm7j4vbozincfe4b2sfjowicrkqww4wswqy3hi

Normalnoise commented 11 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 51.80% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.
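
As a rough illustration of the replication figure in the report above, the sketch below computes the share of deals whose data sits with fewer than 4 distinct storage providers. It is not the checker's actual implementation: the `(piece_cid, provider_id)` input shape and the function name are assumptions, and every deal is counted equally here, whereas the real checker may weight deals by size.

```python
from collections import defaultdict


def low_replication_share(deals, min_providers=4):
    """Return the percentage of deals whose piece is held by fewer than
    min_providers distinct providers.

    deals: iterable of (piece_cid, provider_id) pairs (hypothetical shape).
    """
    deals = list(deals)
    if not deals:
        return 0.0

    # Collect the distinct providers that hold each piece.
    providers_by_piece = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)

    # Count deals that belong to an under-replicated piece.
    low = sum(
        1
        for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100.0 * low / len(deals)


if __name__ == "__main__":
    # Made-up sample: piece "A" has 4 providers, piece "B" has 1.
    sample = [("A", "f01"), ("A", "f02"), ("A", "f03"), ("A", "f04"), ("B", "f01")]
    print(f"{low_replication_share(sample):.2f}%")
```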

Normalnoise commented 11 months ago

checker:manualTrigger f1qvbe2vppq7jqo3umkl3rnx4uggkxtxi6f7f2zgi f1rylwniokpxpziavwvtvf7qgbj6p23iqgfu26iea f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka f1z6yigcbg6x7c2o4wasp5vya3jzr63jdjqnzvldi
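
The command form used above (`checker:manualTrigger` followed by optional related addresses) could be parsed along these lines. This is a minimal sketch: the function name `parse_manual_trigger` is hypothetical, and the checker app's real parsing rules are not documented in this thread.

```python
def parse_manual_trigger(comment: str):
    """Return the extra addresses named in a checker trigger comment.

    Returns an empty list for a bare trigger, and None when the comment
    is not a trigger at all.
    """
    tokens = comment.split()
    if not tokens or tokens[0] != "checker:manualTrigger":
        return None
    return tokens[1:]


if __name__ == "__main__":
    print(parse_manual_trigger("checker:manualTrigger"))
    print(parse_manual_trigger(
        "checker:manualTrigger f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka"
    ))
    print(parse_manual_trigger("Keep it open"))
```

A bare trigger reports on the application's own client address only. The reports that follow the four-address form of the command carry the "Other Addresses" section and combine deals from the listed addresses.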

filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Other Addresses[^2]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Normalnoise commented 11 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedbmh3tglot64eqxupp2hpsjwkzd2223l6763d6bj2473jjiawjri

Address

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

Datacap Allocated

1.25PiB

Signer Address

f1c5non5yf35avgcpsqvxu4yj54yyvxorwyjochqq

Id

0834346b-e996-4e27-8b79-c293dda180d7

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedbmh3tglot64eqxupp2hpsjwkzd2223l6763d6bj2473jjiawjri

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 11 months ago

Still onboarding in this series.

sxxfuture-official commented 10 months ago

checker:manualTrigger f1qvbe2vppq7jqo3umkl3rnx4uggkxtxi6f7f2zgi f1rylwniokpxpziavwvtvf7qgbj6p23iqgfu26iea f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka f1z6yigcbg6x7c2o4wasp5vya3jzr63jdjqnzvldi

filplus-checker-app[bot] commented 10 months ago

DataCap and CID Checker Report Summary[^1]

Other Addresses[^2]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

sxxfuture-official commented 10 months ago

Reports look good.

sxxfuture-official commented 10 months ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecrtvmtarfxuv3nvrxu7thts2wlpi5l4u5wms3njkvk4lkc2bojbg

Address

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

Datacap Allocated

1.25PiB

Signer Address

f1foiomqlmoshpuxm6aie4xysffqezkjnokgwcecq

Id

0834346b-e996-4e27-8b79-c293dda180d7

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecrtvmtarfxuv3nvrxu7thts2wlpi5l4u5wms3njkvk4lkc2bojbg

cryptowhizzard commented 10 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacec3csf4jmezlf26u3ort2w3437h76l7i4dbmm3wwdemkbcnzdmdqm

Address

f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka

Datacap Allocated

1.25PiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

0834346b-e996-4e27-8b79-c293dda180d7

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacec3csf4jmezlf26u3ort2w3437h76l7i4dbmm3wwdemkbcnzdmdqm

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 10 months ago

need to keep open, still onboarding.

Sunnyiscoming commented 10 months ago

Hello @kernelogic, per https://github.com/filecoin-project/notary-governance/issues/922 for Open, Public Dataset applicants, please complete the following Fil+ registration form to identify yourself as the applicant. Please also add the contact information for the SP entities you are working with to store copies of the data.

This information will be reviewed by the Fil+ Governance team to confirm its validity, and then the application will be allowed to move forward for additional notary review.

kernelogic commented 10 months ago

checker:manualTrigger f1qvbe2vppq7jqo3umkl3rnx4uggkxtxi6f7f2zgi f1rylwniokpxpziavwvtvf7qgbj6p23iqgfu26iea f1yvbub3wqjcd2bkayk72ace3fopgxog6ix36l7ka f1z6yigcbg6x7c2o4wasp5vya3jzr63jdjqnzvldi

filplus-checker-app[bot] commented 10 months ago

DataCap and CID Checker Report Summary[^1]

Other Addresses[^2]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard.

AlanGreaterheat commented 9 months ago

checker:manualTrigger

filplus-checker-app[bot] commented 9 months ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 38.83% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard.

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 9 months ago

Keep it open

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 9 months ago

Keep it open

large-datacap-requests[bot] commented 9 months ago

The issue reached the total datacap requested. This should be closed

github-actions[bot] commented 8 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 8 months ago

keep alive

Sunnyiscoming commented 8 months ago

Please provide the ID, city, country, and organization of each SP here.