filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Kernelogic - Human PanGenomics Project (1/4) #1543

Closed kernelogic closed 5 months ago

kernelogic commented 1 year ago

Data Owner Name

Human Pangenome Reference Consortium

Data Owner Country/Region

United States

Data Owner Industry

Life Science / Healthcare

Website

https://humanpangenome.org/

Social Media

https://twitter.com/HumanPangenome

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

Custom multisig

Identifier

No response

Share a brief history of your project and organization

I have participated in every Slingshot phase and am probably the best-performing "small individual client".

Even though Slingshot v2 has ended, there is still strong demand from SPs to onboard useful data. This application is to onboard an open dataset from AWS.

I will provide a web UI (https://singularity-browser.kernelogic.ca/) to index all onboarded files and provide ways to retrieve them.

I have successfully completed a few LDNs on other datasets, and I have a record showing that I follow the decentralization rules and have zero self-dealing.

Some of the recent LDNs I completed:
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1108
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1107
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1106
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1104
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/983

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

Storage working groups, BigD exchange, singularity deal making tool.

Describe the data being stored onto Filecoin

https://github.com/human-pangenomics/hpgp-data

This dataset includes sequencing data, assemblies, and analyses for the offspring of ten parent-offspring trios.

Total size: about 1.2PB from bucket arn:aws:s3:::human-pangenomics

I will apply for a total of 20PB of DataCap to store 12 copies (allowing for CAR padding).
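
The sizing above can be sanity-checked with a little arithmetic. The sketch below uses only the figures stated in this application (about 1.2PB of source data, 12 copies, 20PB requested in total). The overhead factor is derived from those figures, not taken from any Filecoin tool, and the function name is hypothetical.

```python
def implied_overhead_factor(source_pb: float, copies: int, requested_pb: float) -> float:
    """Ratio of the requested total to the raw replicated size (source * copies)."""
    raw_total_pb = source_pb * copies
    return requested_pb / raw_total_pb


# Figures quoted in the application text.
raw_total_pb = 1.2 * 12
factor = implied_overhead_factor(1.2, 12, 20.0)

print(f"raw replicated size: {raw_total_pb:.1f} PB")
print(f"implied overhead factor: {factor:.2f}")
```

On these numbers, 12 copies come to roughly 14.4PB before padding, so the 20PB total leaves close to 40% headroom for CAR padding and rounding.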

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/hpgp-data/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Sporadic

For how long do you plan to keep this dataset stored on Filecoin

1 to 1.5 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, Europe, Australia (continent)

How will you be distributing your data to storage providers

HTTP or FTP server

How do you plan to choose storage providers

Slack, Big data exchange

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

PIKNIK f01904630,f01873432
GreaterHeat f01971600,f01992630
HarryM-Filet f02301,f03223,f0240185
BEWELL TECHNOLOGIES LIMITED f01944744,f01943663,f01928097
And many others from BigDExchange

How do you plan to make deals to your storage providers

Singularity

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

100PiB

Client address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

DataCap allocation requested

256TiB

Id

fa687cdf-5a15-467c-85f5-2b3e753586c6

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

There is no previous allocation for this issue.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

NiwanDao commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaced7ii2qhaachinet5mgegzwwetomndptvrptv2zng6oao3uitqtey

Address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

Datacap Allocated

256.00TiB

Signer Address

f1a2lia2cwwekeubwo4nppt4v4vebxs2frozarz3q

Id

fa687cdf-5a15-467c-85f5-2b3e753586c6

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced7ii2qhaachinet5mgegzwwetomndptvrptv2zng6oao3uitqtey

flyworker commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceavia6ame7byexoqwbedd6mta2rcbz5h5ngrptoljtwvuvbkorzhi

Address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

Datacap Allocated

256.00TiB

Signer Address

f1hlubjsdkv4wmsdadihloxgwrz3j3ernf6i3cbpy

Id

fa687cdf-5a15-467c-85f5-2b3e753586c6

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceavia6ame7byexoqwbedd6mta2rcbz5h5ngrptoljtwvuvbkorzhi

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f01858410

Client address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

DataCap allocation requested

512TiB

Id

ba946461-888d-49b8-b3dc-c38c0e361ea4
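
The two allocation sizes requested so far (256TiB, then 512TiB) are consistent with a tranche rule in which each allocation is the lesser of a percentage of the total request and a percentage of the weekly rate. The percentages below are an assumption chosen to match those two figures, not something stated in this thread, and the names are hypothetical.

```python
TIB_PER_PIB = 1024

# Assumed schedule per request number:
# (percent of total request, percent of weekly rate).
ASSUMED_SCHEDULE = {1: (5, 50), 2: (10, 100)}


def tranche_tib(request_number: int, total_pib: int, weekly_pib: int) -> int:
    """Size of one allocation in TiB under the assumed schedule."""
    total_pct, weekly_pct = ASSUMED_SCHEDULE[request_number]
    by_total = total_pib * TIB_PER_PIB * total_pct // 100
    by_weekly = weekly_pib * TIB_PER_PIB * weekly_pct // 100
    return min(by_total, by_weekly)


# Values from the application form: 5PiB total, 1PiB per week.
for n in (1, 2):
    print(n, tranche_tib(n, total_pib=5, weekly_pib=1), "TiB")
```

With the application's 5PiB total and 1PiB weekly rate, the total-request cap is the binding one in both rounds, so the 100PiB weekly figure entered in the request trigger would give the same two sizes under this assumed rule.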

herrehesse commented 1 year ago

Dear Filecoin+ Github applicant,

We have noticed that the dataset is already (partly) on chain. While we appreciate your enthusiasm to contribute to the Filecoin network, we want to remind you that this behaviour may not be beneficial to the network. Can you explain to me what happened here?

Thank you for your understanding and cooperation.

kernelogic commented 1 year ago

Hi, my explanation is the following:

  1. There is no rule that a dataset can be chosen by only one person.
  2. This dataset is not "overly crowded".
  3. The only earlier applicant, GhostByteInc, has not made significant progress on it yet.

Thanks

psh0691 commented 1 year ago

I signed several applications for the Human Genomics project. Can anyone access this public data and apply for DataCap? Please check whether this data is the same as the data in the other applications. If it is the same data, it is doubtful whether separate applications to store it under Fil+ are necessary.

kernelogic commented 1 year ago

@psh0691 Yes, this dataset is public and anyone can access it. That's why multiple people have applied to store it. However, the Fil+ system has no rule saying that only the first applicant can use a dataset, and such a rule would not be fair to other people.

liyunzhi-666 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

liyunzhi-666 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedwika5nvqod5ve4cldwtjyxn3hhndsnmb7ekz6gochktqbz63zgc

Address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

Datacap Allocated

512.00TiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

ba946461-888d-49b8-b3dc-c38c0e361ea4

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedwika5nvqod5ve4cldwtjyxn3hhndsnmb7ekz6gochktqbz63zgc

YuanHeHK commented 1 year ago

$ ./bin/lotus client retrieval-ask f02008883 bafybeifexdxrxm3vgtn67g6dzbrlrd3wfccyunklok4mesevcvdwgk4gfu
Ask: f02008883
Unseal price: 0 FIL
Price per byte: 0 FIL
Payment interval: 1 MiB
Payment interval increase: 1 MiB
Size: 31.75 GiB
Total price for 34091302912 bytes: 0 FIL

$ ./bin/lotus client retrieval-ask f01989014 bafybeihgxmkk4nqxj6smwyglc6sd5rsgqbd52xjs2uk62kgollkiqfodga
Ask: f01989014
Unseal price: 0 FIL
Price per byte: 0 FIL
Payment interval: 1 MiB
Payment interval increase: 1 MiB
Size: 31.75 GiB
Total price for 34091302912 bytes: 0 FIL

$ ./bin/lotus client retrieval-ask f01111113 bafybeid43ifdgtcj7omg3p7mpfy2gxllrsnhksv6vb4xoozulzcu6cukii
Ask: f01111113
Unseal price: 0 FIL
Price per byte: 0 FIL
Payment interval: 1 MiB
Payment interval increase: 1 MiB
Size: 31.75 GiB
Total price for 34091302912 bytes: 0 FIL

I ran a few random retrieval queries, and the sealed data appears to conform to the Fil+ rules.
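
The `Size: 31.75 GiB` and `34091302912 bytes` figures in the output above are consistent with a 32GiB piece after Fr32 padding is accounted for, since that padding leaves 127 usable bytes out of every 128. A minimal check, assuming a 32GiB padded piece size (the output itself does not state the sector size) and using a hypothetical function name:

```python
GIB = 1 << 30


def unpadded_size(padded_bytes: int) -> int:
    """Payload bytes in a piece: 127 of every 128 padded bytes are usable."""
    return padded_bytes * 127 // 128


# Assumption: the deals were sealed as 32GiB pieces.
payload = unpadded_size(32 * GIB)

print(payload)        # 34091302912
print(payload / GIB)  # 31.75
```

All three queried providers report the same byte count, which matches this payload size.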

YuanHeHK commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacea52bwlaqslyvb5yjf3tndmryc3o3w7fxxkkkwqwpmze4vc7hdlds

Address

f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq

Datacap Allocated

512.00TiB

Signer Address

f1fg6jkxsr3twfnyhdlatmq36xca6sshptscds7xa

Id

ba946461-888d-49b8-b3dc-c38c0e361ea4

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacea52bwlaqslyvb5yjf3tndmryc3o3w7fxxkkkwqwpmze4vc7hdlds

kernelogic commented 1 year ago

keepalive

C00kies77 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

kernelogic commented 11 months ago

Need to keep this open. Still onboarding slowly.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

kernelogic commented 11 months ago

Need to keep this open. Still onboarding slowly.

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!
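
The bot messages above imply a simple inactivity rule: an issue is marked stale after 10 days without a response and closed 4 days later, at 14 days. A sketch of that rule follows, with the thresholds taken from the messages and a hypothetical function name; the bot's actual implementation is not shown in this thread.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=10)  # "no responses in the last 10 days"
CLOSE_AFTER = timedelta(days=14)  # stale label plus 4 more days


def issue_state(last_response: datetime, now: datetime) -> str:
    """Classify an issue by how long it has gone without a response."""
    idle = now - last_response
    if idle >= CLOSE_AFTER:
        return "closed"
    if idle >= STALE_AFTER:
        return "stale"
    return "active"


start = datetime(2023, 1, 1)
for days in (3, 10, 14):
    print(days, issue_state(start, start + timedelta(days=days)))
```

Under this rule, any comment within the 4-day stale window (such as a "keepalive") resets the clock.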

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 10 months ago

Actively preparing more cars now.

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 10 months ago

Still actively preparing more cars now.

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 9 months ago

Still actively preparing more cars now.

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 9 months ago

Still actively preparing more cars now.

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 9 months ago

Still actively preparing more cars now.

github-actions[bot] commented 8 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 8 months ago

Still actively preparing more cars now.

Sunnyiscoming commented 8 months ago

Hello, @kernelogic per the https://github.com/filecoin-project/notary-governance/issues/922 for Open, Public Dataset applicants, please complete the following Fil+ registration form to identify yourself as the applicant and also please add the contact information of the SP entities you are working with to store copies of the data.

This information will be reviewed by Fil+ Governance team to confirm validity and then the application will be allowed to move forward for additional notary review.

kernelogic commented 8 months ago

Still actively preparing more cars now.

github-actions[bot] commented 8 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 8 months ago

keepalive

github-actions[bot] commented 7 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 7 months ago

keepalive

github-actions[bot] commented 7 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

kernelogic commented 7 months ago

keepalive, holiday season and mainnet upgrade, things going slow

large-datacap-requests[bot] commented 7 months ago

Client address f1v4jt6gssxwf6oafe4muelr3sv6szzoqezwumhtq is present in other Fil+ applications (#1546, #1545, #1544). This may cause unexpected behavior.

Sunnyiscoming commented 7 months ago

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1546#issuecomment-1855885422

github-actions[bot] commented 6 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.