filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Galaxy Evolution Explorer Public Dataset #1572

Closed RoelandHC closed 7 months ago

RoelandHC commented 1 year ago

Data Owner Name

NASA

Data Owner Country/Region

United States

Data Owner Industry

Environment

Website

https://www.jpl.nasa.gov/missions/galaxy-evolution-explorer-galex

Social Media

https://www.jpl.nasa.gov/missions/galaxy-evolution-explorer-galex

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1px4gq7csxy5dmsphrsfaimdj5b3ekfplr6aqthi

Custom multisig

Identifier

No response

Share a brief history of your project and organization

JPL is a research and development lab federally funded by NASA and managed by Caltech.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The Galaxy Evolution Explorer Satellite (GALEX) was a NASA mission led by the California Institute of Technology, whose primary goal was to investigate how star formation in galaxies evolved from the early Universe up to the present. GALEX used microchannel plate detectors to obtain direct images in the near-UV (NUV) and far-UV (FUV), and a grism to disperse light for low resolution spectroscopy. More information about GALEX is available at MAST.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

lotus

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/galex/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

2 years

In which geographies do you plan on making storage deals

Asia other than Greater China, North America, South America, Europe

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives

How do you plan to choose storage providers

Slack, Big data exchange

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

cryptowhizzard commented 1 year ago

Dear applicant,

Thank you for applying for datacap. As a Filecoin FIL+ notary, I am screening your application and conducting due diligence.

Looking at your application, I have some questions. As you are brand new on GitHub and have no history of past applications, it seems to me that applying for 5PB of datacap is a lot. One needs comprehensive knowledge of Filecoin, packing of data, distribution of data and all the requirements that come with it. Are you brand new in the Filecoin space, or have you applied for datacap in the past under different GitHub account names?

Can you show us visible proof of the size of your data and the storage systems you have there?

As a last question, I would like you to fill out this form to provide us with the necessary information to make an educated decision on whether to support your LDN request.

Thanks!

Sunnyiscoming commented 1 year ago

Please provide a detailed description of your organization and the SPs you will cooperate with.

RoelandHC commented 1 year ago

We want to use the Filecoin network to store useful and valuable data, and we would like to use distributed storage to keep the data secure. At this point we do not know the exact providers in advance.

herrehesse commented 1 year ago

@RoelandHC can you answer, in detail, the questions of @cryptowhizzard above?

Sunnyiscoming commented 1 year ago

Can you provide more information about your organization?

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1px4gq7csxy5dmsphrsfaimdj5b3ekfplr6aqthi

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1px4gq7csxy5dmsphrsfaimdj5b3ekfplr6aqthi

DataCap allocation requested

250TiB

Id

1a3ca18c-3d96-40f6-8b3a-8ac2a4f11787

RoelandHC commented 1 year ago

@Sunnyiscoming This data is from JPL (the Jet Propulsion Laboratory), which holds a unique place in the universe. We found that they are a leader in robotic space exploration, sending rovers to Mars, probes into the farthest reaches of the solar system, and satellites to advance understanding of our home planet.

cryptowhizzard commented 1 year ago

Again:

Dear applicant,

Thank you for applying for datacap. As a Filecoin FIL+ notary, I am screening your application and conducting due diligence.

Looking at your application, I have some questions. As you are brand new on GitHub and have no history of past applications, it seems to me that applying for 5PB of datacap is a lot. One needs comprehensive knowledge of Filecoin, packing of data, distribution of data and all the requirements that come with it. Are you brand new in the Filecoin space, or have you applied for datacap in the past under different GitHub account names?

Can you show us visible proof of the size of your data and the storage systems you have there?

As a last question, I would like you to fill out this form to provide us with the necessary information to make an educated decision on whether to support your LDN request.

Thanks!

RoelandHC commented 1 year ago

@cryptowhizzard Thank you for your attention to our application. I have read your form. There is a lot of information here that I don't think needs to be disclosed. I think that it will put my privacy at risk. If I need your help, I will contact you here. Thanks

lvschouwen commented 1 year ago

Hi @RoelandHC

Thank you for this LDN request. I have been checking the AWS bucket size and it is 32.8 TiB.

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/galex --recursive --human-readable --summarize
Total Objects: 15723397
   Total Size: 32.8 TiB

You are requesting 5 PiB of datacap. So you are planning to store 155 copies of this dataset on Filecoin?

Looking forward to a storage plan of this application!
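
The copy count raised in the question above follows from dividing the requested DataCap by the measured size of the galex prefix. A minimal sketch of that arithmetic, assuming binary units (1 PiB = 1024 TiB); the variable names are illustrative and do not come from any Fil+ tooling:

```python
# Figures taken from the application form and the `aws s3 ls` summary above.
TIB_PER_PIB = 1024  # assumption: binary units throughout

requested_tib = 5 * TIB_PER_PIB   # 5 PiB of DataCap requested
galex_size_tib = 32.8             # measured size of s3://stpubdata/galex

# Number of full replicas the requested DataCap could cover.
copies = requested_tib / galex_size_tib
print(f"{copies:.1f} copies")
```

Under these assumptions the result is roughly 156, in line with the figure of about 155 copies mentioned in the question.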

TrueWarriors8 commented 1 year ago

@swatchliu @metawaveinfo

Can you sign it? We have already paid you a lot of money and want to start working.

RoelandHC commented 1 year ago

Hello @lvschouwen

The total size of this dataset is 591.1 TiB

You can retrieve that data with the following command according to AWS official source:

aws s3 ls --summarize --human-readable --recursive s3://bucket-name/

RoelandHC commented 1 year ago

@swatchliu @MetaWaveInfo

Can you sign it? We have paid you a lot of money and want to start working.

@TrueWarriors8 Who are you?

lvschouwen commented 1 year ago

Hello @lvschouwen

The total size of this dataset is 591.1 TiB

You can retrieve that data with the following command according to AWS official source:

aws s3 ls --summarize --human-readable --recursive s3://bucket-name/

That is exactly what I did ... Have you seen the command used in https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1572#issuecomment-1440130828 ?

Checking https://registry.opendata.aws/galex/, it mentions that the bucket name is s3://stpubdata/galex/. Galex is a subdirectory of the bigger bucket "stpubdata", which indeed is much larger but holds more than only Galex.

$ ls -la /stpubdata
total 32
drwxrwxrwx  2 root root 4096 Dec 14 20:54 .
drwxr-xr-x 49 root root 4096 Feb 14 09:34 ..
drwxrwxrwx  2 root root 4096 Dec 14 20:54 galex
drwxrwxrwx  2 root root 4096 Dec 14 20:54 hst
drwxrwxrwx  2 root root 4096 Dec 14 20:54 k2
drwxrwxrwx  2 root root 4096 Dec 14 20:54 kepler
drwxrwxrwx  2 root root 4096 Dec 14 20:54 panstarrs
drwxrwxrwx  2 root root 4096 Dec 14 20:54 tess

So please tell us; what do you want to store in this LDN request? Because only Galex is 32.8 TiB and not 591 TiB as you claim.

RoelandHC commented 1 year ago

@lvschouwen Hi, we are glad to solve your doubts. But can you tell me which organization are you from?

I have noticed that our dataset is the same as cryptowhizzard's (Wijnand Schouten). He also agreed that the size of this dataset is 591.1 TiB.

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/487#issuecomment-1191783881 =)

His applications: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/487 https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1482 https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1555

Have you checked it?

cryptowhizzard commented 1 year ago

Hi @RoelandHC

Happy to help. Things change in life as well as on AWS :) The size of this set has changed on AWS.

I verified, @lvschouwen is right here.

lvschouwen commented 1 year ago

Sorry my reply took a little time. Those aws commands (see later in my post) take some time to run.

@lvschouwen Hi, we are glad to solve your doubts. But can you tell me which organization are you from?

I had no doubts, I just want to make sure that your application matches the data provided. I do not represent an organisation, I am a FIL+ Community Member asking annoying questions 😸

I have noticed that our dataset is the same as cryptowhizzard's (Wijnand Schouten). He also agreed that the size of this dataset is 591.1 TiB.

These are the sizes for the different sub directories under s3://stpubdata

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/galex --recursive --human-readable --summarize
Total Objects: 15723397
   Total Size: 32.8 TiB

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/hst --recursive --human-readable --summarize
Total Objects: 34546005
   Total Size: 320.0 TiB

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/k2 --recursive --human-readable --summarize
Total Objects: 1282924
   Total Size: 4.1 TiB

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/kepler --recursive --human-readable --summarize
Total Objects: 6217672
   Total Size: 17.2 TiB

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/panstarr --recursive --human-readable --summarize
Total Objects: 1
   Total Size: 0 Bytes

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://stpubdata/tess --recursive --human-readable --summarize
Total Objects: 8908645
   Total Size: 362.4 TiB

All the 6 datasets combined (so the complete bucket size) is 736.5 TiB
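
The combined figure can be checked by summing the per-prefix totals reported above. A small sketch, with the sizes copied by hand from the command output (the dictionary and its name are illustrative only):

```python
# Per-prefix sizes in TiB, copied from the `aws s3 ls --summarize` output above.
prefix_sizes_tib = {
    "galex": 32.8,
    "hst": 320.0,
    "k2": 4.1,
    "kepler": 17.2,
    "panstarr": 0.0,  # the listing returned 1 object of 0 bytes
    "tess": 362.4,
}

# Round to one decimal place to hide floating-point noise in the sum.
total_tib = round(sum(prefix_sizes_tib.values()), 1)

# Fraction of the whole bucket taken up by the galex prefix alone.
galex_share = prefix_sizes_tib["galex"] / total_tib

print(f"total: {total_tib} TiB, galex share: {galex_share:.1%}")
```

The sum agrees with the 736.5 TiB stated above, and galex alone accounts for under 5% of the bucket.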

It's totally fine if you want to store all of s3://stpubdata, but then your LDN is not for "Galaxy Evolution Explorer" alone but for the following open datasets:

https://registry.opendata.aws/galex/ (Galaxy Evolution Explorer)
https://registry.opendata.aws/hst/ (Hubble Space Telescope)
https://registry.opendata.aws/kepler/ (Kepler Mission Data)
https://registry.opendata.aws/k2/ (K2 Mission Data)
https://registry.opendata.aws/tess/ (Transiting Exoplanet Survey Satellite)

And my guess is, that despite the naming of LDNs #487 , #1482 and #1555 that all the datasets above are included in those LDNs, but I cannot speak for someone else. (I see @cryptowhizzard already replied)

So I think you've got 2 options moving this LDN forwards;

  1. Rename it so it includes all the 5 datasets mentioned above, that way the request is correct in my opinion. Do note that there is a request open on the governance github in regards to storing aggregate datasets (https://github.com/filecoin-project/notary-governance/issues/832).

  2. Stick with s3://stpubdata/galex, which is 32.8 TiB and not 736.5 TiB (or 591.1 TiB as you claim), and hence a 5 PiB request would be way too much.

It does disappoint me @RoelandHC that you quote @cryptowhizzard in the expected size and that you did not do this work yourself.

RoelandHC commented 1 year ago

It does disappoint me @RoelandHC that you quote @cryptowhizzard in the expected size and that you did not do this work yourself.

@lvschouwen Why do you understand it that way? I'm very sad about this. I think this discourages me from participating in filecoin to store useful data, but I will insist on participating and store more data to prove myself. I've checked my dataset and its size was 591.1TiB at that time. The reason I quote Wijnand Schouten's words is that I think he is a famous and active notary in our community, and he can be a strong support for me under this situation---we both store the same, real dataset. I'm just doing the same thing as him, as everyone who is active in the Filecoin community.

cryptowhizzard commented 1 year ago

It does disappoint me @RoelandHC that you quote @cryptowhizzard in the expected size and that you did not do this work yourself.

@lvschouwen Why do you understand it that way? I'm very sad about this. I think this discourages me from participating in filecoin to store useful data, but I will insist on participating and store more data to prove myself. I've checked my dataset and its size was 591.1TiB at that time. The reason I quote Wijnand Schouten's words is that I think he is a famous and active notary in our community, and he can be a strong support for me under this situation---we both store the same, real dataset. I'm just doing the same thing as him, as everyone who is active in the Filecoin community.

I know I'm handsome, smart, good looking and irresistible to most people, but can you at least be a little sporty. If you want you can just edit your request and take the whole /stpubdata directory with the latest data etc? Then it is also of value for the network.

Leave me out of it, you can do it on your own.

RoelandHC commented 1 year ago

Hi @RoelandHC

Happy to help. Things change in life as well as on AWS :) The size of this set has changed on AWS.

I verified, @lvschouwen is right here.

@cryptowhizzard Thanks for your help! My handsome, smart, good looking, irresistible and respectable notary. ;)

lvschouwen commented 1 year ago

This clearly is a troll application. I've put a lot of effort in helping a troll once again.

RoelandHC commented 1 year ago

@lvschouwen Hey, I think there's some misunderstanding here. I really appreciate everything you did. I just want to clarify your misunderstanding. This is a peaceful community, isn't it?

1475Notary commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecvterwvxbsnk7yp3hldbpapxkg3xws5dtdtt4pch3op6lcqumhxo

Address

f1px4gq7csxy5dmsphrsfaimdj5b3ekfplr6aqthi

Datacap Allocated

250.00TiB

Signer Address

f1ofq4mngy7ggcp755pfquq2gphjjnlydolf6awtq

Id

1a3ca18c-3d96-40f6-8b3a-8ac2a4f11787

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecvterwvxbsnk7yp3hldbpapxkg3xws5dtdtt4pch3op6lcqumhxo

lvschouwen commented 1 year ago

@RoelandHC (cc @dkkapur) As mentioned on Slack by @dkkapur, my questions regarding "What are you going to store under this LDN" are fair and reasonable, as the information provided in your initial request does not match your answers and the data available.

I was hoping that by now the questions I've asked would have been answered, but instead the first step toward granting DC to this application without proper due diligence has been made by @1475Notary.

I hoped to be proven wrong but now the plot thickens.

1475Notary commented 1 year ago

We have had a full conversation with the applicant just now. This is a trustworthy applicant, so we are willing to support him.

lvschouwen commented 1 year ago

We have had a full conversation with the applicant just now. This is a trustworthy applicant, so we are willing to support him.

Good to hear that you had a full conversation with @RoelandHC. I assume the question of "What are you going to store" has been asked, and I am wondering what their reply was.

RoelandHC commented 1 year ago

@lvschouwen, which question did I not respond to? I am an open book and I welcome all peaceful community members with any concerns. Not sure why you keep calling me TROLL; it doesn't look to be in line with the code of conduct. @raghavrmadya I feel offended by this and I would like to get your opinion.

RoelandHC commented 1 year ago

I had no doubts, I just want to make sure that your application matches the data provided. I do not represent an organisation, I am a FIL+ Community Member asking annoying questions 😸

Btw, does the name Wijnand ring a bell to you? Are you sure that you don't work for Dcent or Speedium? Just FYI no Pope should lie :)

lvschouwen commented 1 year ago

@lvschouwen, which question did I not respond to? I am an open book and I welcome all peaceful community members with any concerns. Not sure why you keep calling me TROLL; it doesn't look to be in line with the code of conduct. @raghavrmadya I feel offended by this and I would like to get your opinion.

I indeed called you a troll, once, and @dkkapur said clearly on Slack that I should be more respectful. So "keep calling me TROLL", is incorrect and a false accusation.

He also confirmed that my questions regarding this LDN are fair and valid. Why haven't you replied to my questions?

lvschouwen commented 1 year ago

I had no doubts, I just want to make sure that your application matches the data provided. I do not represent an organisation, I am a FIL+ Community Member asking annoying questions 😸

Btw, does the name Wijnand ring a bell to you? Are you sure that you don't work for Dcent or Speedium? Just FYI no Pope should lie :)

Yes, it does ring a bell. Anyone involved in the community knows him. I am pretty sure I do not work for Dcent or Speedium.

RoelandHC commented 1 year ago

This clearly is a troll application. I've put a lot of effort in helping a troll once again.

Twice, my friend and I still haven't heard anything close to sorry from you so far.

He also confirmed that my questions regarding this LDN are fair and valid. Why haven't you replied to my questions?

Wrong. I replied, check, period.

It does disappoint me @RoelandHC that you quote @cryptowhizzard in the expected size and that you did not do this work yourself.

@lvschouwen Why do you understand it that way? I'm very sad about this. I think this discourages me from participating in filecoin to store useful data, but I will insist on participating and store more data to prove myself. I've checked my dataset and its size was 591.1TiB at that time. The reason I quote Wijnand Schouten's words is that I think he is a famous and active notary in our community, and he can be a strong support for me under this situation---we both store the same, real dataset. I'm just doing the same thing as him, as everyone who is active in the Filecoin community.

RoelandHC commented 1 year ago

I had no doubts, I just want to make sure that your application matches the data provided. I do not represent an organisation, I am a FIL+ Community Member asking annoying questions 😸

Btw, does the name Wijnand ring a bell to you? Are you sure that you don't work for Dcent or Speedium? Just FYI no Pope should lie :)

Yes, it does ring a bell. Anyone involved in the community knows him. I am pretty sure I do not work for Dcent or Speedium.

If that's the case, why are you everywhere but just not in Dcent's applications? I'm quite amazed by how many similarities you two have, in pairs etc.

lvschouwen commented 1 year ago

He also confirmed that my questions regarding this LDN are fair and valid. Why haven't you replied to my questions?

Wrong. I replied, check, period.

It does disappoint me @RoelandHC that you quote @cryptowhizzard in the expected size and that you did not do this work yourself.

@lvschouwen Why do you understand it that way? I'm very sad about this. I think this discourages me from participating in filecoin to store useful data, but I will insist on participating and store more data to prove myself. I've checked my dataset and its size was 591.1TiB at that time. The reason I quote Wijnand Schouten's words is that I think he is a famous and active notary in our community, and he can be a strong support for me under this situation---we both store the same, real dataset. I'm just doing the same thing as him, as everyone who is active in the Filecoin community.

So from the 2 options I gave you, it's going to be option 1. This LDN is for the following open datasets:

https://registry.opendata.aws/galex/ (Galaxy Evolution Explorer)
https://registry.opendata.aws/hst/ (Hubble Space Telescope)
https://registry.opendata.aws/kepler/ (Kepler Mission Data)
https://registry.opendata.aws/k2/ (K2 Mission Data)
https://registry.opendata.aws/tess/ (Transiting Exoplanet Survey Satellite)

Total size: approximately 736.5 TiB.

Would you be so kind as to edit your original application and change the following: change "[DataCap Application] Galaxy Evolution Explorer Public Dataset" to "[DataCap Application] Space Telescope Science Institute Datasets".

Then adjust the wording throughout accordingly; under the sample of data, you can use the list above.

I think that makes this LDN more complete and correct and then I hope a second notary will sign the request soon so you can start onboarding.

Destore2023 commented 1 year ago

  1. Good to see that both of you can take a step back; we are willing to give you some time to make corrections.

  2. It is very necessary to describe the content of the dataset accurately. We support the relevant applicants updating the content of the application ASAP. @RoelandHC @cryptowhizzard

  3. No insulting words (such as "TROLL") are allowed at any time. This is the minimum requirement for all community members who comment. Otherwise, they will be kicked out of the community immediately. @lvschouwen @raghavrmadya https://github.com/filecoin-project/notary-governance/discussions/830

Destore2023 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacec64g7tm2i6ey2kn47iowtr3buwwxh2yvczg3a4skuvuwiprkyu74

Address

f1px4gq7csxy5dmsphrsfaimdj5b3ekfplr6aqthi

Datacap Allocated

250.00TiB

Signer Address

f1yh6q3nmsg7i2sys7f7dexcuajgoweudcqj2chfi

Id

1a3ca18c-3d96-40f6-8b3a-8ac2a4f11787

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacec64g7tm2i6ey2kn47iowtr3buwwxh2yvczg3a4skuvuwiprkyu74

cryptowhizzard commented 1 year ago

I don’t know why you are tagging me and why you think this is ok and why you think you can get away with this, but what I do know is that the damage you guys are doing (refusing to do KYC and allowing scams to continue with it for benefit) is on your shoulders and on that of 1475notary.

lvschouwen commented 1 year ago

I can't wait for the CID report of this application.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

RoelandHC commented 11 months ago

keep it open

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

clriesco commented 10 months ago

Removed stale label and reopened issue :)

github-actions[bot] commented 10 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

RoelandHC commented 10 months ago

keep it open

github-actions[bot] commented 9 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

RoelandHC commented 9 months ago

keep it open