filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <LaughStorage> - < Sloan Digital Sky Survey> - New #2232

Closed 26dos closed 12 months ago

26dos commented 1 year ago

Data Owner Name

LaughStorage

What is your role related to the dataset

Data Preparer

Data Owner Country/Region

China

Data Owner Industry

Not-for-Profit

Website

https://www.sdss.org/

Social Media

NA

Total amount of DataCap being requested

12PiB

Expected size of single dataset (one copy)

1.4PiB

Number of replicas to store

8

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1f27fmxepqkflvsh7nvmmsh7sgt6oqv25edfz22a

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Identifier

No response

Share a brief history of your project and organization

I joined the Filecoin network in 2021 and started out with CC (committed capacity) sectors. In 2022 we converted part of that CC capacity to DC, and we now plan to keep developing our data storage business. In 2023 I established a technical service company, LaughStorage, to invest in depth in the distributed storage sector.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The Sloan Digital Sky Survey (SDSS) is one of the most ambitious and influential surveys in the history of astronomy. 
The SDSS project began in 2000 and has run through five phases, with 18 data releases (DRs) so far. The 19th DR is expected in 2024.
At present, about 750TB of data is accessible, spread across 5.5 million directories and 400 million files.
DR1-DR7 (SDSS-I, 2000-2005; SDSS-II, 2005-2008): obtained deep, multi-color images covering more than a quarter of the sky and created 3-dimensional maps containing more than 930,000 galaxies and more than 120,000 quasars.
DR8: contains all images from the SDSS telescope - the largest color image of the sky ever made. It also includes measurements for nearly 500 million stars and galaxies, and spectra of nearly two million.
DR9: contains the first release of BOSS spectroscopy to the public, as well as several significant updates to the cumulative SDSS archive.
DR10: contains the first release of APOGEE infrared Galactic spectroscopy, as well as cumulative updates to the BOSS optical extragalactic spectroscopy archive.
The SDSS began regular survey operations in 2000, after a decade of design and construction. It has progressed through several phases: SDSS-I (2000-2005), SDSS-II (2005-2008), SDSS-III (2008-2014), and SDSS-IV (2014-2020). Each of these phases has involved multiple surveys with interlocking science goals. The three surveys that comprise SDSS-IV are eBOSS (including SPIDERS and TDSS), APOGEE-2, and MaNGA (including MaStar).
The SDSS-V Pioneering Panoptic Spectroscopy program started observing in October 2020, and consists of three surveys, known in SDSS-V terminology as mapper programs.

Milky Way Mapper is a multi-object spectroscopic survey to obtain near-infrared and/or optical spectra of more than 4 million stars throughout the Milky Way and Local Group
Local Volume Mapper is an optical, integral-field spectroscopic survey that will target the Milky Way, Small and Large Magellanic Clouds, and other Local Volume galaxies
Black Hole Mapper is a multi-object spectroscopic survey that emphasizes optical spectra (often also with multiple epochs of spectroscopy) for more than 300,000 quasars

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer. What is your location (Country/Region)

China

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

After processing the data I have downloaded, I will transfer it online.

If you are not preparing the data, who will prepare the data? (Provide name and business)

No response

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

AFAIK, no.

Please share a sample of the data

http://classic.sdss.org/dr7/
http://sdss3.org/
https://www.sdss4.org/dr17/data_access/volume/
https://dr17.sdss.org/sas/dr17/apogee/spectro/aspcap/
https://dr17.sdss.org/sas/dr17/apogee/spectro/speclib/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America

How will you be distributing your data to storage providers

HTTP or FTP server, IPFS, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Big Data Exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

f0861589 CN,SD C    
f02223876 HK, FTM Ltd.
f02212669 CN,Cryptomage
f02115125 KR, FiveByte
More are being connected.
The actual selection will be based on their collateral.

How do you plan to make deals to your storage providers

Boost client, Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

26dos commented 1 year ago

Regarding the previous questions:
1. On the node disclosure issue, we have made contact again and confirmed that the disclosed nodes will match 100%.
2. On the retrieval issue, we will ensure that all nodes support retrieval.

Sunnyiscoming commented 1 year ago
herrehesse commented 1 year ago

@26dos How do you plan on downloading the data? I know people inside SDSS: at their bucket speed, 1.2PiB would take you over 5 years to download, and they do not accept physical transfers. Can you explain?

26dos commented 1 year ago

@Sunnyiscoming All have been submitted. Please check.

26dos commented 1 year ago

@herrehesse From the data we have prepared so far, the download speed is normal. (two images attached)

herrehesse commented 1 year ago

This proves exactly nothing. Can you prove to me that you have the SDSS data, and show what your download speed is? As I stated before, I have a connection with the team responsible for the dataset.

26dos commented 1 year ago

@herrehesse The download speed depends on the downloader's network bandwidth. If you think the project is limiting the speed, please provide evidence.

herrehesse commented 1 year ago

@26dos - do not get me started, my friend. Tolerance for abuse is zero. This application is dead in the water.

Sunnyiscoming commented 1 year ago

Have you completed KYC? The email has not been received; please check it. The contact names and email addresses of the SPs have not all been filled in on the form. Please resubmit the form with complete information. The miner IDs, entity names, and locations should be posted here.

26dos commented 1 year ago

@Sunnyiscoming Hi, I've sent the email again and included our business licence! Okay, I'll resubmit the form.

herrehesse commented 1 year ago

The miner IDs, entity names, and locations should be posted here.

26dos commented 1 year ago

@Sunnyiscoming The form has been resubmitted. Thanks.

f02815620 HK BigFive
f01159754 Hunan VITACapital
f02327534 USA ipollo
f02320312 USA R1
f02211572 Chengdu MicroAnt

herrehesse commented 1 year ago

I will be contacting the businesses and checking if all is correct.

26dos commented 1 year ago

Is there an update here?

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

12PiB

Expected weekly DataCap usage rate

1PiB

Client address

f1f27fmxepqkflvsh7nvmmsh7sgt6oqv25edfz22a

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

26dos commented 1 year ago

@Sunnyiscoming

Is there an update here?

ghost commented 12 months ago

f02815620 hk BigFive
f01159754 Hunan VITACapital
f02327534 USA ipollo
f02320312 USA R1
f02211572 Chendu MicroAnt

@26dos please resubmit with SP entity contact info included. Need to confirm separate entities here.

ghost commented 12 months ago

closing until updated

26dos commented 11 months ago

@Sunnyiscoming @Filplus-govteam

Hi, I've updated the information in the list below. In addition, if there are any changes to the SPs, I will disclose them in advance during the rest of the project:

f02815620 | hk | BigFive | czh9994@gmail.com | Chen
f01159754 | Hunan | VITACapital | 15071084991@163.com | Chris
f02327534 | USA | ipollo | vito@ipollo.com | vito
f02320312 | USA | R1 | 122764916@qq.com | BoQian
f02211572 | Chendu | MicroAnt | guokai0916@gmail.com | GuoKai

26dos commented 11 months ago

@Sunnyiscoming @galen-mcandrew @Filplus-govteam Excuse me, are you still working? Is there someone here to follow up?

ghost commented 11 months ago

Please submit SP proof of location to filplus.govteam@gmail.com. This can include a business licence or proof of the SP's address in the country listed. @26dos