Marshall-btc / Marshall-Fil-Data-Pathway


[DataCap Application] <Large Sky Area Multi-Object Fiber Spectroscopic Telescope-5> #5

Open jacktree7 opened 5 months ago

jacktree7 commented 5 months ago

Data Owner Name

National Astronomical Observatories, Chinese Academy of Sciences

Data Owner Country/Region

China

Data Owner Industry

Environment

Website

http://dr5.lamost.org/

Social Media Handle

http://dr5.lamost.org/

Social Media Type

Other

What is your role related to the dataset

Data Preparer

Total amount of DataCap being requested

15PiB

Expected size of single dataset (one copy)

3000TiB

Number of replicas to store

6

Weekly allocation of DataCap requested

1000TiB
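
The size figures in this form can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative check and not part of the application; it assumes binary units (1 PiB = 1024 TiB), and the function name is hypothetical.

```python
# Figures copied from the application form; names are illustrative.
SINGLE_COPY_TIB = 3000   # expected size of a single dataset copy
REPLICAS = 6             # number of replicas to store
REQUESTED_PIB = 15       # total DataCap requested
WEEKLY_TIB = 1000        # weekly allocation requested

TIB_PER_PIB = 1024       # assumption: binary units


def total_needed_pib(single_copy_tib: float, replicas: int) -> float:
    """DataCap needed to store every replica in full, in PiB."""
    return single_copy_tib * replicas / TIB_PER_PIB


needed_pib = total_needed_pib(SINGLE_COPY_TIB, REPLICAS)
copies_covered = REQUESTED_PIB * TIB_PER_PIB / SINGLE_COPY_TIB
weeks_to_consume = REQUESTED_PIB * TIB_PER_PIB / WEEKLY_TIB

print(f"needed for {REPLICAS} full replicas: {needed_pib:.2f} PiB")
print(f"full copies covered by {REQUESTED_PIB} PiB: {copies_covered:.2f}")
print(f"weeks to consume at {WEEKLY_TIB} TiB/week: {weeks_to_consume:.2f}")
```

Under these assumptions, six full replicas of a 3000TiB copy would need about 17.58 PiB, so the 15PiB request covers roughly five full copies.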

On-chain address for first allocation

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Experienced personal data provider. The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is a Chinese national scientific research facility operated by the National Astronomical Observatories, Chinese Academy of Sciences. It is a special reflecting Schmidt telescope with 4000 fibers in a 20 deg² field of view. By July 2017, LAMOST had completed its pilot survey, which was launched in October 2011 and ended in June 2012, and the first five years of its regular survey, which began in September 2012. This six-year survey obtained a total of 9,026,365 spectra of stars, galaxies, quasars, and other unknown objects [1−7]. The fifth data release (DR5) has now been published online (http://dr5.lamost.org/), and the released data products include:

Spectra. - In general, DR5 contains 9,026,365 flux- and wavelength-calibrated, sky-subtracted spectra, including 8,183,160 stars, 152,863 galaxies, 52,453 quasars, and 637,889 unknown objects. These spectra cover the wavelength range 3690-9100 angstrom with a resolution of 1800 at 5500 angstrom [2−3].
Spectroscopic Parameters Catalogs. - This data release also publishes six spectroscopic parameter catalogs: the LAMOST general catalog, the A, F, G and K type star catalog, the A type star catalog, the M dwarf catalog, the observed plate information catalog, and the input catalog. The LAMOST general catalog includes 36 columns of basic spectroscopic information, for example right ascension, declination, signal-to-noise ratio, magnitude, classification, and redshift, which are also provided by the A, F, G and K type star catalog, the A type star catalog, and the M dwarf catalog. These three catalogs also provide other parameters, for example atmospheric parameters (effective temperature, surface gravity, and metallicity), spectral line indices, line widths, the metallicity-sensitive parameter, and the magnetic activity flag. In addition, the observed plate information catalog contains nine basic fields of plate information for all published plates, and the input catalog includes 24 of the basic fields mentioned above plus three new fields that are not included in the other catalogs.

http://dr5.lamost.org/v3/doc/data-production-description

Guoshoujing Telescope (the Large Sky Area Multi-Object Fiber Spectroscopic Telescope LAMOST) is a National Major Scientific Project built by the Chinese Academy of Sciences. Funding for the project has been provided by the National Development and Reform Commission. LAMOST is operated and managed by the National Astronomical Observatories, Chinese Academy of Sciences.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is a Chinese national scientific research facility operated by the National Astronomical Observatories, Chinese Academy of Sciences. It is a special reflecting Schmidt telescope with 4000 fibers in a 20 deg² field of view. By July 2017, LAMOST had completed its pilot survey, which was launched in October 2011 and ended in June 2012, and the first five years of its regular survey, which began in September 2012. This six-year survey obtained a total of 9,026,365 spectra of stars, galaxies, quasars, and other unknown objects [1−7]. The fifth data release (DR5) has now been published online (http://dr5.lamost.org/), and the released data products include:

Spectra. - In general, DR5 contains 9,026,365 flux- and wavelength-calibrated, sky-subtracted spectra, including 8,183,160 stars, 152,863 galaxies, 52,453 quasars, and 637,889 unknown objects. These spectra cover the wavelength range 3690-9100 angstrom with a resolution of 1800 at 5500 angstrom [2−3].
Spectroscopic Parameters Catalogs. - This data release also publishes six spectroscopic parameter catalogs: the LAMOST general catalog, the A, F, G and K type star catalog, the A type star catalog, the M dwarf catalog, the observed plate information catalog, and the input catalog. The LAMOST general catalog includes 36 columns of basic spectroscopic information, for example right ascension, declination, signal-to-noise ratio, magnitude, classification, and redshift, which are also provided by the A, F, G and K type star catalog, the A type star catalog, and the M dwarf catalog. These three catalogs also provide other parameters, for example atmospheric parameters (effective temperature, surface gravity, and metallicity), spectral line indices, line widths, the metallicity-sensitive parameter, and the magnetic activity flag. In addition, the observed plate information catalog contains nine basic fields of plate information for all published plates, and the input catalog includes 24 of the basic fields mentioned above plus three new fields that are not included in the other catalogs.

http://dr5.lamost.org/v3/doc/data-production-description

Guoshoujing Telescope (the Large Sky Area Multi-Object Fiber Spectroscopic Telescope LAMOST) is a National Major Scientific Project built by the Chinese Academy of Sciences. Funding for the project has been provided by the National Development and Reform Commission. LAMOST is operated and managed by the National Astronomical Observatories, Chinese Academy of Sciences.

Where was the data currently stored in this dataset sourced from

Other

If you answered "Other" in the previous question, enter the details here

http://dr5.lamost.org/

If you are a data preparer. What is your location (Country/Region)

None

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

No response

If you are not preparing the data, who will prepare the data? (Provide name and business)

No response

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

No response

Please share a sample of the data

http://dr5.lamost.org/v3/sas/catalog/
http://dr5.lamost.org/v3/sas/fits/20111024/F5902/
http://dr5.lamost.org/v3/sas/fits/20111024/F5907/
http://dr5.lamost.org/v3/sas/fits/20111024/F5909/
http://dr5.lamost.org/v3/sas/png/20111024/F5902/
http://dr5.lamost.org/v3/sas/png/20111024/F5907/
http://dr5.lamost.org/v3/sas/png/20111024/F5909/
http://dr5.lamost.org/v3/sas/sky/20111024/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent), Antarctica

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives

How did you find your storage providers

Slack, Partners

If you answered "Others" in the previous question, what is the tool or platform you used

No response

Please list the provider IDs and location of the storage providers you will be working with.

f03000070
f03091739
f03028310 
f03028315 
f03028318 
f03028326 
f03028321

How do you plan to make deals to your storage providers

Boost client, Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

datacap-bot[bot] commented 5 months ago

Application is waiting for allocator review

Marshall-btc commented 5 months ago

Hey @jacktree7, thanks for applying to Marshall-Fil-Data-Pathway. In addition to the questions in the issue, I would still like you to confirm the following.

  1. Have you prepared enough tokens for sector pledge?
  2. Are you a data preparer? What is your previous experience as a data preparer? List previous applications and client IDs.
  3. How will the data be prepared? Please include tooling used and technical details.
  4. If you are not preparing the data, who will prepare the data? (Name and Business)
  5. Has this dataset been stored on Filecoin before? If so, why are you choosing to store it again?
  6. Best practice for storing large datasets is, ideally, to store them in 3 or more regions, with 4 or more storage provider operators or owners. You should list the Miner ID, Business Entity, and Location of the SPs you will cooperate with.

Marshall-btc commented 5 months ago

Other than that, can you send an email from the official email address to marshallbtc1990@gmail.com to verify authenticity? The content should contain the name of your application.

jacktree7 commented 5 months ago

Hello, thank you for your patient review. The SPs I have communicated with are currently ready to pledge 90% of their coins. I am a data preparer and participated in the early Slingshot program, mainly using the official tools such as Boost and Lotus. This is the first time the SPs I work with have stored this data. The SPs I am in contact with are as follows: f03000070 Honglian Tec China; f03091739 lanxin USA; f03028310 Fly HK; f03028315 Datastone Guangdong; f03028318 Seven Singapore; f03028326 HKblockchain US; f03028321 Datastone HK.


jacktree7 commented 5 months ago

@Marshall-btc Verification email sent.

jacktree7 commented 5 months ago

Passport information is also attached to the email. Please check, thanks!

Marshall-btc commented 5 months ago

I have received a KYC email from this client.

Before the first round begins, I still want to remind you @jacktree7 :

You and your SPs must strictly adhere to the rules of Fil+ and Marshall-Fil-Data-Pathway. Please confirm each SP's geographic location and name before the start of each allocation round. If anything changes, please comment promptly in this application. Please keep the SPs retrievable.

datacap-bot[bot] commented 5 months ago

Datacap Request Trigger

Total DataCap requested

15PiB

Expected weekly DataCap usage rate

1000TiB

DataCap Amount - First Tranche

500TiB

Client address

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

datacap-bot[bot] commented 5 months ago

DataCap Allocation requested

Multisig Notary address

Client address

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

DataCap allocation requested

500TiB

Id

e28dfcf7-0aac-45e4-850a-8e0d38c00843

datacap-bot[bot] commented 5 months ago

Application is ready to sign

datacap-bot[bot] commented 5 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedapupgwh6rbqqcrgumsfguh7mybodrtsx4yldpw2sayqszwfscmc

Address

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

Datacap Allocated

500TiB

Signer Address

f16b6a4s63opnunpag3llqg77pfl4pyixwb657iza

Id

e28dfcf7-0aac-45e4-850a-8e0d38c00843

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedapupgwh6rbqqcrgumsfguh7mybodrtsx4yldpw2sayqszwfscmc

datacap-bot[bot] commented 5 months ago

Application is Granted

jacktree7 commented 4 months ago

Syncing the latest SP information in a timely manner: f01315096, f03091739, f03144037, f03055005, f01106668, f03055018, f03055029, f0870558.

datacap-bot[bot] commented 3 months ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.

Marshall-btc commented 3 months ago

checker:manualTrigger

datacap-bot[bot] commented 3 months ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

⚠️ 50.00% of Storage Providers have retrieval success rate equal to zero.

⚠️ The average retrieval success rate is 12.01%

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.

Marshall-btc commented 3 months ago

Hey @jacktree7. I see that you have completed this round of sealing, and I have a few questions for you.

  1. Why were the two SPs f03156617 and f03148356 not disclosed in the previous application?
  2. f03055005, f01315096, and f01106668 have a low retrieval success rate. How do you explain this?
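
The first question amounts to a set difference between the SPs disclosed in this thread and the SPs actually seen taking deals. A minimal sketch, assuming the disclosed set is the union of the original provider list and the later update comment, and that the observed set is just the five IDs named in this comment (the real observed set comes from the checker report and is not reproduced here):

```python
# SP IDs disclosed in the application form and in the later update comment.
disclosed = {
    "f03000070", "f03091739", "f03028310", "f03028315",
    "f03028318", "f03028326", "f03028321",
    "f01315096", "f03144037", "f03055005", "f01106668",
    "f03055018", "f03055029", "f0870558",
}

# Assumption: illustrative subset of providers observed on chain,
# limited to the IDs mentioned in the two questions above.
observed = {"f03055005", "f01315096", "f01106668", "f03156617", "f03148356"}

# Providers that received deals but were never disclosed in the thread.
undisclosed = sorted(observed - disclosed)
print(undisclosed)  # ['f03148356', 'f03156617']
```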
jacktree7 commented 3 months ago

  1. Sorry, these two SPs were not updated in time. Three weeks ago, the original SPs were delayed in the data storage process by a temporary problem with their Filecoin pledge tokens, so I asked a partner to recommend these two new SPs as a temporary measure. I will keep the list updated from now on. f03148356: Linyun, Japan; f03156617: Tongkun, Sydney.
  2. I have contacted f03055005, f01106668, and f01315096. They told me that they are studying how Spark works and that they recently ran into the mainnet upgrade on August 6. Retrieval will improve later. Please stay tuned.

Marshall-btc commented 3 months ago

Ok, I'll support this round, but still want you to contact SPs to improve retrieval success.

datacap-bot[bot] commented 3 months ago

Application is in Refill

datacap-bot[bot] commented 3 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebqnhcsmmrjexguznm2zb7gazg2wmk7otmrf735a35th7l27sb7r4

Address

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

Datacap Allocated

1PiB

Signer Address

f16b6a4s63opnunpag3llqg77pfl4pyixwb657iza

Id

5f2aaab7-3a21-4d9c-a70d-3c0024e49404

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebqnhcsmmrjexguznm2zb7gazg2wmk7otmrf735a35th7l27sb7r4

datacap-bot[bot] commented 3 months ago

Application is Granted

datacap-bot[bot] commented 3 months ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.

Marshall-btc commented 2 months ago

checker:manualTrigger

datacap-bot[bot] commented 2 months ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

⚠️ 57.14% of Storage Providers have retrieval success rate equal to zero.

⚠️ 100.00% of Storage Providers have retrieval success rate less than 75%.

⚠️ The average retrieval success rate is 13.53%

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.
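
The three retrieval warnings in this report are simple aggregates over per-SP success rates. The sketch below shows one way such a summary could be computed; it is not the checker's actual implementation, the function name is hypothetical, and the provider IDs and rates are made-up sample values.

```python
from statistics import mean


def summarize_retrieval(rates: dict[str, float], threshold: float = 0.75) -> dict[str, float]:
    """Summarize per-SP retrieval success rates (fractions in [0, 1]).

    Returns the percentage of SPs with a zero rate, the percentage below
    `threshold`, and the average rate, all as percentages rounded to 2 places.
    """
    if not rates:
        raise ValueError("at least one storage provider is required")
    n = len(rates)
    zero = sum(1 for r in rates.values() if r == 0)
    below = sum(1 for r in rates.values() if r < threshold)
    return {
        "pct_zero": round(100 * zero / n, 2),
        "pct_below_threshold": round(100 * below / n, 2),
        "avg_pct": round(100 * mean(rates.values()), 2),
    }


# Hypothetical sample data, not taken from the report.
sample = {"sp-a": 0.0, "sp-b": 0.0, "sp-c": 0.5, "sp-d": 0.25}
summary = summarize_retrieval(sample)
print(summary)
```

With the sample above, half of the SPs have a zero rate, all of them fall below the 75% threshold, and the average is 18.75%.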

jacktree7 commented 2 months ago

Timely update of SP information: f01518369, f01889668, f03151449, f03151456, f03179555, f03179570, f03188440, f03190614, f03190616.

Marshall-btc commented 1 month ago

checker:manualTrigger

datacap-bot[bot] commented 1 month ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

⚠️ 18.18% of Storage Providers have retrieval success rate equal to zero.

⚠️ 54.55% of Storage Providers have retrieval success rate less than 75%.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.

Marshall-btc commented 1 month ago

Hey @jacktree7, I see that overall Spark retrieval is good in your latest report, but please explain the Spark retrieval anomalies of these 5 nodes.

jacktree7 commented 1 month ago

Thank you for the allocator operator's patient response. I have communicated with these clients, and I’m providing the information below:

  1. The two nodes in the United States experienced network outages due to data center failures starting from October 1. During this period, the clients worked hard to restore the network. The screenshot shows the first penalty record 24 hours after the network went down at 3 AM on October 1. It wasn't until today that the client restored a standard 1G local carrier bandwidth, so normal network operations have just resumed, and the Spark retrieval rate will gradually improve in the future.

  2. For the other three nodes, the clients reported that they are using the DDO mode. Currently, Spark and DDO are not fully compatible. Below is a screenshot of the DDO mode. We will try to minimize the use of the DDO ordering mode in the future.

Please review this information and continue to support 2PiB. Thank you very much!

Marshall-btc commented 1 month ago

Looks good, but for allocator compliance purposes I will approve this round of DataCap cautiously and limit the new tranche. The rules allow 2PiB for this round, but I am approving only 1PiB for now. I will resume 2PiB approvals once subsequent improvements are seen.

datacap-bot[bot] commented 1 month ago

Application is in Refill

datacap-bot[bot] commented 1 month ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaced7dppspujgfyknpwoowhafces6z7loh7vsizzgjvpbqpwo4irkh6

Address

f1ojdxbixro3pxlozhcra5tzxe5p6df6rzwrmlaja

Datacap Allocated

1PiB

Signer Address

f16b6a4s63opnunpag3llqg77pfl4pyixwb657iza

Id

1aa815d3-3ea2-417e-9d58-651d14028945

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced7dppspujgfyknpwoowhafces6z7loh7vsizzgjvpbqpwo4irkh6

datacap-bot[bot] commented 1 month ago

Application is Granted

datacap-bot[bot] commented 1 week ago

Client used 75% of the allocated DataCap. Consider allocating next tranche.

filecoin-watchdog commented 3 days ago

checker:manualTrigger

datacap-bot[bot] commented 3 days ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

⚠️ 11.76% of Storage Providers have retrieval success rate equal to zero.

⚠️ 70.59% of Storage Providers have retrieval success rate less than 75%.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.

Marshall-btc commented 2 days ago

@jacktree7 Please explain “The Client declared 6 replicas, while 8 of them already exist.” https://github.com/filecoin-project/Allocator-Governance/issues/216#issuecomment-2486404278

jacktree7 commented 2 days ago

Hi @Marshall-btc, 6 copies is the minimum, so some data will have more copies for safer redundancy. Also, since some SPs declared a sector period of only 180 days, I had to find more SPs for storage in order to meet my data storage deadline.
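
The "declared 6, found 8" observation can be reproduced by counting distinct providers per piece and comparing against the declared replica count. The sketch below is a hedged illustration only: it assumes a replica is counted as one distinct SP holding a given piece, which may differ from how the checker counts, and the piece and provider names are hypothetical.

```python
from collections import defaultdict

DECLARED_REPLICAS = 6  # "Number of replicas to store" from the application


def replicas_per_piece(deals: list[tuple[str, str]]) -> dict[str, int]:
    """Count distinct providers holding each piece.

    `deals` is a list of (piece_cid, provider_id) pairs; repeated deals for
    the same piece with the same provider count as a single replica.
    """
    providers: dict[str, set[str]] = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers[piece_cid].add(provider_id)
    return {piece: len(sps) for piece, sps in providers.items()}


def over_replicated(deals: list[tuple[str, str]], declared: int = DECLARED_REPLICAS) -> dict[str, int]:
    """Return only the pieces stored with more providers than declared."""
    return {p: n for p, n in replicas_per_piece(deals).items() if n > declared}


# Hypothetical deals: piece-A sits with 8 providers, piece-B with 6.
deals = [("piece-A", f"sp-{i}") for i in range(8)]
deals += [("piece-B", f"sp-{i}") for i in range(6)]
deals.append(("piece-A", "sp-0"))  # duplicate deal, same provider

print(over_replicated(deals))  # {'piece-A': 8}
```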