filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] BigDataBeast - Storing /dev/urandom #1621

Closed ghost closed 1 year ago

ghost commented 1 year ago

Data Owner Name

Nobody is the owner of this data

Data Owner Country/Region

Holy See (Vatican City State)

Data Owner Industry

Other

Website

https://en.wikipedia.org/wiki//dev/random

Social Media

n/a

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

100TiB

On-chain address for first allocation

f1aawstijtepfw6u25anfk6rh5a6sdkgt5zuxdgwy

Custom multisig

Identifier

No response

Share a brief history of your project and organization

This project stores parts of /dev/urandom on Filecoin.
We believe that Filecoin and FIL+ are the ideal way to store this data.

/dev/urandom is completely random, and storing 5 PiB of this random data on the chain is clearly invaluable.
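One measurable consequence of the claim above is that /dev/urandom output has no exploitable structure: a general-purpose compressor cannot shrink it. A quick sketch (the sample size and file names are arbitrary choices, not part of the application):

```shell
# Take a 1 MiB sample of /dev/urandom and try to compress it.
# Random data is incompressible, so the gzip output ends up
# slightly LARGER than the input (container overhead).
head -c 1048576 /dev/urandom > sample.bin
gzip -k sample.bin            # -k keeps the original file
wc -c sample.bin sample.bin.gz
```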

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

We will prepare 33000 files of 17 GiB each (the ideal size to fill a 32 GiB sector) from /dev/urandom.
We asked ChatGPT how to create the files, and this was the result:

#!/bin/bash

for i in {1..33000}
do
  # Generate a filename of 8 random alphanumeric characters
  filename=$(tr -dc 'a-zA-Z0-9' < /dev/urandom | head -c 8)

  # Fill the file with 17 GiB of random data (17 blocks of 1 GiB)
  dd if=/dev/urandom of="$filename" bs=1G count=17

  echo "Created file $filename"
done
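For context, the figures above can be checked with shell arithmetic. Note that relating raw data to the 5 PiB request in terms of sealed copies is our assumption here; the application itself does not state a replica count:

```shell
# Sanity-check the numbers in the application. ASSUMPTION: DataCap is
# consumed per sealed 32 GiB sector copy, not per raw byte.
files=33000
file_gib=17       # raw payload per file
sector_gib=32     # sector each file is packed into

raw_gib=$(( files * file_gib ))         # total raw data in GiB
sealed_gib=$(( files * sector_gib ))    # sector space for one full copy

echo "raw data: $raw_gib GiB (~$(( raw_gib / 1024 )) TiB)"
echo "one copy: $sealed_gib GiB (~$(( sealed_gib / 1024 )) TiB)"
# 5 PiB of DataCap expressed in GiB, divided by one sealed copy:
echo "full copies coverable by 5 PiB: $(( 5 * 1024 * 1024 / sealed_gib ))"
```

So one full copy occupies roughly 1 PiB of sector space, and the requested 5 PiB would cover about four to five sealed copies under this assumption.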

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

Well, that is going to be hard, because the sample will be different every time.

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Sporadic

For how long do you plan to keep this dataset stored on Filecoin

Permanently

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, Europe

How will you be distributing your data to storage providers

HTTP or FTP server

How do you plan to choose storage providers

Slack, Big data exchange

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Boost client, Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Are you storing these files just to get Datacap? It's hard to tell if the data is valid and authorized.

Sunnyiscoming commented 1 year ago

Any update here?