Share a brief history of your project and organization
FileDrive Datasets Landing Plan is a project for onboarding more valuable public datasets onto the Filecoin network. Over several phases, we plan to bring 10 PiB of data to Filecoin and drive 100 PiB of storage power growth.
About FileDrive Datasets
FileDrive Datasets is a platform to effectively connect the huge storage market that Filecoin has built with publishers of public datasets.
The Filecoin network provides reliable, secure, and affordable decentralized storage services, and FileDrive Labs wants to deliver these benefits to end-users by building a public dataset platform.
It is challenging to attract traditional Cloud Storage and Object-based Storage users to the Filecoin network so that they can benefit from it. Developers in the Filecoin ecosystem, such as FileDrive Labs, need to face this challenge together.
As a member of the Filecoin ecosystem, FileDrive Labs has been committed to developing useful tools that make it easier for users to store their data on the Filecoin network.
To attract more users, FileDrive Datasets integrates a group of tools that provide a storage service compatible with both Cloud Storage and Object-based Storage, along with a better user experience.
Ongoing projects behind FileDrive Datasets:
- Go-Graphsplit: https://github.com/filedrive-team/go-graphsplit
- DS-Cluster: https://github.com/filedrive-team/go-ds-cluster
- Filejoy: https://github.com/filedrive-team/filejoy
Article about FileDrive Datasets on Filecoin Blog:
- Large Datasets: FileDrive: https://filecoin.io/blog/posts/large-datasets-filedrive/
About FileDrive Labs
FileDrive Labs has always defined itself as a tool developer and infrastructure builder in the Filecoin ecosystem. Since 2019, we have focused on technical solutions and development based on the IPFS protocol and the Filecoin network, and we do our best to contribute to the community.
Over 80% of our team are qualified engineers, and half of them have more than 10 years of development experience in multiple industries, including communications, the Internet, and blockchain.
Since 2020, we have participated in the Slingshot Competition, become one of the top teams, and stored over 5 PiB of useful data from public datasets on the Filecoin network.
To contribute to the Filecoin community, we developed the open-source data prep tool Graphsplit, the FIL+ project dashboard filplus.info, and the storage provider discovery platform filfind.info.
In addition, we have held a weekly online event named FileDrive Meetup since March 2022, which gives community members a platform to follow the latest trends of the Filecoin network as well as our work and research.
Please check the following links for more details.
- GitHub: https://github.com/filedrive-team
- Twitter: https://twitter.com/FileDrive1
- Eventbrite: https://www.eventbrite.hk/o/filedrive-labs-42456337463
- YouTube Channel: https://www.youtube.com/channel/UCxcZC1dtBUlQvZY7DX13W1w
- Medium: https://medium.com/@FileDrive1
Is this project associated with other projects/ecosystem stakeholders?
No
If answered yes, what are the other projects/ecosystem stakeholders
No response
Describe the data being stored onto Filecoin
Smithsonian Open Access
- The Smithsonian’s mission is the "increase and diffusion of knowledge", and it has been collecting since 1846. The Smithsonian, through its efforts to digitize its multidisciplinary collections, has created millions of digital assets and related metadata describing the collection objects. On February 25th, 2020, the Smithsonian released over 2.8 million CC0 interdisciplinary 2-D and 3-D images, related metadata, and additionally, research data from researchers across the Smithsonian. The 2.8 million "open access" collections are a subset of the Smithsonian’s 155 million objects, 2.1 million library volumes and 156,000 cubic feet of archival collections held in 19 museums, 9 research centers, libraries, archives and the National Zoo. Digitization of collections is ongoing.
- https://registry.opendata.aws/smithsonian-open-access/
- License: CC0
- Size: 621.2 TiB
Where was the data currently stored in this dataset sourced from
My Own Storage Infra
If you answered "Other" in the previous question, enter the details here
No response
How do you plan to prepare the dataset
IPFS, lotus, graphsplit
If you answered "other/custom tool" in the previous question, enter the details here
No response
Please share a sample of the data
Original Source:
https://registry.opendata.aws/smithsonian-open-access/
Confirm that this is a public dataset that can be retrieved by anyone on the Network
[X] I confirm
If you chose not to confirm, what was the reason
No response
What is the expected retrieval frequency for this data
Weekly
For how long do you plan to keep this dataset stored on Filecoin
2 to 3 years
In which geographies do you plan on making storage deals
Greater China, Asia other than Greater China, North America, Europe, Australia (continent)
How will you be distributing your data to storage providers
IPFS, Shipping hard drives, Lotus built-in data transfer
How do you plan to choose storage providers
Slack, Filmine
If you answered "Others" in the previous question, what is the tool or platform you plan to use
No response
If you already have a list of storage providers to work with, fill out their names and provider IDs below
Please check the Checker Reports of our previous LDN applications:
- https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1266
How do you plan to make deals to your storage providers
Lotus client
If you answered "Others/custom tool" in the previous question, enter the details here
No response
Can you confirm that you will follow the Fil+ guideline
Data Owner Name
FileDrive Labs
Data Owner Country/Region
China
Data Owner Industry
Life Science / Healthcare
Website
https://filedrive.io/
Social Media
Total amount of DataCap being requested
5 PiB
Weekly allocation of DataCap requested
500 TiB
On-chain address for first allocation
f1udumyw3yjzxuu5co4rateaq6czubrwbyy2t4jiq
Custom multisig
Identifier
No response
Can you confirm that you will follow the Fil+ guideline
Yes