brainhackorg / global2020

Brainhack Global 2020
https://brainhack.org/global2020/
MIT License

Baby Steps Towards De-Sci (Decentralized Science) with DataLad + git-annex + IPFS (InterPlanetary File System) #95

Open hebbianloop opened 3 years ago

hebbianloop commented 3 years ago

Project info

Title:

Baby Steps Towards a Decentralized Science (De-Sci) with DataLad + git-annex + the InterPlanetary File System (IPFS) and a Dash of Ethereum

Project lead: Shady El Damaty - @hebbianloop

Project collaborators: @nkhalsa

Registered Brainhack Global 2020 Event: Brainhack DC

Project Description: A substantial barrier to open science practice is the sharing and accessibility of datasets. Often datasets are stored in a centralized location such as a lab's server or in costly enterprise cloud systems.

Centralized data storage has several problems: 1) outages may make data temporarily unavailable, 2) data can disappear forever if the central location suffers a failure, and 3) a single point of control enables censorship and can limit accessibility.

The DataLad version control software takes steps to address this by using git-annex in the back end, which supports multiple types of "special remotes" for downloading and publishing datasets. However, to our knowledge, there has been no attempt to bridge a decentralized file storage protocol into DataLad's suite of supported remotes.

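The InterPlanetary File System (IPFS) is a peer-to-peer protocol for sharing content-addressed data, similar in spirit to BitTorrent. Data published on IPFS can be persisted on distributed storage networks such as Filecoin and retrieved over HTTP through public gateways such as the one Cloudflare has offered.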
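As a rough illustration of the content addressing that underlies IPFS, the sketch below derives a block's address from a hash of its bytes, so any peer holding the same bytes can serve them and the receiver can verify them. It is a toy model only: `content_address` and `ToyContentStore` are hypothetical names, and real IPFS identifiers (CIDs) add chunking and multihash encoding on top of a raw digest.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest used here as a toy content address.

    Assumption: this is NOT a real IPFS CID, only the underlying idea.
    """
    return hashlib.sha256(data).hexdigest()


class ToyContentStore:
    """In-memory stand-in for one peer's block store (hypothetical)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address depends only on the bytes, not on who stores them.
        address = content_address(data)
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # A receiver can re-hash the bytes to detect tampering.
        if content_address(data) != address:
            raise ValueError("content does not match its address")
        return data
```

Because the address is derived from the content, two independent peers that add the same file end up advertising the same address, which is what removes the dependence on a single central host.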

Data storage on these distributed networks also enables tokenization of individual datasets on the Ethereum blockchain and is an important first step for establishing data marketplaces for the peer-to-peer exchange of data and models.

The current project aims to explore the requirements and feasibility of extending DataLad to support IPFS by adding wrapper code that defines an IPFS special remote. Once implemented, this would provide one of the tools needed to automate the tokenization of datasets on the Ethereum blockchain.

What we are doing: Adding IPFS special remote capability to DataLad

For whom? For Decentralized Science!

Why? Centralized data storage is not sustainable in the era of web 3.0

Resources: Git Annex IPFS, Datalad FAQ, IPFS, Infura (IPFS API), A tokenized brain

Data to use: OpenNeuro

Link to project repository/sources:

Goals for Brainhack Global 2020:

Good first issues:

  1. How does DataLad work with special remotes under the hood? Can you set up your own FTP/SSH special remote?
  2. Demonstrate a git-annex special remote with IPFS.
  3. Add a special remote wrapper/plugin to DataLad core.
  4. Test on multiple machines/environments.
  5. Open a pull request on the DataLad repository.
  6. Create a tokenized dataset on the Ethereum blockchain.
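For the first few issues, it may help to see the shape of a git-annex external special remote, which is a program that exchanges line-based requests and replies with git-annex over stdin/stdout. The sketch below handles only the core requests (`INITREMOTE`, `PREPARE`, `TRANSFER`, `CHECKPRESENT`, `REMOVE`) and answers everything else with `UNSUPPORTED-REQUEST`. The `DirectoryBackend` class is a hypothetical stand-in that copies files into a local directory; an IPFS-backed version would replace its methods with calls to an IPFS node, which is not shown here. It also assumes keys are safe to use as file names.

```python
import shutil
import sys
from pathlib import Path


class DirectoryBackend:
    """Hypothetical stand-in for IPFS: keeps each key as a file in a directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def store(self, key, src):
        shutil.copyfile(src, self.root / key)

    def retrieve(self, key, dest):
        shutil.copyfile(self.root / key, dest)

    def present(self, key):
        return (self.root / key).exists()

    def remove(self, key):
        # Removing a key that is already gone counts as success.
        (self.root / key).unlink(missing_ok=True)


def handle_request(line, backend):
    """Map one git-annex request line to one reply line."""
    # Keys never contain spaces, but file paths may, so split at most 3 times.
    parts = line.rstrip("\n").split(" ", 3)
    cmd = parts[0]
    if cmd == "INITREMOTE":
        return "INITREMOTE-SUCCESS"
    if cmd == "PREPARE":
        return "PREPARE-SUCCESS"
    if cmd == "TRANSFER" and len(parts) == 4:
        direction, key, path = parts[1], parts[2], parts[3]
        try:
            if direction == "STORE":
                backend.store(key, path)
            elif direction == "RETRIEVE":
                backend.retrieve(key, path)
            else:
                return "UNSUPPORTED-REQUEST"
        except OSError as err:
            return f"TRANSFER-FAILURE {direction} {key} {err}"
        return f"TRANSFER-SUCCESS {direction} {key}"
    if cmd == "CHECKPRESENT" and len(parts) >= 2:
        key = parts[1]
        if backend.present(key):
            return f"CHECKPRESENT-SUCCESS {key}"
        return f"CHECKPRESENT-FAILURE {key}"
    if cmd == "REMOVE" and len(parts) >= 2:
        key = parts[1]
        try:
            backend.remove(key)
        except OSError as err:
            return f"REMOVE-FAILURE {key} {err}"
        return f"REMOVE-SUCCESS {key}"
    # Optional requests (GETCOST, EXTENSIONS, ...) are declined in this sketch.
    return "UNSUPPORTED-REQUEST"


def serve(backend, stdin=sys.stdin, stdout=sys.stdout):
    """Announce the protocol version, then answer requests until EOF."""
    print("VERSION 1", file=stdout, flush=True)
    for line in stdin:
        print(handle_request(line, backend), file=stdout, flush=True)
```

To try it with git-annex, a script calling `serve(DirectoryBackend(...))` would need to be installed on `PATH` under a name of the form `git-annex-remote-NAME`; the store location and the remote name are choices left open here.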

Skills: You don't need much background beyond familiarity with the terminal and working on the command line in a unix-y environment. We will work together and research how to add the special remote. Familiarity with git is highly recommended.

Tools/Software/Methods to Use: git, git-annex, datalad, python

Communication channels: https://mattermost.brainhack.org/brainhack/channels/bhg-washingtondc

hebbianloop commented 3 years ago

Hi @Brainhack-Global/project-monitors: my project is ready!

complexbrains commented 3 years ago

Dear @seldamat, thank you very much for submitting your project to Brainhack Global 2020 🎉 The project seems ready; it is only missing an image of its own, which we need to create your project-specific card like the other examples here. As soon as we have the image (you can put it anywhere in the issue) we will publish it 🚀

Looking forward to hearing from you 🤗

hebbianloop commented 3 years ago

thank you! done!

complexbrains commented 3 years ago

@seldamat your project is published at https://brainhack.org/global2020/project/project_95/ and tweeted at https://twitter.com/brainhackorg/status/1337770184187240453

Hope you enjoy your participation in Brainhack 🤗