datalad / datalad-paper-joss

Repository for JOSS paper on DataLad
MIT License

Add section/paragraph on design principles #65

Closed mih closed 3 years ago

mih commented 3 years ago

In technical talks I tend to include the following list of design principles for DataLad:

  1. There are only two recognized entities: datasets and files
  2. A dataset is a Git repository with an optional annex
  3. Minimization of custom procedures and data structures: users must not lose data or data access if DataLad were to vanish
  4. Complete decentralization, no required central server or service.
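Principles 1 and 2 can be illustrated with a small sketch. This is not DataLad's own detection logic. It is a simplified heuristic that assumes a regular (non-submodule) checkout, where Git keeps its state in a `.git` directory and git-annex keeps its state under `.git/annex`. The function names `is_dataset`, `has_annex`, and `classify` are hypothetical and exist only for this illustration.

```python
import tempfile
from pathlib import Path


def is_dataset(path: Path) -> bool:
    """Heuristic: a directory with a `.git` entry is treated as a dataset."""
    # `.git` is a directory in a regular clone and a file in a submodule checkout.
    return path.is_dir() and (path / ".git").exists()


def has_annex(path: Path) -> bool:
    """Heuristic (assumption): an initialized annex lives under `.git/annex`."""
    return (path / ".git" / "annex").is_dir()


def classify(path: Path) -> str:
    """Map a path onto the two entities named in principle 1."""
    if is_dataset(path):
        return "dataset (with annex)" if has_annex(path) else "dataset (plain Git)"
    if path.is_file():
        return "file"
    return "other"  # e.g. a plain directory, which is not an entity of its own


# Build a throwaway layout that mimics the on-disk markers, then classify it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    plain = root / "plain"
    (plain / ".git").mkdir(parents=True)
    annexed = root / "annexed"
    (annexed / ".git" / "annex").mkdir(parents=True)
    data_file = plain / "data.csv"
    data_file.write_text("a,b\n1,2\n")
    results = {p.name: classify(p) for p in (plain, annexed, data_file)}

print(results)
```

The point of the sketch is that both checks rely only on what Git and git-annex leave on disk, which is in the spirit of principle 3: nothing DataLad-specific is needed to recognize a dataset.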

I believe that in their simplicity they can be instrumental in communicating the underlying mindset. Some aspects are already included in the text, but it still makes sense to me to simply present them in this refined form -- possibly right at the start of "Overview of DataLad".

yarikoptic commented 3 years ago

Overall - good idea! Some of those are already present, though, in "Why Git and git-annex alone are not enough", so maybe, in light of the #64 discussion, the added features of DataLad could then migrate into this section, but that might cost us the clarity of these items. With that in mind (since it is mentioned in "Why"), and thinking about "relationship" in the perspective title, I think it might be worth expanding here somewhere with "Relationships between datasets are established via the Git submodules mechanism"?
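To make the submodule point concrete: Git records submodules in a `.gitmodules` file at the root of the superproject, so dataset relationships can be read without any DataLad-specific tooling. Below is a minimal sketch that parses such a file by hand. The helper `parse_gitmodules`, the example paths, and the URLs are made up for illustration, and the parser only handles the simple `key = value` layout that Git normally writes.

```python
import re
from typing import Dict

# Matches section headers of the form: [submodule "some/name"]
SECTION = re.compile(r'^\[submodule "(?P<name>.+)"\]$')


def parse_gitmodules(text: str) -> Dict[str, Dict[str, str]]:
    """Return {submodule name: {key: value}} for a .gitmodules-style text."""
    submodules: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        match = SECTION.match(line)
        if match:
            current = submodules.setdefault(match.group("name"), {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return submodules


# Hypothetical superdataset with two registered subdatasets.
EXAMPLE = '''[submodule "inputs/raw"]
\tpath = inputs/raw
\turl = https://example.com/raw-data.git
[submodule "code/pipeline"]
\tpath = code/pipeline
\turl = ../pipeline
'''

subdatasets = parse_gitmodules(EXAMPLE)
for name, info in subdatasets.items():
    print(f"{name}: path={info['path']} url={info['url']}")
```

Because the relationship lives in a plain Git file, it stays readable with stock Git (for example via `git submodule status`) even if DataLad is not installed.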