psychoinformatics-de / studyforrest-data

DataLad superdataset of all studyforrest.org project dataset components
https://studyforrest.org

Create *public* RIA store #38

Closed: mih closed this issue 3 years ago

mih commented 3 years ago

To be deposited at https://datapub.fz-juelich.de/studyforrest/studyforrest.ria eventually.

This will make all (sub)datasets flexibly accessible, without the need to put everything on github.

The chosen name is a bit clunky, but there is other data on that server besides the top-level studyforrest directory, so it makes sense to me to have a dedicated RIA store just for studyforrest.
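
As an illustration only, here is a minimal sketch of how an individual dataset could be obtained from the public store once it is deposited; the dataset ID in the URL fragment is a placeholder, not an actual studyforrest component ID:

# clone a single dataset from the RIA store by its DataLad dataset ID
# (<dataset-id> is a placeholder)
datalad clone \
  'ria+https://datapub.fz-juelich.de/studyforrest/studyforrest.ria#<dataset-id>'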

mih commented 3 years ago

I created the target directory.

mih commented 3 years ago

I decided it would be best to sync the private and the public store with rsync, rather than going through a series of checkouts; this way, access to the internal store can be managed more flexibly.

The call below excludes, by DataLad dataset ID, a number of datasets that cannot be made public. The chmod is needed to set permissions that allow the webserver to serve the store.

rsync \
  -av \
  --delete-excluded \
  --exclude 'd5d/d3da0-a631-4c0c-a4a9-de55dfc4620f*' \
  --exclude '607/5c0fa-ab72-4bab-9888-3b597f0e63b1*' \
  --exclude 'd47/59300-5563-467d-be5f-e5b164fb3060*' \
  --exclude 'ad9/b6c66-4413-4b4f-b6da-b7f25d0d6397*' \
  --exclude '4c5/36c4a-ec61-11e6-9440-00b56d060aa7*' \
  --exclude '126/cd950-377c-4600-a921-045cf408bd9f*' \
  --exclude 'da1/5d84c-9c8b-11e9-a3fb-f0d5bf7b5561*' \
  --exclude 'c08/af312-e05b-43b3-b499-db0d2ad46bf6*' \
  --chmod=Do+rX,ug+w,Fo+r \
  . \
  <host>:/mnt/inm7/studyforrest.ria

ssh <host> \
  'find /mnt/inm7/studyforrest.ria -mindepth 2 -name ria-layout-version -execdir git update-server-info \;'
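
The update-server-info pass generates info/refs in each bare repository, which is what makes the datasets cloneable over plain HTTP. A quick spot check, assuming the synced store is what gets served at the public URL (the dataset path below is a placeholder):

# a HEAD request should return 200 once the store is synced and served
curl -sI \
  'https://datapub.fz-juelich.de/studyforrest/studyforrest.ria/<xxx>/<rest-of-dataset-id>/info/refs'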

mih commented 3 years ago

Sadly, we cannot use http://datalad.studyforrest.org yet, due to https://github.com/datalad/datalad/issues/5616

mih commented 3 years ago

Ok, so

datalad clone ria+https://datapub.fz-juelich.de/studyforrest/studyforrest.ria#~super

should give all public datasets, and datalad get should also retrieve their content.
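
As a usage sketch (the subdataset path below is a placeholder, not necessarily an existing component):

# obtain the superdataset and retrieve content from one of its components
datalad clone 'ria+https://datapub.fz-juelich.de/studyforrest/studyforrest.ria#~super' studyforrest
cd studyforrest
# install a subdataset without downloading its file content yet
datalad get -n <subdataset-path>
# retrieve the actual file content of that subdataset
datalad get <subdataset-path>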