sina-mansour / UKB-connectomics

This repository will host scripts used to map structural and functional brain connectivity matrices for the UK Biobank dataset.
https://www.biorxiv.org/content/10.1101/2023.03.10.532036v1

UKB storage to upload processed bulk data #11

sina-mansour closed this issue 1 year ago

sina-mansour commented 2 years ago

Following up on the suggestion by @Lestropie (this commit):

RS: As per discussion, need to find out how much data can be uploaded per subject to UKB (and indeed what volume of data could potentially be hosted elsewhere). Any temporaries that are not to be later hosted anywhere are better off being stored on a RAM file system. My typical approach here is to load all input data into a scratch directory that I can force to be in /tmp/, store all intermediate files and final outputs there, and only upon script completion do I then write the desired derivatives to the location requested by the user. I then only retain the scratch directory if the user explicitly requests that it be retained. Your structure here checks for the pre-existence of calculated files, which is useful when you are testing perturbations to the script, but for final deployment this ability is not as high a priority.
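The scratch-directory workflow described above could be sketched roughly as follows. This is a minimal illustration and not the script from the commit: `scratch_dir`, `run_pipeline`, and the `process` callback are hypothetical names, and whether `/tmp` is actually RAM-backed depends on the system (on many Linux machines `/dev/shm` is the tmpfs mount).

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def scratch_dir(parent="/tmp", keep=False):
    """Yield a fresh scratch directory; delete it on exit unless keep=True."""
    path = Path(tempfile.mkdtemp(prefix="scratch_", dir=parent))
    try:
        yield path
    finally:
        if not keep:
            shutil.rmtree(path, ignore_errors=True)


def run_pipeline(inputs, output_dir, process, keep_scratch=False, parent="/tmp"):
    """Stage inputs in scratch, run `process` there, then export the derivatives.

    `process(staged_inputs, scratch)` is a hypothetical callback that writes
    its intermediates into `scratch` and returns the paths worth keeping.
    """
    output_dir = Path(output_dir)
    with scratch_dir(parent=parent, keep=keep_scratch) as scratch:
        # Load all input data into the scratch directory.
        staged = [Path(shutil.copy2(src, scratch)) for src in inputs]
        # Intermediates and final outputs are all produced inside scratch.
        derivatives = process(staged, scratch)
        # Only after processing has completed are the desired derivatives
        # written to the location requested by the user.
        output_dir.mkdir(parents=True, exist_ok=True)
        return [Path(shutil.copy2(d, output_dir)) for d in derivatives]
```

If `process` raises, nothing is written to `output_dir` and the scratch directory is still removed unless `keep_scratch=True` was passed, which is the flag to use when debugging.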

sina-mansour commented 2 years ago

We're currently storing all intermediate files on the scratch file system. The processed data listed below are (or will be) stored so that they can be shared publicly.

We'll need to ensure that we can upload three sets of bulk data for every individual back to the UKB storage:

  1. atlases
  2. functional time-series
  3. structural connectivity measures

@caioseguin would you be able to ask UK Biobank whether they will accept this, and whether there are any limits we need to adhere to?
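To have a concrete per-subject figure to quote when asking about limits, the output volume could be measured with something like the sketch below. It assumes a hypothetical layout with one directory per subject under a common root; `subject_sizes` is an illustrative name, not a function from this repository.

```python
from pathlib import Path


def subject_sizes(root):
    """Return {subject directory name: total size in bytes of all files below it}."""
    sizes = {}
    for subject in sorted(Path(root).iterdir()):
        if subject.is_dir():
            sizes[subject.name] = sum(
                f.stat().st_size for f in subject.rglob("*") if f.is_file()
            )
    return sizes
```

Taking the maximum and the mean of the returned values over a handful of processed subjects should give a reasonable first estimate of the volume to be uploaded per individual.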

caioseguin commented 2 years ago

I will ask them and get back to you.

sina-mansour commented 1 year ago

This issue has been dormant for a while. The latest update is that we were able to return the results to the UK Biobank over an SFTP connection (using Mediaflux).

UKB has informed us that the resource should be made available in a new release (planned for November 2023).