datalad / datalad-xnat

Track XNAT projects with DataLad

Support downloads of ZIPed scans #80

Open mih opened 2 years ago

mih commented 2 years ago

.../files?format=zip

@mslw please elaborate

mslw commented 2 years ago

Currently, our calls to the XNAT REST API produce lists of files (e.g. for a given "experiment", i.e. session), which are then obtained through datalad download_url. It might be interesting to consider requesting zipped files instead. Whether to do that is not a straightforward decision, so here's a summary:

The XNAT API (see How To Download Files via the XNAT REST API in its documentation) provides a way to request a zip file for a given call, for example:

/data/projects/TEST/subjects/1/experiments/MR1/scans/1/files?format=zip

The ?format=zip parameter also works on the API call that platform.py uses to get files, that is:

data/experiments/{experiment}/scans/{scan}/files?format=zip
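For illustration, here is a minimal Python sketch of how such a request could be built and streamed to disk. The function names, the example host, and the lack of authentication are all assumptions made for the sketch; this is not how platform.py currently does it.

```python
import shutil
import urllib.request
from urllib.parse import quote


def scan_zip_url(base_url: str, experiment: str, scan: str) -> str:
    """Build the URL asking XNAT for all files of one scan as a single zip.

    Hypothetical helper: only the path template and `?format=zip` come
    from the discussion above.
    """
    path = (
        f"data/experiments/{quote(experiment, safe='')}"
        f"/scans/{quote(scan, safe='')}/files"
    )
    return f"{base_url.rstrip('/')}/{path}?format=zip"


def download_scan_zip(base_url: str, experiment: str, scan: str, dest: str) -> str:
    """Stream the zipped scan to `dest` and return that path.

    Assumption: the instance allows unauthenticated access; a real
    deployment would most likely need credentials or a session cookie.
    """
    url = scan_zip_url(base_url, experiment, scan)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)  # copy in chunks, no full read into memory
    return dest
```

With a placeholder host, scan_zip_url("https://xnat.example.org", "XNAT_E00001", "3") gives a URL of the same shape as the template above.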

I tried curl --output scan3.zip <local xnat instance>/data/experiments/<experiment ID>/scans/3/files?format=zip (this is a phantom scan with 256 DICOMs) and got the zip as expected. The shasum is the same when I do it again after a couple of minutes. However, it took 14 seconds the first time I tried it and 5 seconds the second time, which suggests there might be caching going on on the server side (but who knows).
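The checksum and timing comparison could be repeated in a more systematic way with something like the sketch below. It uses only the standard library; the helper names are made up for illustration, and SHA-256 stands in for whatever shasum variant was actually used.

```python
import hashlib
import time


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Wrapping the download in timed() for several repeated requests, and comparing sha256_of() across the resulting files, would show whether the zips are byte-identical and whether the speed-up on the second request is consistent.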

The desired granularity for getting DICOMs is likely to be at the level of a single scan (sequence) or a single experiment (session). On the one hand, the zip solution might provide an advantage for transfer speed when there are a lot of files to be obtained. On the other hand, the speed advantage won't be there if scans use enhanced DICOMs (if we are getting one scan per request, the compression will be an overhead). Either way, downloading zips would require jumping through additional hoops later, by using datalad add-archive-content to unpack the zip files without losing their provenance.
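To get a rough feel for this trade-off on a given scan, one could inspect a downloaded zip before deciding. The sketch below is only an assumption about what might be useful to look at (file count and how much the archive actually shrinks the data); the helper name is hypothetical and it does not touch datalad at all.

```python
import zipfile


def summarize_zip(path_or_file) -> dict:
    """Report file count and overall compression ratio of a zip archive.

    A ratio close to 1.0 means compression saved little; a low file count
    means bundling into one request saved few round trips.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        infos = [info for info in zf.infolist() if not info.is_dir()]
    raw = sum(info.file_size for info in infos)
    packed = sum(info.compress_size for info in infos)
    return {
        "n_files": len(infos),
        "raw_bytes": raw,
        "packed_bytes": packed,
        "ratio": packed / raw if raw else 1.0,
    }
```

A scan made of a few hundred classic DICOMs and a scan stored as a single enhanced DICOM would be expected to give quite different n_files values here, which is the case distinction made above.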