yarikoptic closed 4 years ago
For any file id, this API provides the URL:
https://girder.dandiarchive.org/api/v1/file/{id}/download
https://girder.dandiarchive.org/api/v1/file/5dab084bf377535c7d96c2c4/download?contentDisposition=attachment
however, this does not provide the direct url to the s3 bucket
@mgrauer - does this mean we are paying for egress from the dandi archive if we call this API?
In that case, could we add the URL as metadata to the item that contains the file?
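For reference, the URL pattern quoted in this comment can be built with a small helper. This is only a sketch: the function name `girder_download_url` and its parameters are made up for illustration, and only the base URL, the path template, and the `contentDisposition` query parameter come from the URLs above.

```python
from urllib.parse import urlencode

# Base URL taken from the example URLs in this thread.
GIRDER_API = "https://girder.dandiarchive.org/api/v1"


def girder_download_url(file_id, content_disposition=None):
    """Build the Girder download URL for a file id (hypothetical helper)."""
    url = f"{GIRDER_API}/file/{file_id}/download"
    if content_disposition:
        # Mirrors the ?contentDisposition=attachment form seen in the thread.
        url += "?" + urlencode({"contentDisposition": content_disposition})
    return url


print(girder_download_url("5dab084bf377535c7d96c2c4", "attachment"))
```

Note that this only produces the Girder endpoint URL, not the underlying S3 location, which is the gap this comment is asking about.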
That is also a decision to make: either we act as this middle service (which is great for telemetry, possible resilience, etc., but also makes us the culprit), or for public files we provide the endpoint URL (directly to S3), which would remove us as the middleman. I wondered whether the endpoint might already redirect, but it is not public (it requires authentication):
$> wget -S 'https://girder.dandiarchive.org/api/v1/file/5dab084bf377535c7d96c2c4/download?contentDisposition=attachment'
--2019-11-02 10:13:53-- https://girder.dandiarchive.org/api/v1/file/5dab084bf377535c7d96c2c4/download?contentDisposition=attachment
Resolving girder.dandiarchive.org (girder.dandiarchive.org)... 3.19.164.171
Connecting to girder.dandiarchive.org (girder.dandiarchive.org)|3.19.164.171|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 401 Unauthorized
Server: nginx/1.14.0 (Ubuntu)
Date: Sat, 02 Nov 2019 14:13:54 GMT
Content-Type: application/json
Content-Length: 100
Connection: keep-alive
Allow: DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT
Girder-Request-Uid: 99cf33a4-5574-43d3-b457-6d23d1da0995
Username/Password Authentication Failed.
Username/Password Authentication Failed.
yarik: all uploaded data are private at the moment. I don't know if that was a conscious decision or just the default. You could turn yours public to test whether it is a redirect.
But yes, it would be good for it to be a redirect and to provide the URL (if available) via an API.
Hm, I will check on the dandi client side, but I thought I had stated that it should be public... Checked and I see no "public" anywhere in https://github.com/dandi/dandi-cli/blob/master/dandi/cli/command.py, so I guess it was left to the default, which is private. I will fix that in https://github.com/dandi/dandi-cli/issues/31
This is resolved. The Girder and publish API provide redirects to S3 from File and Asset download endpoints.
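One way to check whether a download endpoint redirects rather than streaming the bytes itself is to send a request without following redirects and look at the status and `Location` header. The sketch below uses only the Python standard library; the names `describe_response` and `probe` are illustrative, and it assumes the endpoint accepts `HEAD` (the `Allow` header in the wget output above lists it).

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow, so 3xx surfaces as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def describe_response(status, location=None):
    """Classify a response by status code (pure function, no network)."""
    if status in (301, 302, 303, 307, 308) and location:
        return f"redirect -> {location}"
    if status == 401:
        return "authentication required"
    if 200 <= status < 300:
        return "served directly"
    return f"other ({status})"


def probe(url, timeout=10):
    """Send a HEAD request without following redirects and classify it."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as resp:
            return describe_response(resp.status, resp.headers.get("Location"))
    except urllib.error.HTTPError as err:
        # 3xx (not followed) and 4xx/5xx responses both arrive here.
        return describe_response(err.code, err.headers.get("Location"))
```

Calling `probe(...)` on a file download URL would report a redirect target for a public file, or "authentication required" for a private one as in the 401 seen earlier in this thread.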
Our main asset store is on S3; a local asset store could be on a file system. To pass a file to an external tool (a notebook or datalad), we need to discover its location. For private files (not the current target, but worth keeping in mind), I guess the dandi archive should mint a short-lived URL, which in the case of S3 could be done for us by S3 itself. For public files, it could be a direct URL into the bucket (ideally with versionId), or a URI to the asset store plus the path to the file in the asset store plus, ideally, the versionId from a versioned bucket (but that is S3-specific). So we probably just need a simple API to "get_uri_for_file".
Could that be easily done, or is it maybe already available within girder-client, @mgrauer?
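To make the proposal concrete, here is a rough sketch of what a "get_uri_for_file" helper could return for public files. Everything in it is an assumption for illustration: the `assetstore` dict layout (`type`, `bucket`, `root`) is invented and is not Girder's actual asset store schema, and the S3 URL uses the generic virtual-hosted-style form with the `versionId` query parameter. Private files would instead need a short-lived presigned URL, which is not shown here.

```python
from pathlib import PurePosixPath
from urllib.parse import quote, urlencode


def get_uri_for_file(assetstore, path, version_id=None):
    """Return a URI for a public file in a (hypothetical) asset store record."""
    kind = assetstore["type"]
    relative = path.lstrip("/")
    if kind == "s3":
        # Virtual-hosted-style public URL; slashes in the key are preserved.
        url = f"https://{assetstore['bucket']}.s3.amazonaws.com/{quote(relative)}"
        if version_id:
            # Pins a specific object version in a versioned bucket.
            url += "?" + urlencode({"versionId": version_id})
        return url
    if kind == "filesystem":
        # Requires an absolute root; as_uri() raises ValueError otherwise.
        return PurePosixPath(assetstore["root"], relative).as_uri()
    raise ValueError(f"unsupported asset store type: {kind!r}")


print(get_uri_for_file({"type": "s3", "bucket": "example-bucket"}, "ab/cd.nwb", "v1"))
print(get_uri_for_file({"type": "filesystem", "root": "/data/store"}, "ab/cd.nwb"))
```

Returning a plain URI string keeps the caller (a notebook or datalad) independent of where the asset store actually lives.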