dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org

"Open in browser" on dataset_description.json results in Download request #2027

Closed yarikoptic closed 3 weeks ago

yarikoptic commented 2 months ago


On https://dandiarchive.org/dandiset/000108/draft/files?location= clicking on the .json file, which could easily be displayed by the browser, leads to a download instead.

Chromium    128.0.6613.113 (Official Build) built on Debian GNU/Linux trixie/sid (64-bit) 
Revision    9597ae93a15d4d03089b4e9997b1072228baa9ad-refs/branch-heads/6613@{#1429}
OS  Linux
JavaScript  V8 12.8.374.24
User Agent  Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36
Command Line    /usr/lib/chromium/chromium --show-component-extension-options --enable-gpu-rasterization --no-default-browser-check --disable-pings --media-router=0 --enable-remote-extensions --load-extension --flag-switches-begin --disable-quic --flag-switches-end
Executable Path /usr/lib/chromium/chromium
Profile Path    /home/yoh/.config/chromium/Default
Variations Seed Type    Null


waxlamp commented 2 months ago

I am not sure there's anything for us to do here. For me:

In other words, this seems like something the user would have to configure (through plugins and settings) in order to add browser-level handling for JSON files.

The quantum leap would be for us to create our own in-DANDI JSON viewer but that seems very out of scope. Let me know what you think.

yarikoptic commented 2 months ago

TL;DR: it is due to the encodingFormat we (dandi-cli, or a user via the API) uploaded in the metadata for the assets. In dandi-cli we now rely on https://docs.python.org/3/library/mimetypes.html to guess the MIME type and provide it within encodingFormat. In newer dandisets all is good for those .json files: e.g. going to https://dandiarchive.org/dandiset/000874/draft/files?location= I can view those json files, and in the metadata record encodingFormat says json. So the question now is what to do to fix up metadata records for other jsons already in the archive. E.g. could a simple script be written which goes through and fixes up all metadata records where path points to a .json file and encodingFormat is application/octet-stream?
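A minimal sketch of the per-record check such a script could apply. It assumes each metadata record is a dict with `path` and `encodingFormat` keys, as described above. The function name `fixed_encoding_format` and the sample records are hypothetical, and fetching or saving records through the DANDI API is deliberately left out.

```python
import mimetypes
from typing import Optional

GENERIC = "application/octet-stream"


def fixed_encoding_format(metadata: dict) -> Optional[str]:
    """Return a corrected encodingFormat, or None if the record needs no change.

    Only records whose path ends in .json and whose encodingFormat is the
    generic application/octet-stream are considered for a fix.
    """
    path = metadata.get("path", "")
    if not path.lower().endswith(".json"):
        return None
    if metadata.get("encodingFormat") != GENERIC:
        return None
    # Same stdlib guess that dandi-cli is said to rely on for new uploads.
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/json"


# Hypothetical records, for illustration only.
records = [
    {"path": "dataset_description.json", "encodingFormat": GENERIC},
    {"path": "sub-01/sub-01.nwb", "encodingFormat": GENERIC},
    {"path": "participants.json", "encodingFormat": "application/json"},
]

for record in records:
    new_format = fixed_encoding_format(record)
    if new_format is not None:
        record["encodingFormat"] = new_format
        print(f"{record['path']}: encodingFormat -> {new_format}")
```

Keeping the decision in a pure function makes it easy to dry-run over all records and review the proposed changes before anything is written back.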

or could/should we provide in-code fixup... faster to be done than said:

Some exploration:

So in Brave it prompts for download instead of just showing the file. Ideally it would just open/show that .json. I don't know why it doesn't -- that is something to investigate and possibly address. Maybe it is because the redirected-to URL does not say that the content type is json but rather

```
Location: https://dandiarchive.s3.amazonaws.com/blobs/c07/71a/c0771a4f-3483-47e7-821e-b28ac8df46a5?response-content-disposition=inline%3B%20filename%3D%22dataset_description.json%22&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAUBRWC5GAEKH3223E%2F20240920%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240920T151948Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=b2a5377b279d44b376207462754867badcb790a9e6ddd6985adb608d104e78a0 [following]
--2024-09-20 11:19:48--  https://dandiarchive.s3.amazonaws.com/blobs/c07/71a/c0771a4f-3483-47e7-821e-b28ac8df46a5?response-content-disposition=inline%3B%20filename%3D%22dataset_description.json%22&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAUBRWC5GAEKH3223E%2F20240920%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240920T151948Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=b2a5377b279d44b376207462754867badcb790a9e6ddd6985adb608d104e78a0
Resolving dandiarchive.s3.amazonaws.com (dandiarchive.s3.amazonaws.com)... 52.219.179.20, 52.219.93.204, 3.5.132.189, ...
Connecting to dandiarchive.s3.amazonaws.com (dandiarchive.s3.amazonaws.com)|52.219.179.20|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  x-amz-id-2: CC5dV7aOm+GVdN1jadRHEU902hZzJNlY/g0vBCPDbUtY87z67qZAcGT0A256KnMzNfvaBR13z1o=
  x-amz-request-id: QCXQ2C47GBHMBAZV
  Date: Fri, 20 Sep 2024 15:19:49 GMT
  Last-Modified: Tue, 01 Jun 2021 18:15:20 GMT
  ETag: "f4a034fbf965f76828fa027c29860bc0-1"
  x-amz-version-id: nOp0gS1O6evD3TfRPAr5IV2Htf.voloo
  Content-Disposition: inline; filename="dataset_description.json"
  Accept-Ranges: bytes
  Content-Type: application/octet-stream
  Server: AmazonS3
  Content-Length: 71
Length: 71 [application/octet-stream]
```

I felt we had a related issue, and the closest I found was https://github.com/dandi/dandi-archive/issues/1870. Apparently the cause is the `encodingFormat` we provide in that json metadata, if we look at it: https://api.dandiarchive.org/api/dandisets/000108/versions/draft/assets/2847011b-f9fb-4933-a6b5-1641e0c1886b/
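The content type S3 will answer with can be read directly from the redirect URL, since presigned S3 URLs carry header overrides as `response-*` query parameters. A small sketch using only the standard library. The helper name `s3_response_overrides` is hypothetical, and the URL is the one from the log above, shortened to drop the `X-Amz-*` signing parameters.

```python
from urllib.parse import parse_qs, urlsplit


def s3_response_overrides(url: str) -> dict:
    """Return the response-* header overrides carried in a presigned URL."""
    query = parse_qs(urlsplit(url).query)
    return {
        name: values[0]
        for name, values in query.items()
        if name.startswith("response-")
    }


# Redirect target from the log above, without the X-Amz-* signing parameters.
url = (
    "https://dandiarchive.s3.amazonaws.com/blobs/c07/71a/"
    "c0771a4f-3483-47e7-821e-b28ac8df46a5"
    "?response-content-disposition=inline%3B%20filename%3D%22dataset_description.json%22"
    "&response-content-type=application%2Foctet-stream"
)

overrides = s3_response_overrides(url)
print(overrides["response-content-type"])         # application/octet-stream
print(overrides["response-content-disposition"])  # inline; filename="dataset_description.json"
```

Here the disposition is already `inline`, so the remaining candidate for the download prompt is the `application/octet-stream` content type, which matches the `Content-Type` header in the response.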

waxlamp commented 1 month ago

> So the question now is what to do to fix up metadata records for other jsons already in the archive. E.g. could a simple script be written which goes through and fixes up all metadata records where path points to a .json file and encodingFormat is application/octet-stream?

I think this is the way. Let me look into it.

waxlamp commented 3 weeks ago

@yarikoptic I think this is now fixed (the example in the issue description now properly renders the JSON content in my browser).

I'll close this issue but please re-open if there are lingering issues.