fractal-analytics-platform / fractal-vizarr-viewer

Prototype to explore serving/viewing zarr data
BSD 3-Clause "New" or "Revised" License
2 stars 0 forks source link

Are we supporting `#` in paths? #33

Closed tcompa closed 4 weeks ago

tcompa commented 1 month ago

If the folder name on disk is /some/authorized/base/folder/my_#file.zarr, it looks like the file.zarr part is lost somewhere along the process (be it in the URL parsing, or somewhere else). TBD

tcompa commented 1 month ago

The fractal-web-viewer logs include entries like

INFO - File not found: /some/authorized/base/folder/my_/.zattrs
jluethi commented 1 month ago

Ah, good catch. It would be great if we could support this, because the Yokogawa microscope often defaults to having this in plate names. If it turns out to be hard to support, I suggest we include a normalization step for this in how we name Zarr plates in the Cellvoyager converter

zonia3000 commented 1 month ago

In URL the # identifies the URI fragment, a component that is accessible only client side and is never sent to the server.

To support this we need to URL encode twice the fragment when we generate the URL in the View button in fractal-web:

# --> %2523

%23 is the URL encode for # and %25 is the URL encode for %. In this way the URL is built using %23.

I supported the URL encode parsing in the fractal-vizarr-viewer backend to support this and similar needs: b05c596144627088c001c2dd90bc6853594a428e

However I did some tests and I noticed a strange behavior that seems to be caused by some logic in vizarr. The image is loaded, but when one performs some actions (like zooming or refreshing the page) the %23 is converted to a # again, losing the correct reference to the image.

Let me know if you want that I investigate a bit deeper on vizarr side, to potentially open a PR or issue upstream, or if you prefer to avoid using this symbol in paths.

jluethi commented 1 month ago

Hey @zonia3000 If you have a chance to look into this a little further, that would be great. My initial assumption is that this can't be needed in that many places. It's needed for the original URL and vizarr builds the urls for the different chunks for the resolutions when you zoom. So maybe there is a not too complex fix to cover the second part of resolution pyramids as well. And would be good to test whether it works on plates as well.

Independent of the answer, we want to reduce our reliance on this, as it can also become an issue with other viewers: https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/843 So we'll at least drop it for our current main converter. But other converters could still have such issues of course (or the special character could be somewhere else in the file path)

zonia3000 commented 1 month ago

I had a look at the vizarr logic regarding the management of query parameters and I noticed that it removes the URL encoding from the parameters.

I did some changes to preserve the URL encoding in our hash-in-url branch:

https://github.com/fractal-analytics-platform/vizarr/commit/aed714898dfe48ac3bc6c770a346ca8af8d1f126

Together with https://github.com/fractal-analytics-platform/fractal-web/pull/572 this should allow opening the paths containing #.

Notice that this makes the URL much more longer and difficult to read, so I'm not sure if the vizarr maintainers will be happy with this change.

For example, the following URL:

http://localhost:3000/vizarr/?source=http://localhost:3000/vizarr/data/home/xonya/code/fractal-vizarr-viewer/testing_data/test%2523hash.zarr/

Is now transformed in this way when opened by modified vizarr:

http://localhost:3000/vizarr/?source=http%3A%2F%2Flocalhost%3A3000%2Fvizarr%2Fdata%2Fhome%2Fxonya%2Fcode%2Ffractal-vizarr-viewer%2Ftesting_data%2Ftest%2523hash.zarr%2F&viewState={%22zoom%22:1,%22target%22:[162.5,137.5]}
tcompa commented 1 month ago

Thanks Sonia!

To recap, we have two choices here:

  1. We keep vizarr as is, i.e. not supporting # in URLs, and we mitigate this issue in fractal-tasks-core (by sanitizing plate names). On top of that, we can add checks or warnings for this in fractal-web when producing vizarr URLs which would not work.
  2. We submit a PR to vizarr based on https://github.com/hms-dbmi/vizarr/compare/main...fractal-analytics-platform:vizarr:hash-in-url. This would mean that vizarr fully supports # in file paths, with the downside of making them less readable (see last comment).

Feedback on this is appreciated - cc @jluethi

jluethi commented 1 month ago

Ouf, that's significantly less readable! Thanks a lot for checking this @zonia3000 !

I can see why we wouldn't want to do this. I'd say this is strong enough reason to handle this on the task side & add warnings on fractal-web when zarr-urls with special characters are provided. Thus, while I would have liked the feature, I say we go with option 1 of not supporting # in zarr_urls for the vizarr viewer

tcompa commented 4 weeks ago

The current status is that we are not pursuing this, due to how less readable URLs would become. If we ever need to come back to it, we'll have good understanding of where to start from.