bluesky / tiled

API to structured data
https://blueskyproject.io/tiled
BSD 3-Clause "New" or "Revised" License
59 stars 50 forks source link

Gracefully handle situations where image serializers are applied to incompatible arrays #113

Closed dylanmcreynolds closed 2 years ago

dylanmcreynolds commented 3 years ago

I have the following configuration.

Config.yml:

# config.yml
trees:

  - path: "/"
    tree: files
    args:
      directory: "/home/dylan/data/beamlines/sequence_test"
      subdirectory_handler: tiled.readers.tiff_sequence:subdirectory_handler

uvicorn:
  host: 0.0.0.0
  port: 8000

And the contents of the sequence_test are: . ├── sequence │   └── 20211001_092045_toothpick_00055.tiff └── test.txt

When I navigate to http://0.0.0.0:8000/metadata/sequence I get good output, including the "full link" http://0.0.0.0:8000/array/full/sequence

However, following that full link without slice parameters, the errors (returns 500) crashes with the following error:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
...trimmed...
  File "/home/dylan/miniconda3/envs/tiled/lib/python3.9/site-packages/fastapi/routing.py", line 201, in app
    raw_response = await run_endpoint_function(
  File "/home/dylan/miniconda3/envs/tiled/lib/python3.9/site-packages/fastapi/routing.py", line 150, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/dylan/miniconda3/envs/tiled/lib/python3.9/site-packages/starlette/concurrency.py", line 34, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/home/dylan/miniconda3/envs/tiled/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "./tiled/server/router.py", line 325, in array_full
    array = reader.read(slice)
  File "./tiled/readers/tiff_sequence.py", line 84, in read
    image_axis, *the_rest = slice

Interestingly, I get the a similar result with the same error with hdf5 arrays. It really seems like slice is a required parameter. Should the router code check that slice has been sent and if not there, respond with a 400 rather than falling down into code a coming back with a 500 internal error?

danielballan commented 3 years ago

[@dylanmcreynolds knows this but for anyone else who reads this:] The issue here is that 3D data is being requested in a format (text/html) that does not (yet) have a 3D representation in tiled. For text/html we display CSV via numpy.savetxt, and that only works for less than 3 dimensions.

In #114 @jmaruland improved the error message. It now gives a 406 response code and the message

The shape of this data (...) is incompatible with the requested format (text/html). Slice it or choose a different format.

In addition to CSV, there are several other formats we support, such as PNG, that have limitations on the dimension. Let's leave this issue open as a reminder to address them all.

danielballan commented 3 years ago

Running the example server

tiled serve pyobject --public tiled.examples.generated:tree

we can see the new behavior

$ http ':8000/array/full/tiny_cube' Accept:text/csv
HTTP/1.1 406 Not Acceptable
content-length: 141
content-type: application/json
date: Mon, 25 Oct 2021 13:26:38 GMT
server: uvicorn
server-timing: acl;dur=0.0, read;dur=0.6, tok;dur=0.1, app;dur=6.4
set-cookie: tiled_csrf=pANTecTV3BwQARss_31TYq5QSQ1yTdgFUM_3yTMvDFE; HttpOnly; Path=/; SameSite=lax

{
    "detail": "The shape of this data (10, 10, 10) is incompatible with the requested format (text/csv). Slice it or choose a different format."
}

$ http ':8000/array/full/tiny_cube?slice=0' Accept:text/csv
HTTP/1.1 200 OK
content-length: 1930
content-type: text/csv; charset=utf-8
date: Mon, 25 Oct 2021 13:26:44 GMT
etag: c2dccbf741e1f49d0f4cac8a8297401d
server: uvicorn
server-timing: acl;dur=0.0, read;dur=0.9, tok;dur=0.1, pack;dur=0.3, app;dur=7.1
set-cookie: tiled_csrf=0NN2rgZMOtjgYTUwdJ-VLPHCqfgauii_d_j_InJkBeY; HttpOnly; Path=/; SameSite=lax

0.7622942807205353,0.6701698832340867,0.3608414206219056,0.6310531914527799,0.2950678524239797,0.479207558673324,0.26552106379350804,0.5730501421610666,0.6425715682399353,0.7411114894769981
0.4889356125748505,0.7135697742333985,0.6693267067445278,0.6387737206920371,0.12128741615524075,0.507330777959637,0.15299229032081985,0.24551064801926537,0.07422427791707487,0.900362836152078
<snipped>

So far, so good! But the image formats will have a similar problem to CSV. For example:

$ http ':8000/array/full/tiny_cube' Accept:image/png
HTTP/1.1 500 Internal Server Error
content-length: 21
content-type: text/plain; charset=utf-8
date: Mon, 25 Oct 2021 13:32:33 GMT
server: uvicorn

Internal Server Error

The image encoders are here, just below the CSV one.

https://github.com/bluesky/tiled/blob/302b22d31b5765bd387d7445dc925172749bd1ca/tiled/structures/array.py#L184-L236

For CSV, the rule is simple: we know that anything above 2D will fail. For images, it's more subtle. 3D or 4D can work depending on the details of the format and whether color is involved. For that reason, in this case we might want to "beg forgiveness rather than ask permission". That is:

try:
    # existing code
except (TypeError, ValueError):
    raise UnsupportedShape(...)

This leaves it up to the image exporter to figure out whether the image can be handled or not. The tiny_cube and tiny_hypercube examples will be good for developing this.

dylanmcreynolds commented 3 years ago

What a about adding something like a "ArrayShapeIncompatible" exception class as a way to make it obvious to readers that they can raise it and the router code can handle in a regular fashion?

danielballan commented 3 years ago

In #29 we added handling of two such exceptions:

https://github.com/bluesky/tiled/blob/302b22d31b5765bd387d7445dc925172749bd1ca/tiled/server/core.py#L414-L422

Could the names be improved? "Incompatible" might be better than "Unsupported".

dylanmcreynolds commented 3 years ago

Oh, cool. I wouldn't bother changing the name.

danielballan commented 2 years ago

Closed by #154