bluesky / databroker

Unified API pulling data from multiple sources
https://blueskyproject.io/databroker
BSD 3-Clause "New" or "Revised" License

Reading configuration of devices in a run doesn't work with tiled server #820

Open Bilchreis opened 2 months ago

Bilchreis commented 2 months ago

This issue seems somewhat similar to #745

Expected Behavior

With databroker backed by intake I am able to access the configuration parameters of devices like this:

db = databroker.catalog['xyz']
run = db[-1]
run.primary.config['det'].read()

This gives an xarray of device configuration parameters. I would like the same functionality using a databroker with a tiled server like this:

from tiled.client import from_uri
db = from_uri("http://localhost:8000/api")
run = db[-1]
run.primary.config['det'].read()

Current Behavior

I get the following error when I try to access the config data:

HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8000/api/v1/metadata/df8df11a-28f6-4e43-af52-e9c5722d271a/primary/config/det'
For more information, server admin can search server logs for correlation ID None.

Possible Solution

Steps to Reproduce (for bugs)

from bluesky import RunEngine

from bluesky.plans import scan
from ophyd.sim import det,motor

from tiled.client import from_uri
## Set up env

RE = RunEngine({})
db = from_uri("http://localhost:8000",api_key="secret")

def post_document(name,doc):
    db.post_document(name, doc)

RE.subscribe(post_document)

RE(scan([det],motor,1,2,10))

db[-1].primary.config['motor'].read() # this somehow returns config data

db[-1].primary.config['det'].read() # this fails

Context

I want to use the data inside the configuration parameters to populate a NeXus file. As in issue #745, the data is still accessible under:

run.primary.data.metadata['descriptors'][0]['configuration']['det']['data']
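As a stopgap, the configuration readings can be pulled out of the descriptor metadata directly. The following is a minimal sketch: config_from_descriptors is a hypothetical helper (not part of databroker or tiled), it assumes the descriptors behave like plain dicts with the layout shown in the path above, and the sample values are made up for illustration.

```python
def config_from_descriptors(descriptors, device):
    """Collect the configuration 'data' dict for `device` from each descriptor.

    Hypothetical helper: assumes each descriptor is dict-like and follows the
    layout descriptor['configuration'][device]['data'].
    """
    found = []
    for descriptor in descriptors:
        config = descriptor.get("configuration", {}).get(device)
        if config is not None:
            found.append(config["data"])
    return found


# Stand-in for run.primary.data.metadata['descriptors']; values are made up.
sample_descriptors = [
    {"configuration": {"det": {"data": {"det_noise": "uniform"},
                               "timestamps": {"det_noise": 0.0},
                               "data_keys": {}}}},
    {"configuration": {"motor": {"data": {}, "timestamps": {}, "data_keys": {}}}},
]

det_config = config_from_descriptors(sample_descriptors, "det")
print(det_config)
```

With a real run, the first argument would be run.primary.data.metadata['descriptors'], assuming that object can be iterated like a list of dicts.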

Your Environment

databroker==2.0.0b46
tiled==0.1.0b7
Bilchreis commented 2 months ago

So we narrowed it down a little. I had a mismatch between the data type reported in dtype and the actual value returned by one of my signals. The same thing happens with det and its noise EnumSignal.

noisy_det.noise.describe()
--> {'noisy_det_noise': {'source': 'SIM:noisy_det_noise',
  'dtype': 'integer',
  'shape': [],
  'enum_strs': ('none', 'poisson', 'uniform')}}

det.noise.read()
--> {'noisy_det_noise': {'value': 'uniform', 'timestamp': 1727074877.150907}}

If there is a single mismatch, run.primary.config does not get populated (the same thing happens if there is a single mismatch in the read signals). This explains the following behaviour:

db[-1].primary.config['motor'].read() # this returns config data

db[-1].primary.config['det'].read() # this fails
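A quick way to spot this kind of mismatch before documents are posted is to compare the describe() output with the read() output. This is a minimal sketch, not anything databroker or ophyd provides: find_dtype_mismatches and the type table are hypothetical, and only plain Python values are handled (numpy arrays would need extra cases).

```python
# Hypothetical mapping from the dtype names used in data keys to Python types.
_PY_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": (list, tuple),
}


def find_dtype_mismatches(described, reading):
    """Return the keys whose read value does not fit the described dtype."""
    mismatched = []
    for key, data_key in described.items():
        value = reading[key]["value"]
        ok = isinstance(value, _PY_TYPES[data_key["dtype"]])
        # bool is a subclass of int in Python, so reject it for non-boolean dtypes.
        if isinstance(value, bool) and data_key["dtype"] != "boolean":
            ok = False
        if not ok:
            mismatched.append(key)
    return mismatched


# The describe()/read() output quoted above.
described = {"noisy_det_noise": {"source": "SIM:noisy_det_noise",
                                 "dtype": "integer",
                                 "shape": [],
                                 "enum_strs": ("none", "poisson", "uniform")}}
reading = {"noisy_det_noise": {"value": "uniform",
                               "timestamp": 1727074877.150907}}

mismatches = find_dtype_mismatches(described, reading)
print(mismatches)  # -> ['noisy_det_noise']
```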

Further investigation:

But after fixing the mismatch in my code, reading run.primary.config still failed. We narrowed it down to structured numpy arrays: if any signal contains a structured numpy array, reading from tiled fails.

Example:

If a Signal similar to this one is present, reading the config data from tiled fails.

await gas_dosing.massflow_contr1.status.describe()
--> {'gas_dosing-massflow_contr1-status': {
  'source': 'localhost:10800:gas_dosing:massflow_contr1:status',
  'dtype_str': '|V408',
  'dtype_descr': "[('f0', '<i8'), ('f1', '<U100')]",
  'dtype': 'array',
  'shape': []}
}

await gas_dosing.massflow_contr1.status.read()
--> {'gas_dosing-massflow_contr1-status': {'value': array((100, 'at target'), dtype=[('f0', '<i8'), ('f1', '<U100')]),
  'timestamp': 1727076275.2028334}}

Note: When I use the temp() database everything is stored properly, and accessible.

danielballan commented 2 months ago

Thanks @Bilchreis. It may be next week until we properly follow up, as several of us are at a conference this week.

Two immediate thoughts:

Bilchreis commented 2 weeks ago

@danielballan So I think I found the issue: numpy.dtype() expects either a single array-protocol type string or a list of tuples of names and type strings.
Something like this:

[("name", "U10"), ("age", "i4"), ("weight", "f4")]

When I put "dtype_descr":[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')] in the datakey of a signal, it shows up as "dtype_descr": [["name", "<U10"], ["age", "<i4"], ["weight", "<f4"]] in the document stored in tiled.

This leads to the following TypeError:

File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 80, in _try_descr
    dtype = StructDtype.from_numpy_dtype(numpy.dtype(descr))
TypeError: Field elements must be 2- or 3-tuples, got '['name', '<U10']'

I guess the dtype_descr list gets serialized to JSON for transport and then all tuples are turned into arrays.
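The round trip described above can be reproduced with the standard json module, and the lists can be turned back into tuples before they reach numpy.dtype(). This is a minimal sketch of one possible workaround, not the fix applied in databroker: restore_descr and to_numpy_dtype are hypothetical names, and titled fields and other less common descr forms are not handled.

```python
import json


def restore_descr(descr):
    """Convert a JSON-decoded dtype_descr (lists) back into a list of tuples."""
    fields = []
    for field in descr:
        name, fmt, *rest = field
        if isinstance(fmt, list):
            # A nested list means a nested structured dtype.
            fmt = restore_descr(fmt)
        # An optional third element is a sub-array shape; JSON turns it into a list.
        shape = tuple(tuple(r) if isinstance(r, list) else r for r in rest)
        fields.append((name, fmt, *shape))
    return fields


def to_numpy_dtype(descr):
    """Build a numpy dtype from a JSON-decoded descr (imports numpy lazily)."""
    import numpy
    return numpy.dtype(restore_descr(descr))


original = [("name", "<U10"), ("age", "<i4"), ("weight", "<f4")]
decoded = json.loads(json.dumps(original))  # tuples come back as lists
restored = restore_descr(decoded)

print(decoded)
print(restored)
```

Passing decoded straight to numpy.dtype() raises the TypeError shown above, whereas to_numpy_dtype(decoded) converts the fields to tuples first.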

danielballan commented 2 weeks ago

Thanks @Bilchreis! Would you give this branch a try and check that it fully addresses the issue? If so we can add a unit test.

Bilchreis commented 2 weeks ago

Hmm, now I get this shape mismatch when accessing the data:

  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 565, in read_block
    return self._dataset_adapter.read_block(self._field, block, slice=slice)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 765, in read_block
    raw_array = self.get_columns([variable], slices=slices)[variable]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 856, in get_columns
    to_stack = self._inner_get_columns(tuple(keys), min_seq_num, max_seq_num)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 991, in _inner_get_columns
    populate_columns((key,), min_, max_)
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 933, in populate_columns
    validated_column = list(
                       ^^^^^
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 935, in <lambda>
    lambda item: self.validate_shape(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/databroker/mongo_normalized.py", line 2227, in default_validate_shape
    raise BadShapeMetadata(
databroker.mongo_normalized.BadShapeMetadata: For data key test_dev1-test_sig shape (2, 3) does not match expected shape (2,).

I am not sure why the reported shape is suddenly (2, 3).
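One possible explanation, not confirmed anywhere in this thread: if each structured record is stored as a plain list of its field values, an event holding 2 records with 3 fields each looks like a regular 2 x 3 nested list, and an array built from it without the structured dtype would be inferred as shape (2, 3). The sketch below only illustrates that inference with the standard library; nested_shape is a hypothetical helper and the record values are made up.

```python
import json


def nested_shape(value):
    """Shape of a regular nested list, following the first element at each level."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


# Two records with three fields each (name, age, weight); values are made up.
records = [("alice", 30, 61.5), ("bob", 41, 80.0)]

# After JSON transport each record is a plain list and the field structure is lost.
decoded = json.loads(json.dumps(records))
inferred = nested_shape(decoded)
print(inferred)  # -> (2, 3)
```

If this is what happens, the described shape of (2,) is correct for the structured array and the extra dimension comes only from the flattened fields.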

When I make an API call to api/v1/metadata/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data, everything looks fine to me. The plan I ran was: RE(count([tdev],30))

{
  "data": {
    "id": "data",
    "attributes": {
      "ancestors": [
        "ef9d45fa-cbaa-4f59-b1b8-018eba762443",
        "primary"
      ],
      "structure_family": "container",
      "specs": [
        {
          "name": "xarray_dataset",
          "version": null
        }
      ],
      "metadata": {
        "descriptors": [
          {
            "configuration": {
              "test_dev1": {
                "data": {},
                "timestamps": {},
                "data_keys": {}
              }
            },
            "data_keys": {
              "test_dev1-test_sig": {
                "source": "soft://test_dev1-test_sig",
                "dtype": "array",
                "dtype_descr": [
                  [
                    "name",
                    "<U10"
                  ],
                  [
                    "age",
                    "<i4"
                  ],
                  [
                    "weight",
                    "<f4"
                  ]
                ],
                "shape": [
                  2
                ],
                "object_name": "test_dev1"
              }
            },
            "name": "primary",
            "object_keys": {
              "test_dev1": [
                "test_dev1-test_sig"
              ]
            },
            "run_start": "ef9d45fa-cbaa-4f59-b1b8-018eba762443",
            "time": 1730984149.8866372,
            "uid": "9ddcdb3d-c38a-4864-89c9-7b7feba12c52",
            "hints": {
              "test_dev1": {
                "fields": [
                  "test_dev1-test_sig"
                ]
              }
            }
          }
        ],
        "stream_name": "primary",
        "attrs": {
          "stream_name": "primary"
        }
      },
      "structure": {
        "contents": {
          "time": {
            "id": "time",
            "attributes": {
              "ancestors": [
                "ef9d45fa-cbaa-4f59-b1b8-018eba762443",
                "primary",
                "data"
              ],
              "structure_family": "array",
              "specs": [
                {
                  "name": "xarray_coord",
                  "version": null
                }
              ],
              "metadata": {
                "attrs": {}
              },
              "structure": {
                "data_type": {
                  "endianness": "little",
                  "kind": "f",
                  "itemsize": 8,
                  "dt_units": null
                },
                "chunks": [
                  [
                    30
                  ]
                ],
                "shape": [
                  30
                ],
                "dims": [
                  "time"
                ],
                "resizable": false
              },
              "sorting": null,
              "data_sources": null
            },
            "links": {
              "self": "http://localhost:8000/api/v1/metadata/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/time",
              "full": "http://localhost:8000/api/v1/array/full/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/time",
              "block": "http://localhost:8000/api/v1/array/block/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/time?block={0}"
            },
            "meta": null
          },
          "test_dev1-test_sig": {
            "id": "test_dev1-test_sig",
            "attributes": {
              "ancestors": [
                "ef9d45fa-cbaa-4f59-b1b8-018eba762443",
                "primary",
                "data"
              ],
              "structure_family": "array",
              "specs": [
                {
                  "name": "xarray_data_var",
                  "version": null
                }
              ],
              "metadata": {
                "attrs": {
                  "object": "test_dev1"
                }
              },
              "structure": {
                "data_type": {
                  "itemsize": 48,
                  "fields": [
                    {
                      "name": "name",
                      "dtype": {
                        "endianness": "little",
                        "kind": "U",
                        "itemsize": 40,
                        "dt_units": null
                      },
                      "shape": null
                    },
                    {
                      "name": "age",
                      "dtype": {
                        "endianness": "little",
                        "kind": "i",
                        "itemsize": 4,
                        "dt_units": null
                      },
                      "shape": null
                    },
                    {
                      "name": "weight",
                      "dtype": {
                        "endianness": "little",
                        "kind": "f",
                        "itemsize": 4,
                        "dt_units": null
                      },
                      "shape": null
                    }
                  ]
                },
                "chunks": [
                  [
                    30
                  ],
                  [
                    2
                  ]
                ],
                "shape": [
                  30,
                  2
                ],
                "dims": [
                  "time",
                  "dim_0"
                ],
                "resizable": false
              },
              "sorting": null,
              "data_sources": null
            },
            "links": {
              "self": "http://localhost:8000/api/v1/metadata/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/test_dev1-test_sig",
              "full": "http://localhost:8000/api/v1/array/full/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/test_dev1-test_sig",
              "block": "http://localhost:8000/api/v1/array/block/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data/test_dev1-test_sig?block={0},{1}"
            },
            "meta": null
          }
        },
        "count": 2
      },
      "sorting": null,
      "data_sources": null
    },
    "links": {
      "self": "http://localhost:8000/api/v1/metadata/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data",
      "search": "http://localhost:8000/api/v1/search/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data",
      "full": "http://localhost:8000/api/v1/container/full/ef9d45fa-cbaa-4f59-b1b8-018eba762443/primary/data"
    },
    "meta": null
  },
  "error": null,
  "links": null,
  "meta": {}
}