Closed by prjemian 2 years ago
You would have to build the inverse lookup table. We have talked about writing the code to do this, but never have.
For now, it sounds like a custom mongoquery might be the most efficient approach.
An easy way to harvest this information at acquisition time is the take_image() plan, which captures the run's uid and learns the HDF5 file name from the resource document: https://github.com/BCDA-APS/bdp_controls/blob/bef79d12d59bcbf78ccca1d8b8214d4c95181a70/qserver/instrument/plans/image_acquisition.py#L86-L98
This is the way we can capture this information for the future. Suggestions to save this info, from the bluesky developers on Slack, include:
Of these, a local text file seems extremely easy.
The text file could be structured, such as YAML, which makes it fast to append new entries and easy to load in Python:
In [30]: import yaml
In [31]: s = """
...: a: 1
...: b: 2
...: """
In [33]: yaml.load(s, yaml.Loader)
Out[33]: {'a': 1, 'b': 2}
Given an HDF5 file name /tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5 from run uid=155d3536-f225-4c17-852a-6367792830f4, the entry would be:
a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4
We assume here that each HDF5 file appears in only a single run uid. If we further assume that these identifiers are truly unique uuid codes, then we can record the swapped pair as well. A search given either the run uid or the HDF5 file base name then finds the other one:
a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4
155d3536-f225-4c17-852a-6367792830f4: a4700b27-2666-44cf-a86f_000
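The two-way entry above could be kept in a plain Python dictionary. This is a minimal sketch, not code from the project: `record_pair` is a hypothetical helper, and it enforces the assumption stated above (one HDF5 file per run uid, and the reverse) by refusing a conflicting entry.

```python
from pathlib import PurePosixPath


def record_pair(xref, hdf5_file, run_uid):
    """Store both directions of the pair; refuse a conflicting entry.

    Hypothetical helper: assumes a one-to-one mapping between
    HDF5 file base name and run uid.
    """
    base = PurePosixPath(hdf5_file).stem  # drop directories and ".h5"
    # Check both directions before changing anything.
    for key, value in ((base, run_uid), (run_uid, base)):
        if xref.get(key, value) != value:
            raise ValueError(f"{key!r} is already paired with {xref[key]!r}")
    xref[base] = run_uid
    xref[run_uid] = base
    return base


xref = {}
record_pair(
    xref,
    "/tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5",
    "155d3536-f225-4c17-852a-6367792830f4",
)

# Search in either direction with the same dictionary.
print(xref["a4700b27-2666-44cf-a86f_000"])
print(xref["155d3536-f225-4c17-852a-6367792830f4"])
```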
If proceeding with a mongoquery, see: https://docs.mongodb.com/manual/reference/operator/query/
In [27]: dl = list(cat.v1[-1].documents())
In [28]: dl[2]
Out[28]:
('resource',
{'spec': 'AD_HDF5',
'root': '/',
'resource_path': 'tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5',
'resource_kwargs': {'frame_per_point': 1},
'path_semantics': 'posix',
'uid': '51d30cff-4580-4dda-a58a-2e05ea724886',
'run_start': '155d3536-f225-4c17-852a-6367792830f4'})
In [29]: dl[3]
Out[29]:
('datum',
{'datum_id': '51d30cff-4580-4dda-a58a-2e05ea724886/0',
'datum_kwargs': {'point_number': 0},
'resource': '51d30cff-4580-4dda-a58a-2e05ea724886'})
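The resource document shown above carries both the file path (`resource_path`) and the run uid (`run_start`), so an inverse lookup table could be built by scanning the documents. This is a minimal sketch: `build_inverse_lookup` is a hypothetical name, and the sample list only imitates the shape of `dl`. With a real catalog, the output of `cat.v1[-1].documents()` could be passed in instead, though scanning every run this way may be slow.

```python
from pathlib import PurePosixPath


def build_inverse_lookup(documents):
    """Map HDF5 file base name -> run uid from (name, doc) pairs."""
    lookup = {}
    for name, doc in documents:
        if name != "resource":
            continue  # only resource documents name the file
        base = PurePosixPath(doc["resource_path"]).stem
        lookup[base] = doc["run_start"]
    return lookup


# Sample data shaped like the documents printed above.
documents = [
    (
        "resource",
        {
            "spec": "AD_HDF5",
            "root": "/",
            "resource_path": "tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5",
            "uid": "51d30cff-4580-4dda-a58a-2e05ea724886",
            "run_start": "155d3536-f225-4c17-852a-6367792830f4",
        },
    ),
    (
        "datum",
        {
            "datum_id": "51d30cff-4580-4dda-a58a-2e05ea724886/0",
            "resource": "51d30cff-4580-4dda-a58a-2e05ea724886",
        },
    ),
]

lookup = build_inverse_lookup(documents)
print(lookup)
```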
Fill out the mongoquery search dictionary {} here:
In [51]: from apstools.utils import db_query
In [52]: db_query(cat, {})
Out[52]: bdp2022:
args:
asset_registry_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
metadatastore_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
name: bdp2022
description: ''
driver: databroker._drivers.mongo_normalized.BlueskyMongoCatalog
metadata:
catalog_dir: /home/beams/JEMIAN/.local/share/intake/
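One possible search dictionary, sketched under assumptions: it matches on the `resource_path` key seen in the resource document and uses the MongoDB `$regex` operator. Whether `db_query()` can reach resource documents is not confirmed here, so the pymongo usage in the comments (including the collection name `resource`) is hypothetical. `resource_query` is an illustrative helper name.

```python
import re


def resource_query(hdf5_base_name):
    """Build a MongoDB filter for resource documents whose path contains the name."""
    # re.escape keeps characters in the file name from acting as regex syntax.
    return {"resource_path": {"$regex": re.escape(hdf5_base_name)}}


query = resource_query("a4700b27-2666-44cf-a86f_000")
print(query)

# Hypothetical use with pymongo, ASSUMING resource documents live in a
# collection named "resource" of the database named in the URI above:
#   doc = client["bdp2022-bluesky"]["resource"].find_one(query)
#   run_uid = doc["run_start"]
```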
Example of writing the YAML file from the take_image() plan:
(bdp2022) jemian@wow ~/.../bdp_controls/qserver $ tail -f xref_image_run.yml
# file: xref_image_run.yml
# created: 2022-03-29 16:03:06.119378
# purpose: cross-reference bluesky run uid and HDF5 file name
00714a91-c33e-4e7b-90fd-2e8f385bebc9: add9e2d0-7f20-419d-a6a8_000
add9e2d0-7f20-419d-a6a8_000: 00714a91-c33e-4e7b-90fd-2e8f385bebc9
c96b08be-bf17-4623-9ee7-062effddbde9: 32b8278b-eded-42c1-85e2_000
32b8278b-eded-42c1-85e2_000: c96b08be-bf17-4623-9ee7-062effddbde9
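A file in this layout could be produced with a few lines of Python. This is a sketch only, not the code used by take_image() (see the link above for that): `append_xref` is a hypothetical helper that writes the comment header once and then appends both directions of each pair. It uses only the standard library, since appending flat `key: value` lines does not need a YAML writer.

```python
import datetime
import pathlib
import tempfile


def append_xref(xref_file, run_uid, hdf5_file):
    """Append both directions of the (run uid, HDF5 base name) pair."""
    path = pathlib.Path(xref_file)
    base = pathlib.PurePosixPath(hdf5_file).stem  # drop directories and ".h5"
    is_new = not path.exists()
    with path.open("a") as f:
        if is_new:
            # Write the header only when the file is first created.
            f.write(f"# file: {path.name}\n")
            f.write(f"# created: {datetime.datetime.now()}\n")
            f.write("# purpose: cross-reference bluesky run uid and HDF5 file name\n")
        f.write(f"{run_uid}: {base}\n")
        f.write(f"{base}: {run_uid}\n")
    return base


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    xref_file = pathlib.Path(tmp) / "xref_image_run.yml"
    append_xref(
        xref_file,
        "00714a91-c33e-4e7b-90fd-2e8f385bebc9",
        "add9e2d0-7f20-419d-a6a8_000.h5",
    )
    text = xref_file.read_text()

print(text)
```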
This is a high priority question for the BDP project.