rucio / rucio

Rucio - Scientific Data Management
http://rucio.cern.ch
Apache License 2.0

Support for "in" and "contains" operations for metadata filter queries #6841

Open maxnoe opened 4 months ago

maxnoe commented 4 months ago

Description

Imagine custom metadata (e.g. using the JSON backend) like this:

did_client.set_metadata_bulk(scope, name1, meta={"categories": ["foo", "bar"], "format": "fits"})
did_client.set_metadata_bulk(scope, name2, meta={"categories": ["bar", "baz"], "format": "ecsv"})

We'd like to support filters with the operations "in" and "contains". For the list-valued metadata item, a "contains" query would look like this:

did_client.list_dids(scope, filters={"categories.contains": "bar"})

or "in" for the format:

did_client.list_dids(scope, filters={"format.in": ["fits", "ecsv"]})

I think the first (contains) is currently impossible. The second form is "only" a more concise version of listing multiple dictionaries:

did_client.list_dids(scope, filters=[{"format": "fits"}, {"format": "ecsv"}])

but this format gets unwieldy pretty fast if combined with other conditions.
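To illustrate why the list-of-dictionaries form grows quickly, here is a minimal client-side sketch that expands the proposed "in" syntax into the equivalent list of plain filter dictionaries. The helper name expand_in_filters and the ".in" key suffix are assumptions taken from this proposal, not part of the current Rucio API.

```python
from itertools import product


def expand_in_filters(filters):
    """Expand hypothetical "key.in" entries into a list of plain filter dicts.

    Every other key is copied unchanged into each resulting dict, so the
    result can be read as an OR over the returned dictionaries.
    """
    suffix = ".in"
    # Conditions that apply to every alternative.
    fixed = {k: v for k, v in filters.items() if not k.endswith(suffix)}
    # Keys carrying a list of allowed values (assumed syntax from the proposal).
    in_keys = [k for k in filters if k.endswith(suffix)]
    choices = [filters[k] for k in in_keys]

    expanded = []
    # One output dict per combination of allowed values.
    for combo in product(*choices):
        group = dict(fixed)
        for key, value in zip(in_keys, combo):
            group[key[: -len(suffix)]] = value
        expanded.append(group)
    return expanded


groups = expand_in_filters({"format.in": ["fits", "ecsv"], "type": "file"})
```

With two "in" conditions of two and three values each, the expansion already yields six dictionaries, each repeating all the other conditions.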

Motivation

Enabling more complex queries on metadata.

Change

Add support for the new operators to the metadata plugins.
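As a rough sketch of the intended semantics (not of how any Rucio metadata plugin is or would be implemented), the two operators could be evaluated against a plain metadata dictionary as shown here. The function names matches and _check are hypothetical, and a real plugin would presumably translate the operators into database queries instead.

```python
_MISSING = object()  # sentinel so that a missing key never equals a real value


def _check(meta, key, expected):
    """Evaluate a single condition; "in" and "contains" are the proposed operators."""
    name, sep, op = key.rpartition(".")
    if sep and op == "in":
        # The stored value must be one of the allowed values.
        return meta.get(name, _MISSING) in expected
    if sep and op == "contains":
        # The stored value must be a list that includes the expected item.
        value = meta.get(name)
        return isinstance(value, (list, tuple)) and expected in value
    # No recognised operator suffix: plain equality on the full key.
    return meta.get(key, _MISSING) == expected


def matches(meta, filters):
    """Return True if meta satisfies filters.

    filters is either one dict (all conditions must hold) or a list of
    dicts (at least one dict must hold), mirroring the forms used above.
    """
    groups = filters if isinstance(filters, list) else [filters]
    return any(
        all(_check(meta, key, expected) for key, expected in group.items())
        for group in groups
    )


meta1 = {"categories": ["foo", "bar"], "format": "fits"}
meta2 = {"categories": ["bar", "baz"], "format": "ecsv"}

both_have_bar = matches(meta1, {"categories.contains": "bar"}) and matches(
    meta2, {"categories.contains": "bar"}
)
```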

bari12 commented 4 months ago

Hi @maxnoe, this is definitely something we want to do. @Geogouz just started working on the metadata this month.