Fix publish/delete of unindexed datasets

dbutenhof commented 1 year ago

PBENCH-1247

The bulk Elasticsearch API operations (update and delete) have never handled unindexed datasets cleanly. The functional tests, for example, expected "publish" of archive-only datasets to fail. This was partly because I did too much "common stuff" inside the bulk action generator. Additionally, a dataset for which indexing fails becomes somewhat "stranded": we expect it to be indexed, but it isn't.

This PR refactors the checks, moving some common operations out of the action generators and adding checks for eligibility and cleanup. We don't prevent publish, for example, just because there's no index map, whether the dataset is archive-only or indexing failed; we still do the necessary finalization.
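
As a rough illustration of the shape of the refactoring (not the actual pbench code), here is a minimal sketch. The `Dataset` class, `generate_actions`, `publish`, and the `send_bulk` callable are all hypothetical names invented for this example; the action dicts are only loosely modeled on the format accepted by the elasticsearch-py bulk helpers. The point is that the generator only generates actions, while the caller decides eligibility and always finalizes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Optional


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset record (not the pbench model)."""

    name: str
    access: str = "private"
    # Maps index name -> document IDs; None for archive-only datasets
    # or datasets where indexing failed.
    index_map: Optional[Dict[str, List[str]]] = None


def generate_actions(dataset: Dataset, access: str) -> Iterator[dict]:
    """Yield one bulk "update" action per indexed document, and nothing else.

    Eligibility checks and finalization live in the caller (assumed design).
    """
    for index, doc_ids in (dataset.index_map or {}).items():
        for doc_id in doc_ids:
            yield {
                "_op_type": "update",
                "_index": index,
                "_id": doc_id,
                "doc": {"authorization": {"access": access}},
            }


def publish(
    dataset: Dataset,
    access: str,
    send_bulk: Callable[[Iterable[dict]], int],
) -> int:
    """Change access on a dataset; return the number of documents updated.

    `send_bulk` is a hypothetical callable that submits the actions and
    returns a count of successes.
    """
    updated = 0
    # Only talk to Elasticsearch when there is something indexed.
    if dataset.index_map:
        updated = send_bulk(generate_actions(dataset, access))
    # Finalize regardless: a missing index map doesn't block the publish.
    dataset.access = access
    return updated
```

With a `send_bulk` that just counts the actions it receives, publishing a dataset with `index_map=None` returns 0 and still updates `access`, while an indexed dataset yields one update per document.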

webbnh commented 1 year ago

The tests look good. I'm assuming based on your comments that you have an update coming, so I'll wait on approving for that.

dbutenhof commented 1 year ago

> The tests look good.

There you go, jinxing it again ...

I found some holes, fixed and cleaned up, and found some tests weren't doing what they were supposed to do.

> I'm assuming based on your comments that you have an update coming, so I'll wait on approving for that.

Yeah.

distributed-system-analysis / pbench

Fix publish/delete of unindexed datasets #3524