Closed gwbischof closed 4 years ago
That's a good question. I think that consumers and readers should not perform any proactive validation. If they hit an unrecoverable error, it's worth doing some extra investigation to provide a specific error message, as this PR does, but I think it is fair to assume that the documents they read are valid unless proven otherwise.
As I mentioned on Slack but will rehash here for the sake of a public record, I think we should validate when the documents are composed (in event_model.compose_run
) and maybe when they are persisted (in the suitcases). But any validation work that we do when the documents are accessed is needlessly repeated every time they are accessed, and the notice also comes too late. It's better to fail when the documents are first created and/or persisted because the user is immediately notified about the problem, and you only pay for the performance overhead once.
So, as far as the Filler and databroker are concerned, I think it's OK to have a situation where malformed documents lead to keys silently not being filled. As a separate matter, we should review our validation functions to ensure that they would flag malformed documents at compose/write time, and then we should comb the existing databases for irregularities and decide what to do about the invalid documents that we find.
[Edit: cc @tacaswell for a gut-check here.]
The old databroker iterates the event['data'].keys() instead of descriptor['data_keys'].keys() (the way it is done here in event model) https://github.com/bluesky/databroker/blob/master/databroker/headersource/mongo_core.py#L56 Doing it this way could prevent it from failing silently.
The gut checks out. Don't fully follow the last paragraph about the Filler though.
I think the code would be simpler if we included the keys of the descriptor, the data, filled, (and the timestamps) with the same text rather than doing the book keeping to raise just the rightly worded exception. All we really know is that something that is consistent is not, but we can not really be sure which of the inconsistent things in right (and going with majority voting is not always correct).
Fixes #151