getodk / central-backend

Node.js based backend for ODK Central
https://docs.getodk.org/central-intro/
Apache License 2.0
50 stars 73 forks source link

Add tests of single submission generating multiple entity.error events #1200

Open matthew-white opened 1 week ago

matthew-white commented 1 week ago

In #1195, I had to skip a few tests because the tests were resulting in more entity.error events than before. This was confusing to me, because I thought that a single submission version could result in at most one entity.error.

However, looking at Entities._processSubmissionEvent(), I don't actually see any logic along those lines. The function will return early if the submission has already resulted in an entity version, but it doesn't check whether the submission has already resulted in an entity.error. I don't think there's even an easy way to do so, as we don't have an index on submissionDefId in the audits table.

A couple of the tests that failed referenced getodk/central#547. Reading that issue, it sounds like we decided to focus on the processing of pending submissions when the approvalRequired flag is toggled. That mechanism won't reprocess a submission if it has already resulted in an entity.error. I think that was intended to prevent unexpected reprocessing of entity updates, but it also has the effect of preventing multiple entity.error events.

However, even though multiple entity.error events aren't possible through that mechanism, there are other ways that a single submission version can result in multiple entity.error events. It can happen if an invalid submission is approved multiple times:

I don't think we necessarily need to lock down this behavior. Maybe it's actually sort of nice to be able to force an individual submission to be reprocessed when the submission resulted in an error. However, while reviewing #1196, @ktuite and I discussed adding tests that document that behavior. That's what this issue is for.

There's another case that's interesting to highlight. Whenever a submission is created, it is at least partially processed for entities. For example, if the submission specifies an entity create, and submission approval is required, then the submission.create event won't result in a new entity — the submission won't be fully processed — but it will still be partially processed. If there's then a submission.update event to approve the submission, the submission will be reprocessed, this time fully. The important thing is that even when a submission is partially processed, it can result in an entity.error for certain basic errors. These are the errors that are checked before Entities._createEntity() and/or Entities._updateEntity() are run. For example, if an entity create is attempted using a form that only specifies entity updates, that will result in an entity.error, even if it's a submission.create event and submission approval is required:

Because these checks are completed early, a submission approval can result in an entity.error even if the entity list doesn't require submission approval. You can see that by repeating the steps above for an entity list that doesn't require submission approval.

The reason why #1195 resulted in more entity.error events is that the validation of the UUID was moved to earlier in Entities._processSubmissionEvent(). The UUID is needed to lock the entity, and the entity is locked before Entities._createEntity() or Entities._updateEntity() is run. Previously, processing a submission.create event would never reach the UUID validation if the entity list required submission approval, given how partial processing works. However, now that the UUID validation happens earlier, at the same time as some other basic checks, a submission.create event can result in an entity.error for an invalid UUID even if submission approval is required.

We could avoid that, reverting to the previous behavior. To do so, we would still attempt to parse the UUID early on in order to lock the entity, but if parsing failed, we would suppress the error. That would allow Entities._createEntity() and Entities._updateEntity() to operate as they did before, failing on an invalid UUID only if they actually reached the validation and didn't return early. However, that would be extra logic, and I don't think it's needed. I think it's OK that even partially processing a submission for entities can result in an entity.error for certain basic errors, and I think it's fine for UUID validation to be one of those basic errors.