Closed vreuter closed 1 year ago
Currently needed to optimally resolve #350
I guess the big picture is this:
Looper can submit multiple pipelines for a sample. each one has a piface, and a schema. Ideally, if a sample fails for one schema, it should only not submit for that pipeline -- the others still work.
However. 99.9% of use cases will be for one pipeline per sample. Given that introducing further logic to looper will increase code complexity, and that I'm looking to decrease complexity, I would advocate that this is a case where it would be fine to just say "validation error" and choke altogether, because that's simple, and it's possible that if we programmed this feature, it may never actually be used in real life. So I would only solve the problem the "right way" if it was:
I think this is unlikely to be the case so I'd just do the easier thing and say "you must be entirely valid to run at all".
In light of these, here's how I'd currently answer the above 3 questions:
a) Yes, on grounds of simplicity (unless it is very simple to do it the other way). b) I'd be fine with changing it so that if schema is not found, we should fail. c) "any"-like is fine for this use case, again on grounds of simplicity.
This approach fails completely on any validation failure. This is the simplest, easiest implementation. Given that we do it this way, maybe we could do the validation on the whole Project level, instead of on individual samples. This would make things simpler, and also speed up processing.
This was closed with #352
As currently stands, failure of validation against any one of the schemas available will cause an early return False from the loop over samples and over schemas. However, any schema that cannot be read will simply be skipped. This seems internally contradictory (i.e., the first requires all(success) while the second permits sample passage if nothing succeeds). I guess there are multiple questions.... a) Do we want to keep the short-circuit behavior, returning False when a single sample fails? b) Do we want to continue allow passage of samples that fail to validate at all (e.g., if every schema fails to be found/read)? c) Should the validation be all or any like? That is, should the sample be allowed to proceed if it validates against a single schema, or only if it validates against all schemas?
@nsheff what do you think?