airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International
35 stars 23 forks source link

airr-tools validate airr stops after reporting first error #632

Open schristley opened 2 years ago

schristley commented 2 years ago

@javh Is it easy to do this and report all errors? It's very tedious having to fix one error at a time, I have to edit and run a conversion script each time, so would like to do bulk changes.

javh commented 2 years ago

Should be. I'll look at it tomorrow. If it worked differently before, then I made a mistake during the refactor.

javh commented 2 years ago

Okay, so here's what's going on with this.

It's not an interface change, it's how Schema.validate_object() works (which I didn't touch). It raises ValidationError upon the first failure. We would have to redesign validate_object() to store all the property errors and then do a single raise with the full list of messages at the end. It looks fixable (I say having not tried yet) with some (not huge) effort.

I'm not opposed to fixing this for v1.4, but because it does look like it's working as intended, my preference is to push this back to a v1.4.1 or something. I could really go either way though.

There's a separate problem with the logic in airr-tools validate in that it exits if there's a validation error in the first file and doesn't check subsequent files. That's an obvious bug and easy to fix. I'll put in a PR for that.

schristley commented 2 years ago

Ok, I see. I was thinking we had it before. It must be the multiple files I was think of. Yeah, modifying validate_object is tricky with the recursion.

schristley commented 2 years ago

I can see how both behaviors are useful. In a batch processing mode, you might want it to stop immediately if invalid, and not waste time validating the rest. So if we add, we should probably have a flag to pick.

javh commented 2 years ago

True. Let's come back to this after v1.4 then.