DDMAL / musiclibs

:guitar: Searching IIIF Manifests

Extract IIIF Validator #69

Closed ahankinson closed 8 years ago

ahankinson commented 8 years ago

Would it be feasible to extract the IIIF Validator out to its own component? It may be useful to others...

agpar commented 8 years ago

Yes. It's pretty much already its own component. It only has 1 dependency (the Python library voluptuous).

Right now it's incomplete - it only checks the syntax of the technical information, metadata, and sequences. So there are a few components that it won't even look at if they appear, since Diva doesn't use them.

I'd want to refactor it and make it easier to customize its behaviour before we released it. Would probably be a few days of work. I think it's a good idea, once we pass the July deadline. One of the problems with the official IIIF validator is it's pretty hard to set up and actually validate documents with it. Ours would be much more lightweight and easy to plug in.

ahankinson commented 8 years ago

So I've changed this to a 'for sure' thing. I think it will be good to develop and publish this as a separate component, so you can add this to the 'to-do' list for post-presentations in July.

Once we get a bunch of manifests loaded up on musiclibs.net, I'm going to send it around to the IIIF mailing list, probably in about a week or so. I'd like to also mention the validator, and that we'll be releasing it separately.

agpar commented 8 years ago

Sounds good. It might actually take quite a bit of work, but there should definitely be time for it!

agpar commented 8 years ago

Right now the validator is sort of bare bones - it doesn't return any warnings (all the SHOULD directives from the API are ignored). This is obviously because we don't need them... We just want to know if we're likely to be able to display the thing.

If we're going to release this, I can see myself going one of two ways:

  1. Keeping it barebones. The only work to do then is to document and make the API for the validator cleaner (a couple of days' work). The validator is good as a pluggable class for validation in an automated system, but not great for people working on their manifest output logic (since they won't get any SHOULD warnings).
  2. Fleshing it out to include the SHOULD directives as warnings. This would be quite a bit more work. The validator would still work well as a pluggable class (we can have a flag to ignore warnings) and is a better tool for those developing their manifest generation systems. I could see this being a week or more of work.

I found myself going for the latter on a local branch, but wondered what you felt about it @ahankinson.

ahankinson commented 8 years ago

No. 2, definitely. Plus points for fancy coloured output (that can be disabled) and logging levels (ERROR, WARN, RECOMMEND?)
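A rough sketch of what the levels and optional colour could look like. Everything here is hypothetical (the `Level` enum, `format_message`, `report`); the only things taken from the thread are the three level names and the requirement that colour can be switched off.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical severity levels, ordered from least to most severe."""
    RECOMMEND = 1
    WARN = 2
    ERROR = 3


# ANSI escape codes: cyan, yellow, red.
ANSI = {Level.RECOMMEND: "\033[36m", Level.WARN: "\033[33m", Level.ERROR: "\033[31m"}
RESET = "\033[0m"


def format_message(level, text, colour=True):
    """Render one message, optionally wrapped in an ANSI colour."""
    line = f"{level.name}: {text}"
    if colour:
        return f"{ANSI[level]}{line}{RESET}"
    return line


def report(messages, min_level=Level.WARN, colour=True):
    """Format every (level, text) pair at or above min_level."""
    return [format_message(lvl, txt, colour)
            for lvl, txt in messages if lvl >= min_level]
```

Calling `report(messages, colour=False)` gives plain text for log files, and lowering `min_level` to `Level.RECOMMEND` shows everything.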

ahankinson commented 8 years ago

We need a catchy name for it. Suggestions for "Validate McValidateface" will be ignored.

agpar commented 8 years ago

https://www.pastemagazine.com/blogs/lists/2014/05/kaiju-a-go-go-every-godzilla-monster-from-lamest-to-coolest.html

agpar commented 8 years ago

Is keeping the easy customizability of the validator important? Right now most validation functions are very bare bones (check if it's a string or a list of strings), so also very easy to override. The more warnings and logging stuff we add, the more difficult it will be to override specific field/block behaviour.

Also, we do corrections while validating with the current validator. This is part of why it's cool as a bare bones thing - you can override the 'check if it's a string or list of strings' function to add a clause saying 'if it's library x, and they put a dict of strings here (which they shouldn't), join it down to a string and return that', and then we can store that correction.

I guess the tradeoff is, instead of just having a 'string or list of strings' function that we can use for label, attribution, related, etc, we'd need a separate validation function for each, OR we'd need to compose functions in a fancy way (pass the validation functions into logging functions). The first is more verbose, the second is harder to customize.

I'm having a bit of trouble envisioning how we can keep the easy over-riding of behaviour, corrections, and a good warning/logging system.
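To make the override-and-correct idea concrete, here is a minimal sketch. It is not the actual validator code: `BaseValidator`, `LibraryXValidator`, `ValidationError`, and the `corrections` list are all made-up names, and the only things taken from the discussion are the shared 'string or list of strings' check and the 'join a dict of strings down to a string' correction.

```python
class ValidationError(Exception):
    """Raised when a value can be neither validated nor corrected."""


class BaseValidator:
    """Bare-bones validator: one shared check reused for several fields."""

    STRING_FIELDS = ("label", "attribution", "related")

    def __init__(self):
        # Each entry is (field, original value, corrected value).
        self.corrections = []

    def str_or_list_of_str(self, field, value):
        if isinstance(value, str):
            return value
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            return value
        raise ValidationError(f"{field}: expected a string or a list of strings")

    def validate(self, manifest):
        cleaned = dict(manifest)
        for field in self.STRING_FIELDS:
            if field in manifest:
                cleaned[field] = self.str_or_list_of_str(field, manifest[field])
        return cleaned


class LibraryXValidator(BaseValidator):
    """Override for a hypothetical library that emits a dict of strings."""

    def str_or_list_of_str(self, field, value):
        if isinstance(value, dict) and all(isinstance(v, str) for v in value.values()):
            corrected = " ".join(value.values())
            self.corrections.append((field, value, corrected))
            return corrected
        return super().str_or_list_of_str(field, value)
```

Because every string-like field goes through the same method, one override covers label, attribution, and related at once. That is exactly the convenience that per-field warning logic would make harder to keep.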

ahankinson commented 8 years ago

I think the validator should be as strict as the spec. Any rewritten output should be documented as an example of how you might customize a validator for real world cases.

I don't know if this is a good suggestion, but would decorators for logging be a good idea here? That way we can raise errors in the strict validator, but someone else might just choose to decorate the method with a warning.
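One way the decorator idea might look, as a sketch only. `error_to_warning`, `check_label`, and `ValidationError` are hypothetical names, and the "label must be a string" rule is just a placeholder, not a statement about what the spec requires. The strict function raises; the decorated version downgrades the same failure to a standard-library warning.

```python
import functools
import warnings


class ValidationError(Exception):
    """Raised by strict validation functions."""


def error_to_warning(func):
    """Wrap a strict check so its errors become warnings instead."""
    @functools.wraps(func)
    def wrapper(value):
        try:
            return func(value)
        except ValidationError as exc:
            warnings.warn(str(exc), UserWarning)
            return value  # pass the unvalidated value through unchanged
    return wrapper


def check_label(value):
    """Strict check: raises on anything that is not a string."""
    if not isinstance(value, str):
        raise ValidationError("label must be a string")
    return value


# Someone who wants leniency decorates the strict check, not rewrites it.
lenient_check_label = error_to_warning(check_label)
```

The strict validator stays as the default, and anyone with messy real-world manifests opts in to leniency per method.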

agpar commented 8 years ago

[image not preserved]

Starting to reach warning critical mass.

agpar commented 8 years ago

Btw the idea to include decorators to coerce errors to warnings, vice versa, and suppress errors is a really good one.

ahankinson commented 8 years ago

Great! Have we got a name and a separate repo yet?

agpar commented 8 years ago

No. Should I create one under the DDMAL lab stuff? What about DDMAL IIIF Validator? It's cool to have the name in the name. Like Standard Meta Language of New Jersey or Berkeley Software Distribution. Also clarity.

ahankinson commented 8 years ago

I'm still partial to something along the lines of "tripoli" since I think it's a neat play on "Triple-Eye," and that someone is going to eventually name their software that in the IIIF community, so why not us? :)

For the examples you gave, they're more commonly known by their initialisms SML and BSD, and if we named it "DDMAL IIIF Validator" people might want to call it "DDMALIIIFV" which is a bit of a mouthful. Or "DIV" which is very close to "Diva" which is already taken in our namespace.

ahankinson commented 8 years ago

The tripoli functional validator, or "Tripoli-F Validator" :)

ahankinson commented 8 years ago

Or "Tripoli Formal Validator"

fujinaga commented 8 years ago

How about “tripoliv”?

(Sent by email on Jul 15, 2016, in reply to Andrew Hankinson's "tripoli" comment above.)

agpar commented 8 years ago

If I could remove the voluptuous requirement, do you think I should? I've had to build around it in so many ways - at this point nearly every shortcoming it has (for our purposes) has been bolstered by my own compensatory code. It also kind of gets in the way when trying to over-ride behaviour, as you need to know about the library and how to create a voluptuous.Schema object.

I think I could remove it pretty easily. It might actually be for the best.

All it is doing now is:

  1. Iterating over keys in a dict I'm providing it and calling my validation functions. This used to be more useful when I was passing it literals and types to validate things, but now I validate everything with my own functions (for easier over-riding), so voluptuous is not doing much that I couldn't do with a for-loop.
  2. Catching thrown Invalid errors and accumulating them (it also adds those neat paths you get on the errors). Since voluptuous does not support warnings, I've implemented almost exactly the same behaviour for our own warnings. My system does exactly the same thing - so at this point, we might as well stop raising errors, and instead accumulate our own Error objects.

After having written this out, my mind is pretty set on removing it.
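For illustration, the two jobs listed above (iterate over keys and call a check, accumulate problems with their paths) could be done without voluptuous roughly like this. All names are hypothetical (`Issue`, `CHECKS`, `validate`, the two check functions), and the rules themselves are placeholders rather than real IIIF requirements.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    level: str    # "error" or "warning"
    path: tuple   # keys leading to the offending value
    message: str


def check_label(value, path, issues):
    # Placeholder rule: record an error instead of raising.
    if not isinstance(value, str):
        issues.append(Issue("error", path, "label must be a string"))


def check_description(value, path, issues):
    # Placeholder rule: record a warning for an empty value.
    if not value:
        issues.append(Issue("warning", path, "description should not be empty"))


CHECKS = {"label": check_label, "description": check_description}


def validate(doc, checks=CHECKS, path=()):
    """Run each check on its key and collect every issue found."""
    issues = []
    for key, check in checks.items():
        if key in doc:
            check(doc[key], path + (key,), issues)
    return issues
```

Errors and warnings share one code path here, so nothing needs to be raised and caught, and a caller who only cares about errors can filter on `level`.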

ahankinson commented 8 years ago

Fewer dependencies are always better, as long as it doesn't reinvent the wheel.

I'm OK with removing it.

agpar commented 8 years ago

That's done. Just need to send out a launch email.

https://github.com/DDMAL/tripoli http://validate.musiclibs.net