java-json-tools / json-schema-validator

A JSON Schema validation implementation in pure Java, which aims for correctness and performance, in that order
http://json-schema-validator.herokuapp.com/
Other
1.63k stars 399 forks source link

Possible to customize messages? #55

Closed dsjellz closed 10 years ago

dsjellz commented 11 years ago

As far as i can see, all error messages are stored as enumerations and cannot be modified via any configuration? Is this correct?

For example, changing the message from 'input is not a valid email address' to 'The value you have entered is not a valid email address'.

Thanks!

fge commented 11 years ago

Hello,

As you say, at this moment it is not possible to customize error messages per se, the only possibility is to write your own pieces and create a customized ValidationConfiguration with these.

Changing this behaviour would require an overridable catalog of messages. Unfortunately, the readily available mechanism in Java (resource bundles) have some limitations in this regard: for instance, if a key does not exist, you cannot tell to try a "default catalog".

dsjellz commented 11 years ago

Thanks for the response. I thought that may have been the case but wanted to check with you before I started trying to solve a problem that didn't exist.

I made some modifications to a fork of this library where the loading configuration would take an optional .setMessageProvider() and then pass it to the ValidationProcessor to be used during error message interpolation, but ran into an issue where the processor was not used during all constraint validations (ie: number min/max length).

I'd love to see this baked into the library, but for the time being I'm just translating the messages you return into the ones i would like them to be. Thanks for your responsiveness as usual!

fge commented 11 years ago

MessageProvider is actually a template message provider, it is used by processors so that a context-dependent template is provided. The content you put into the ProcessingMessage using .message() is something else.

I don't know whether resource bundles would fit the bill. The problem is, if you want to override one message only, you need to copy the full resource bundle file, modify it and make sure it is in the classpath prior to the original one.

fge commented 11 years ago

OK, I have played a little with bundles and I think it can be fit in. Now I need to figure out at which level and how.

fge commented 11 years ago

News: I am in the process of converting all the stack from using enums to using resource bundles.

-core is 80% done; when it is done, I'll do it with -validator, and release a dev version once all the conversion is done (this, and OSGI capabilities, are two quite big changes!)

dsjellz commented 11 years ago

That's awesome news! I am about 80% done building the necessary functionality to convert your messages to the ones I wanted them to be, but if you're going to bake that directly into the library that is obviously preferable.

fge commented 11 years ago

Yup, I must admit that it was shortsighted on my side as well. I didn't even think about the possibility for messages to be customized apart from rewriting syntax checkers etc... Now it will be much more flexible.

I am now done with the whole of -core. Note however that the next version of -core and -validator will still be a dev version as far as I'm concerned since the middle number is odd... I have one more very important thing to do before 1.2.0 (core)/2.2.0 (validator) hit the shelves. Or maybe I'll postpone that to the next version, I don't know yet.

I am late on the roadmaps...

dsjellz commented 11 years ago

Awesome! I will check this out over the next day or so.

dsjellz commented 11 years ago

Perhaps I am missing something, but I don't see a way to provide my own properties files to use for error messages. It seems like ValidationBundles.java has them hardcoded, which means I would still need to modify your library in order to change the error messages.

I guess in my mind I thought you would have an option in ValidationConfigurationBuilder to supply alternative properties files for the 4 different types of error messages you provide (validationCfg, factoryCfg, format, validation). Did you envision another way of doing this?

fge commented 11 years ago

Well, I haven't written instructions yet ;)

The plan is basically to provide a new properties file which is a copy of the existing one and modify it at will, and make sure it is loaded first in the classpath... It is probably not an idea solution!

The system can change, since this is still a beta version. What would you envision as an API? This is probably the most difficult part to get right here.

dsjellz commented 11 years ago

I just thought I would try it and see if I can figure it out. I'm pretty excited for this change, so forgive my eagerness :).

I guess I would consider this part of 'ValidationConfiguration', so I would expect there to be properties in ValidationConfigurationBuilder pertaining to each one of the ValidationBundles (validationCfg, factoryCfg, format, validation). Maybe something along the lines of: .setValidationMessages() .setValidationConfigMessages() .setFactoryMessages() .setFormatMessages

For example: final ValidationConfiguration cfg = ValidationConfiguration.newBuilder().setValidationMessages("path/to/validation.properties")

If a user chose not to set any of those messages, it would default to the ones that you have provided. Of course it would be up to the developer setting the validationMessages to make sure that property names you have set up are provided. (ie: ARRAY_IS_TOO_SHORT, VALUE_NOT_IN_ENUM).

You can of obviously implement this however you'd like as you are the developer, but if you want me to take a swing at it and open a pull request, I can certainly do that.

fge commented 11 years ago

Yes, that's one problem I have: overwriting all values would be necessary... Unfortunately, if you don't provide them all in a ResourceBundle, an unchecked exception is thrown, which is Not Good(tm).

I was rather thinking about allowing to override only some keys, from a property file or map, and build a bundle with the overriden keys. One advantage is that you can also add your own.

The problem remains though that there is the need to document existing keys... I guess a copy of the property file and a link to it in the javadoc would be the way to go -- just as I do for example source code.

Oh, and there is that locale stuff too. Not a simple problem!

Ideally, there is already a library out there doing that... If there were, I'd be glad not to reinvent the wheel! I'll need to look around.

fge commented 11 years ago

Actually, ResourceBundle has .setParent() and .getParent(). Maybe it will be more simple than I believe it to be!

fge commented 11 years ago

Oh, as to pull requests, they are always welcome ;) But the core structure for a bundle needs to go into -core, however.

fge commented 11 years ago

OK, well, there are three solutions:

I am pretty inclined to go with the third option! The lack of a simple message bundle implementation is rather disturbing...

fge commented 11 years ago

Well, the third option it is: https://github.com/fge/msg-simple

Alternatives are just not quite there yet... I'll do a first version quite soon, I already have the API "in my head" ;)

fge commented 11 years ago

OK, it's not going to be very long... I have a basic, map-based message source, I need to add a properties-based source and a first version will be basically ready!

I am quite surprised not to find any simple library as the one I am doing...

fge commented 11 years ago

I am ready to release 0.1 ;)

Care to comment on the current status of the API? It is quite simple, there is no i18n support yet (planned for the next versions), but this is a usable, tested implementation ;)

dsjellz commented 11 years ago

Seems very straightforward and clear to me. Seems easy to prepend/append sources, and load from resource/path.

As you say, it is quite simple, but seems solid to me. Nice work in such a short amount of time.

fge commented 11 years ago

OK, I'm starting the migration now. It will take some time: in particular, I have to change all dictionaries for syntax checkers/format attributes, the constructors for keyword validators (which are called by reflection), etc etc.

fge commented 11 years ago

All done, both for -core and -validator, but there is some work to do.

In particular, I'd like to unify the key style. Right now there are keys likeThis and keys LIKE_THIS, and also keys like.ThisToo.

And I've yet to decide on a standard before I spill out the next devel version of -validator. Do you have any lights on this?

fge commented 11 years ago

OK, there are still things I need to do.

For one, messages are for now located at the top of the hierarchy (/) and that is not good. I'll have to move them around...

And I have yet to decide on a policy. Note that in the near future, msg-simple will be extended to allow i18n. I may therefore switch -core and -validation to use the i18n versions before 2.2 is out.

fge commented 11 years ago

Hello,

So, some news: msg-simple is now up to 0.2, with i18n bundles; -core already uses this version in master.

I have yet to decide on a policy for keys, though.

RackerWilliams commented 11 years ago

Hey guys. Are there any plans to have messages be formated strings which include message properties. For example, would it be possible to have something like:

VALUE_NOT_IN_ENUM="instance does not match any enum value, I expected one of %s"
ARRAY_IS_TOO_LONG="array has more than %d elements"

Or is that something I need to put together outside the validator?

Thanks

fge commented 11 years ago

@RackerWilliams well, this is a definite possibility, and just as you say. Have a look here:

https://github.com/fge/msg-simple

This project can do .printf(), so additional information can be supplied in the message itself.

Now, if you talk about a specific case, for instance ARRAY_IS_TOO_LONG, right now the full JSON message appears as such:

{
  "level" : "error",
  "schema" : {
    "loadingURI" : "#",
    "pointer" : ""
  },
  "instance" : {
    "pointer" : ""
  },
  "domain" : "validation",
  "keyword" : "maxItems",
  "message" : "array has too many elements",
  "maxItems" : 3,
  "found" : 4
}

So, you do get all the information, but it is scattered across several fields...

As customized exception thresholds are possible, you can throw an exception at Level.ERROR but then the .getMessage() method only outputs the message member value of the above output; which is clearly not enough.

My initial plan was to modify .getMessage() for an exception so that it returns something like:

ERROR: array has too many elements
    schema: {"loadingURI":"#","pointer":""}
    instance:{"pointer":""}
    domain:"validation"
    keyword:"maxItems"
    maxItems:3
    found:4

That is, the level and message fields are on the same line and all the rest on different lines.

Now that msg-simple has reached the state it has, and given that I intend to depend on it, a single-line error message as you intend may be possible...

That is open to debate. I wish to keep the JSON information as it is, so it may turn out that information might be duplicated. As in:

{
  "level" : "error",
  "schema" : {
    "loadingURI" : "#",
    "pointer" : ""
  },
  "instance" : {
    "pointer" : ""
  },
  "domain" : "validation",
  "keyword" : "maxItems",
  "message" : "array has more than 3 elements (found: 4)",
  "maxItems" : 3,
  "found" : 4
}

Comments?

RackerWilliams commented 11 years ago

Having a single summary string message as you've illustrated above would be awesome for my particular use case, I'd vote for that :+1:

My initial plan was to capture exceptions at Level.error grab ahold of the ProcessingMessage from the ProcessingException and create a single summary message myself, but that seems tedious because I need to study all of the validators to see what information is available for each in order to build a good summary message. It's also not clear to me what sort of guarantees I have in terms of the JSON message format when new versions of the validator come out or when new errors are created. I'm still willing to do the work of formatting summary messages on my end, but I'm wondering if the efforts are warranted, maybe my time is better spent helping out on your end?

Let me know...

Thanks,

fge commented 11 years ago

@dsjellz you were the original poster on this issue, what is your take on this?

@RackerWilliams that is a tough call, and to be honest, you and @dsjellz make me have second thougts about error messages and their exploitability. And you are oh-so-right about this:

I need to study all of the validators to see what information is available for each

I can only acknowledge that this is a problem. Should I transform all error messages so that they account for parameters using msg-simple's .printf() capability, documenting the parameters is another problem altogether. Maybe by implementing a @Documented annotation?

I don't know the definitive answer yet; I am open to ideas. At this stage I have yet to update -core with the newest msg-simple version; and, as a consequence, -validator will evolve as well.

NOTE: you refer to message keys above, please bear in mind that as far as I am concerned, they are NOT set in stone... What version are you using?

dsjellz commented 11 years ago

I'd have to give this some thought... To be perfectly honest, my initial intent is to do exactly what @RackerWilliams was doing and return a message similar to: "array has more than 3 elements (found: 4)".

While I knew you were working on adding custom error messages from a properties file, it still seemed like I was going to have to do some manipulation to inject the values supplied from either the schema or the payload into the error message that would ultimately be returned to the user.

I'm currently running the the entire ProcessingResult through a translator that handles this for me, but it was some work to study all the JSON information returned from each Validator. Documenting the information might be helpful, but would also be some work to create/maintain.

fge commented 11 years ago

@dsjellz while it may be some work, it is only fair that it is work that needs to be done; what I am looking for is the most efficient way to do it ;) I do not resent doing "repetitive work" (that is, once per keyword) as long as the end result is agreed upon!

What do you think? This also stands for @RackerWilliams.

RackerWilliams commented 11 years ago

I think there are two issues:

  1. We want a simple summary message that contains all relevant information "array has more than 3 elements (found: 4)"
  2. We want to provide a way to customize/localize the message. "la colección tiene mas de 3 elementos (4 se han encontrado)"

If you want to solve for 2, then you need to deal with compound localized messages. The default method for doing this is suggested here:

http://docs.oracle.com/javase/tutorial/i18n/format/messageFormat.html

The printf solution we discussed earlier would work, but it would mean that the message arguments must always be in the same order:

"array has more than %d elements (found: %d)"

This may not work well in some languages. The format dictated by

http://docs.oracle.com/javase/7/docs/api/java/text/MessageFormat.html

would look something like this

"array has more than {0, number, integer} elements (found: {1, number, integer})"

Here order doesn't matter, I can have a localization that lists the number found before the maximum number. Another nice thing about the format is that it's pretty much self documenting. If I had a property file with those kinds of strings, I could very easily customize / localize it -- so I'm not too worried about the documentation part. That said, you need to maintain the contract for those kinds of messages (arg 0 is max elements, arg 1 is total found).

Honestly, though, just having a solution for 1 would solve my current issue. It may make sense to attack 2 at the same time since it looks like you'll be modifying the same code anyway...if you want to split up the work, I'm more than happy to help with the tedious bits.

BTW, I'm currently using 2.1.4.

fge commented 11 years ago

@RackerWilliams msg-simple has locale support ;) So localized messages are not the problem here. If no locale is specified, it will take the current JVM locale and "descend" (ie, pt_BR, pt then the root locale) if it can't find the message for the exact locale until it finds one.

And as it uses printf (well, Formatter to be precise), the message will be different (and imho easier to write as well). `Formatter also supports argument positioning:

System.out.printf("arg 2 is %2$d, arg 1 is %1$d", 20, 50);

prints out:

arg 2 is 50, arg 1 is 20

Now, as to using 2.1.4, ouch... This is a beta version, the next stable is supposed to be 2.2.0, but I suppose I have to change my plans now :p

fge commented 11 years ago

OK, status report...

This will also be needed in -core since it has syntax validation. I am done with updating the core messages themselves (misconfiguration, null checks etc), and now I am onto the syntax stuff.

The ProcessingMessage class will have to evolve, though. Right now, there are only .put() methods to add a key/value pair; there would probably be the need of new methods... I am thinking about .putArgument(), which would, in addition to .put()int the value, would also append to an ArrayList<Object> of arguments.

The .getMessage() would then do:

// list supposedly called "args" here
public String getMessage()
{
    final String msg = map.get("message");
    if (msg == null)
        return "(no message)";
    if (args.isEmpty())
        return msg;
    try {
        return new Formatter().format(msg, args.toArray()).toString();
    } catch (MissingFormatArgumentException ignored) {
        return msg;
    }
}

And in order to use it, you would do:

final ProcessingMessage message = new ProcessingMessage()
    .message("Hello %s!") // the message
    .putArgument("greeted", "world") // put, and add as an argument
    .put("parboiled", "rocks!"); // put, but DO NOT add as an argument

message.getMessage(); // returns "Hello world!"

As to i18n, is it up to msg-simple's MessageBundle; it would .getMessage() the correct message given the locale and inject it into ProcessingMessage using its .message() method.

Comments?

fge commented 11 years ago

OK, see here:

https://github.com/fge/json-schema-core/commit/d6d768d0be442cb4b16d7484025728aa369705b5

Now, I have a problem: testing the messages in syntax validation :p At least, once I'll have done that, I'll be able to reuse this for -validator...

What do you think?

RackerWilliams commented 11 years ago

Wow, cool.

Didn't realize the formater supported argument positioning so that settles that concern. As for using 2.1.4, I'm experimenting with JSON validation for now so not worried about using a BETA version -- I'll upgrade eventually.

Looking forward to this making it's way to the validator :-)

fge commented 11 years ago

@RackerWilliams this will have to wait a little ;) I have only "just" finished to put the testing infrastructure in place, and am now starting to convert all validators.

One thing I am still not sure about is how to document these messages. It could be in the javadoc, of course, but I am starting to feel this is not quite the right place... I'm open to ideas.

RackerWilliams commented 11 years ago

No rush. I'm just previewing stuff for now, I'm just glad you're working on the feature :-)

As for documenting the messages, I honestly think that the default properties file is the place to do it. All the messages are in a single place and you're guaranteed the data is up to date. You should be able to tell just by looking at it how things work..besides property files can have comments right?

fge commented 11 years ago

OK, done! This has taken quite some time, but now ALL messages are converted to the new message bundles, therefore i18n/printf() capable etc. The full list is here.

Parameters and their order are not documented but they should be quite obvious from the contents alone...

In order to add a locale, it is the same principle as for a classical ResourceBundle, but the property files have two differences which I have already mentioned:

Note: a non existing key does not throw an exception, and neither is a missing/invalid format parameter.

The default message source is available by calling .getValidationMessages() on a ValidationConfiguration (or MessageBundleFactory.getBundle(JsonSchemaValidationBundle.class), this is the default). See here for changing a bundle. Quickpatching via a Map is possible.

Hope this helps!

RackerWilliams commented 11 years ago

This is awesome. Thank you!

I'll be integrating with 2.1.5 later on today ;-)

fge commented 11 years ago

Note that the next version of msg-simple will have loading message sources with configurable expiry, too (tested!). Which means your messages may even change on the fly... Some applications actually require that and it was only 100 lines of code (not counting tests) to get it done ;)

dsjellz commented 11 years ago

This is fantastic! I will do some messing around with this over the next few days and see how it works.

fge commented 11 years ago

There is better ;) I have now coded the full support (always in msg-simple) to be backwards-compatible with a ResourceBundle: I can read property files from any encoding, and use MessageFormat. All tested!

I'll release a version of it in a few hours.

aruizca commented 11 years ago

Why is this issue still open? Is there any documentation about how to customize validation messages.

fge commented 11 years ago

Sorry for the long wait. No there isn't such documentation. Basically, it means using the capabilities of msg-simple. I'll think about how to document it correctly. Probably in the javadoc.

As to why the issue is still open, I am hopefully waiting for user inputs. If you like, I can write a quick post.

fge commented 10 years ago

Done!

iyu-zz commented 8 years ago

I'm working on a use case where this feature ("custom validation error messages") can be very useful. After doing the Json Schema validation and getting a report with list of validation errors, i want to override the messages with a set of custom messages, and then push the messages list to REST API response. By reading the logs here, it looks like this lib + msg-simple may give me what i want. Could you provide a simple example for doing this? Appreciate it.