Closed grncdr closed 9 years ago
Good idea. However default messages should be attached to the type: type names are optional, so
typeMessages: {
Str: ':actual at :path must be a string.',
Identifier: ':actual at :path must contain 1-32 alphanumeric characters, hyphens, underscores or dots.'
}
is not reliable. Something like
Identifier.meta.validation = {message: '...'};
Then you can override the default message on a field basis as usual.
I thought about that, but passing in the custom messages at validation time (instead of type definition time) is valuable because it allows you to localize them. Might be worth supporting both? (e.g. default message attached to the type, but allow per-type overrides to be passed in to validate
)
I also think it's a reasonable trade-off to say that "if you want overridable error messages for a type, it must be named".
Localization...I knew, sooner or later the issue would be raised :)
Well, I think the problem should be solved outside the library. A general solution would be to exploit the meta info of the struct:
var Person = struct({
name: Str,
age: Num
});
// bundles are hard coded, ajax requested, etc
var bundles = {
'en_US': {
Str: ':actual at :path must be a string.',
Identifier: ':actual at :path must contain 1-32 alphanumeric characters, hyphens, underscores or dots.'
},
'it_IT': {
...
}
};
function getMessages(type, locale, overrides) {
// here you can write the best logic for your app
var messages = {};
var bundle = bundles[locale];
var props = type.meta.props;
for (var field in props) {
var type = props[field];
var name = t.util.getName(type);
messages[field] = overrides[field] || bundle[name];
}
return messages;
}
validate(value, Person, {messages: getMessages(Person, 'en_US')});
validate(value, Person, {messages: getMessages(Person, 'it_IT')});
What do you think?
Well, I think this issue [error localization] should be solved outside the library.
Absolutely, I'm not proposing a system for localizing error messages, just overriding them. That one enables the other doesn't mean it's the only use case. For example, it's also useful to override error messages for named types simply so that messages are consistent throughout all validations within a single locale. While one could do this by modifying all the built-in types:
var t = require('tcomb');
t.Str.meta.validation.message = "my custom message"
... to me that feels a lot messier than simply passing in a data-structure that the library will prefer over it's built-in messages.
Regarding the solution proposed above, the algorithm needs to be made recursive (so that the types of nested records/lists/etc are inspected for custom error messages), and it's building up a lot of garbage for each validate
call. (though a more "real world" solution could of course cache these things).
I can understand the desire to keep this module small, so maybe I will write a separate module that generates & caches the custom messages for validate
, but IMO it's simpler to support this functionality once here.
Absolutely, I'm not proposing a system for localizing error messages
I do. It would be great if we can find a general solution.
it's also useful to override error messages for named types simply so that messages are consistent throughout all validations within a single locale
I agree.
While one could do this by modifying all the built-in types
Honestly at the moment I'm not sure t.Str.meta.validation.message = "my custom message"
would be a decent solution. However, as you stated, it's more correct to attach a validation message to a type rather than a field, since the validation fails because of the type not of the field. Types carry informations and meanings. Can you show me an example where two fields with the same type need different error messages? Just to understand your real world use case.
to me that feels a lot messier than simply passing in a data-structure that the library will prefer over it's built-in messages
Ok, but in that case if you use the same type in different validations you must still pass around that data structure.
Regarding the solution proposed above, the algorithm needs to be made recursive
You are absolutely right. This needs to be thought through more.
I can understand the desire to keep this module small, so maybe I will write a separate module that generates & caches the custom messages for validate
That would be interesting. Anyway let's recap the requisites:
You propose an additional option typeMessages
:
validate(value, type, {
typeMessages: {
Str: ':actual at :path must be a string.',
Identifier: ':actual at :path must contain 1-32 alphanumeric characters, hyphens, underscores or dots.'
}
});
that solves (2) while (3) is already solved by the messages
option, right?
that solves (2) while (3) is already solved by messages, right?
Yes, and the ability to pass in messages
and typeMessages
at validation-time also solves (1) as far as I'm concerned. There's no need to do anything more specifically aimed at localization.
If we had typeMessages
, you wouldn't need the getMessages
function your example above, it just becomes a property access:
var bundles = {
'en_US': {
Str: ':actual at :path must be a string.',
Identifier: ':actual at :path must contain 1-32 alphanumeric characters, hyphens, underscores or dots.'
},
'it_IT': {
...
}
};
validate(value, type, { typeMessages: bundles[locale] })
Can you show me an example where two fields with the same type need different error messages?
Outside of localization, I don't have one. I'd be perfectly happy to have the same error message for the same type in all cases that I can currently think of.
Edit: same error message for the same type in the same call to validation
.
I see. Thus an option is having an hash containing all the types you use across the app and using it in all the validations without regard to the actual type being validated.
Pros:
typeMessages
in a particular validationtypeMessages
with messages
on a field basisCons:
name
(sometimes I'm tempted to make it required..)A question: :actual
should be localized or it isn't worth it?
p.s. I know I'm a little bit pedantic, but I want to be sure
A question: :actual should be localized or it isn't worth it?
If you mean the placeholder string ":actual"
then I can't see how it would be worth it. I'm not sure how you'd localize the value itself, given that it probably comes from a user. :smile:
Also, I haven't been using ":expected"
in my example messages because that way I don't have to worry about whether Num
is universally meaningful. Regarding the pro's and con's I agree 100% with what you have there. I wouldn't make the name
required (yet...) because not every user of tcomb
is going to use tcomb-validation
and even those that do might not care so much about the error messages being good (their loss).
I know I'm a little bit pedantic, but I want to be sure
I greatly appreciate pedantry when it comes to type-checking and validation! :+1:
Great! I'm upgrading the library to tcomb v0.3.1, I'll add this stuff asap.
Let's take it a step further.
If you mean the placeholder string ":actual" then I can't see how it would be worth it. I'm not sure how you'd localize the value itself, given that it probably comes from a user.
This is the problem. Say you have this type:
var Person = struct({
name: Str,
birthDate: Dat
});
and a form returning this value:
var value = {
name: 'giulio',
birthDate: '30/11/1973' // italian localization dd/mm/yyyy
};
value
before pass in to validate
:actual = '30/11/1973'
, not :actual = Fri Nov 30 1973 00:00:00 GMT+0100 (CET)
(Date.prototype.toString)Now take these APIs:
(1) parse(value, type, options) -> Obj
where value
comes from a form (an hash Str -> Str
) and Obj
contains the parsed values. Example:
parse({name: 'giulio', birthDate: '30/11/1973'}) -> {name: 'giulio', birthDate: new Date(1973, 10, 30)} // parse ok
parse({name: 'giulio', birthDate: '30/11/'}) -> {name: 'giulio', birthDate: '30/11/'} // parse KO
(parse
API doesn't exist at the moment)
(2) validate(value, type, options)
(tcomb-validation)
(3) createForm(type, options) -> ..form.. -> value
(tcomb-form, internally uses tcomb-validation)
The 3-uple (value, type, options)
is present in slightly different forms in all the APIs.
It would be nice to unify the treatment of options
. What if all the APIs had a types
option?
var bundles = {
'it_IT': {
Dat: {
i17n: {
// the function used to parse the strings coming from a form (optional)
parse: function (x) { ... },
// the function used to display a value in a form (optional)
format: function () { ... }
},
// the message to display when a validation fails, :actual is formatted with `format` (if specified)
':type': ':actual deve essere una data valida nel formato dd/mm/yyyy'
},
...
},
...
};
parse(value, Person, {types: bundles.it_IT});
validate(value, Person, {types: bundles.it_IT});
createForm(Person, {types: bundles.it_IT});
@grncdr I decided to apply the KISS principle and implement your proposal
Ah sorry, for not responding, I actually have a half-finished response in another tab!
Here's that half-finished response:
I like this proposal, in particular being able to define the per-type parse/format/message properties seems very desirable.
Where I'm not so sure is the separation of parse
, validate
, and the type constructor itself. What about giving parse
the signature (value, Type, options) -> Either[Type, ValidationError]
. That is, parse would return you a fully constructed Type
instance, or the error result from validate
.
Part of the reason for proposing this is that you could accomplish both jobs with a depth-first traversal of the type metadata. A (half-baked) algorithm:
input
is a scalar value or Type
is irreducible, call validate(input, Type, options)
,
Ok(Type(input))
.var properties = map(Type.meta, function (value, key) { return parse(input[key], Type.meta.props[key], options)
Type(properties)
a. combine errors from any failed validation results in properties
into a single result and return itIn addition to a simpler API, this approach should be more performant, since constructing the composite types will only perform a shallow traversal (since each instanceof
check will pass, the code would never descend again into a composite type).
What do you think? Too complex?
So we can merge parse
and validate
:
validation(value: Any, T: Type, [options: Obj]) -> Either(T, ValidationError)
and express the parsing and messages configuration in options
.
It's something I had in my mind after this:
https://github.com/gcanti/tcomb/issues/47#issuecomment-57654777
// options can be something like this:
options: {
types: {
Str: { // irriducible, only message
message: 'insert a string'
},
Dat: { // irriducible, message AND i17n
i17n: {
parse: function (x) {
return ...
},
format: function (date) {
return ...
}
},
message: 'insert a date with format ...'
},
Tags: { // not irriducible, it makes sense only to define a message
message: '...'
}
...
},
messages: {...}
}
// where
var Tags = list(Str, 'Tags');
EDIT: Maybe it makes sense to express a parse function also for Tags
: ex.
var tags = 'tag1,tag2';
function parse(x) {
return x.split(',');
}
:+1:
I could modify Result
to be backward compatible:
var Result = struct({
errors: maybe(list(Err)),
value: Any // <-- new field containing the instance if validation succeeded
}, 'Result');
Yes that would make sense to me, no need for a wrapper around a wrapper, and it's a nice type synonym for Either
with a more meaningful name (in the context).
Hi, I wrote a wiki page to state the various problems I'm facing. Writing down things helps me to reason about the refactoring.
I'm not sure I understand the motivation for trying to generalize transform
? I mean, it would probably be useful, but if it's necessary I didn't understand it.
Another small note on the signature of transform
:
var value = transform(json, [
{from: Str, to: Dat: parse: ..., format: ...},
{from: AddressJSON, to: Address: parse: ..., format: ...},
]);
In this case, would it try to transform every Str
to a Dat
? It seems ambiguous. Unless transform
is necessary to implement the other features, I would put it low on the priority list.
Anyways, I'm more interested in the "Validation" section... :smile:
It seems like you're planning to make the message formatting a separate function from validation, which makes a lot of sense to me!
Considering these signatures:
validate(data, T, options) -> Result(T, list(Error))
getMessage(list(Error), ?) -> ?
It seems like options
are only used for formatting messages, so maybe:
validate(data, T) -> Result(T, list(Error))
getMessage(list(Error), options) -> ?
Keeping the error results minimal also seems like a good idea. I'll see how far I can go with writing a formatter using just type
, actual
, and path
properties, it seems like types may want to include more information than just that, but I can't think of a specific case where type metadata wouldn't be sufficent.
P.S. - Is continuing to give feedback on this issue OK?
Ah, isn't that always how it goes, I realize the answer to my question right after hitting "submit" :smile:
I guess the options for validate would include the parse
for a given type? and this is where transform
starts to seem more appealing...
A little prototype formatter for error messages is here: https://gist.github.com/grncdr/cd3ebfe4d709481357ff
I'm not sure I understand the motivation for trying to generalize transform
I'm trying to modularize everything, splitting the logic of all the libraries in basic operations:
Thus transformations don't touch tcomb-validation at the moment.
In this case, would it try to transform every Str to a Dat?
It's just theory at the moment, but the idea would be to check the pair (Str, Dat)
(something like Scala's implicits), that is apply the transformation only if from = Str
and to = Dat
.
It seems like you're planning to make the message formatting a separate function from validation, which makes a lot of sense to me!
Yes! I refactored the whole codebase (2.0 branch). I'm really satisfied:
validate
returns an instance of type
validate
returns a partial value with all the succesful sub instancesIt seems like options are only used for formatting messages, so maybe:
Yeah. Yesterday I decided to keep completely separated the validation logic and the messages handling and suddenly everything made more sense. No more half backed DSL to handle messages. This is a more powerful combination, as you noticed:
validate(data, T) -> Result(T, list(Error))
getMessage(list(Error), options) -> ?
and it follows the Node.js best practice: "Write small modules that each do one thing"
P.S. - Is continuing to give feedback on this issue OK?
Absolutely! I love to work together.
A little prototype formatter for error messages is here:
:+1:
It's currently possible to customize the messaging for a particular property:
But this gets awkward when you re-use the same type in multiple places:
Instead of repeating the same message (or putting it into a variable and repeating that), It would be much DRYer to define messages for types/predicates instead of fields. This would also give the advantage that one can give more specific error messages:
What do you think?