keleshev / schema

Schema validation just got Pythonic
MIT License
2.88k stars 215 forks source link

Compatibility issue with SchemaErrors between Python 2 and 3. #113

Open bernalrs opened 8 years ago

bernalrs commented 8 years ago

Would appreciate anyone input on this, in a nutshell some Exceptions/Error are instantiated with error message contains u characters as per example below.

assert "Wrong keys 'pager'" in "Wrong keys u'pager' in {u'pager': 1, 'action': u'viewPages'}"

I wrote a test that demonstrate a compatibility issue between Python 2 and 3.
Full log: https://travis-ci.org/bernalrs/schema.

In python 3 this is not the case and I think that is the correct behaviour, it appears that this is cause by %r, Would replacing %r with %s be an acceptable solution in this case, let me know any suggestions on how to address this ?

skorokithakis commented 8 years ago

The only problem is that then you won't get the quotes, so it should be "Wrong keys '%s' in etc", i.e. the quotes should be in the string.

bernalrs commented 8 years ago

Agree we should leave the quotes, alternatively but I am not entirely convience because it is a hack we could always regex replace from u'.' into '.' ?

skorokithakis commented 8 years ago

Isn't that an even bigger hack?

bernalrs commented 8 years ago

indeed. but you get to keep the quotes it's like picking the bad from the worst.

skorokithakis commented 8 years ago

You do the other way too, and you don't end up replacing strings you shouldn't if the key is ['foo'].

bernalrs commented 8 years ago

didn't get your last comment say again ?

skorokithakis commented 8 years ago

Say, for example, that the key is "u'hi'" (that is, a string that contains u'hi'), your solution would turn it into "'hi'", messing with the string itself.

bernalrs commented 8 years ago

Well that is one interpretation and again I am not advocating for the suggested hack at all on the contrary.

Maybe a way to think about to is that u'hi' is a key 'hi' of type Unicode. If no Unicode is supply then we could raise something like:

  1. Expecting Unicode but got string..
  2. If Incorrect Unicode string is submitted then we could say wrong key 'hi2' in. {'hi2':1} where 'hi' is Unicode but represented as a string inside error messages only.

I believe this Error message should be useful to the user to correct data post validation with an eventual resubmitting after taking corrective action.

The u'hi' convention is quite particular to python 2, Furthermore if we were to use the library for validating restful call we could imagine returning the error message with that convention that would be meaningless to other programming languages. Nevertheless coming back to our issue would be good to get the same series of characters in python 2 and 3 with regards to error messages in SchemaErrors

skorokithakis commented 8 years ago

I think that the error string should be strictly aimed towards human consumption. I don't think we should go to great lengths to fix this, my opinion is that we should 1) make it look good to a human 2) consistent enough that tests pass.

I think changing to "Expected '%s' but etc" fulfils both those goals, so I'm not sure extra effort is justified. We certainly don't want to mandate that keys need to be unpythonically strongly typed (because "hi" == u"hi" -> True).

bernalrs commented 8 years ago

Just to clarify are you talking about the same code fragment or that we should not fix this issue ?

raise SchemaWrongKeyError('Wrong keys %s in %r' % (s_wrong_keys, data), e.format(data) if e else None)

Changing %r to %s ?

skorokithakis commented 8 years ago

The above code seems right to me, it wouldn't exhibit the issue you're describing:

>>> 'Wrong keys %s in %r' % (u"foo", {u"foo": u"bar"})
u"Wrong keys foo in {u'foo': u'bar'}"

Are you sure that's the fragment? %s wouldn't render to u"foo", only %r would.

bernalrs commented 8 years ago
$ ipython
Python 2.7.9 (default, Feb  2 2016, 14:47:31)
Type "copyright", "credits" or "license" for more information.

IPython 5.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: 'Wrong keys %s in %r' % (u'bad', {u'bad':1})
Out[1]: u"Wrong keys bad in {u'bad': 1}"

In [2]: 'Wrong keys %s in %s' % (u'bad', {u'bad':1})
Out[2]: u"Wrong keys bad in {u'bad': 1}"

it does for dictionaries with Unicode keys.

skorokithakis commented 8 years ago

Yes, but the issue above is the string, not the dictionary:

"Wrong keys 'pager'" in "Wrong keys u'pager' in

bernalrs commented 8 years ago

As per top post would have been convenient to remove all u characters from the error message including the one in dicts. However I see your point and if you think we have reach a logical conclusion we should close this issue: as will not fix.

Nevertheless, for the readers who might have encounter similar issues now that we have introduce SchemaMissingKeyError we can always catch the exception are reformat the error which is what I am doing now in my own code.

skorokithakis commented 8 years ago

But the issue is that it's a problem for tests if the first string sometimes shows a u and sometimes does not. Have you seen that in your tests? I don't know how that could happen, since the %s should never display a u...

bernalrs commented 8 years ago

If memory serves me right I think I have seen that issue as well may i ll try to put another test case together. Maybe it was a Unicode key wrapper in an Optional function as soon as I have some free time I will write the test case.

skorokithakis commented 8 years ago

Please do. It would be great if someone can replicate it so we can fix it.