Pylons / webtest

Wraps any WSGI application and makes it easy to send test requests to that application, without starting up an HTTP server.
https://docs.pylonsproject.org/projects/webtest/en/latest/

Subtypes of string passed as the webtest user are not handled in check_environ #158

Closed tolomea closed 5 years ago

tolomea commented 8 years ago

This came up when a section of test code resulted in a Django safe string being passed in as the username. Webtest's check_environ checks explicitly for an exact type match, not a subtype match. The relevant webtest lines are:

METADATA_TYPE = PY3 and (str, binary_type) or (str,)

assert type(environ[key]) in METADATA_TYPE, (
        "Environmental variable %s is not a string: %r (value: %r)"
        % (key, type(environ[key]), environ[key]))

Is there a reason why this isn't assert isinstance(environ[key], six.string_types)?
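To illustrate the difference between the two checks, here is a minimal Python 3 sketch. `SafeText` is a hypothetical stand-in for a str subclass such as Django's safe string, not the real class, and the tuple `(str, bytes)` is assumed to match what METADATA_TYPE evaluates to on Py3.

```python
class SafeText(str):
    """Hypothetical str subclass, standing in for a Django safe string."""


user = SafeText("alice")

# Exact-type membership test, in the style of the quoted assert.
exact_match = type(user) in (str, bytes)

# Subtype-aware test, in the style of the proposed isinstance check.
subtype_match = isinstance(user, (str, bytes))

print(exact_match, subtype_match)  # False True
```

The exact-type test rejects the subclass even though it behaves like a string everywhere else, which is what triggers the assertion.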

Also, this only came up on our Py3 build. Digging into that led me to compat.py in django-webtest, which on Py2 effectively casts the safe string to a regular string:

def to_string(s):
    return str(s)

But on Py3 leaves it as a safe string:

def to_string(s):
    if isinstance(s, str):
        return s
    return str(s, 'latin1')

In general I'm not at all sure what that compat file is doing or why.
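A small sketch of why the Py3 branch behaves this way: the function returns its argument unchanged whenever it is already a str instance, so a subclass passes straight through. `SafeText` is again a hypothetical stand-in, and the bytes input is only there to show the other branch.

```python
def to_string(s):
    # Py3 branch as quoted above: str instances (including subclasses)
    # are returned as-is; anything else is decoded as latin1.
    if isinstance(s, str):
        return s
    return str(s, 'latin1')


class SafeText(str):
    """Hypothetical str subclass, standing in for a Django safe string."""


value = to_string(SafeText("alice"))
decoded = to_string(b"caf\xe9")

print(type(value).__name__)  # SafeText
print(decoded)               # café
```

So on Py3 the subclass survives the compat helper and reaches the exact-type check unchanged.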

rooterkyberian commented 5 years ago

I have run into a similar issue: I started preparing my project for the Python 3 transition, but from __future__ import unicode_literals breaks my tests, since unicode is an invalid environment variable per the current check.

digitalresistor commented 5 years ago

The environment is specified in PEP 333 and PEP 3333. It may NOT contain any non-str values; specifically:

On Python platforms where the str or StringType type is in fact Unicode-based (e.g. Jython, IronPython, Python 3000, etc.), all "strings" referred to in this specification must contain only code points representable in ISO-8859-1 encoding (\u0000 through \u00FF, inclusive). It is a fatal error for an application to supply strings containing any other Unicode character or code point. Similarly, servers and gateways must not supply strings to an application containing any other Unicode characters.

Again, all strings referred to in this specification must be of type str or StringType, and must not be of type unicode or UnicodeType. And, even if a given platform allows for more than 8 bits per character in str/StringType objects, only the lower 8 bits may be used, for any value referred to in this specification as a "string".

This basically limits it to latin-1. It is NOT correct to use unicode in the WSGI environment, nor is it valid to return unicode strings as header values; they too are defined as str with the above property.

So the webtest check above is correct in that the environment is not allowed to contain unicode. It could go even further and validate that it only contains code points in latin-1.
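As a rough idea of what such a stricter validation could look like on Python 3, here is a sketch. The function name `is_native_latin1_string` is made up for illustration and is not part of webtest; it combines the exact-type requirement with a test that every code point fits in ISO-8859-1.

```python
def is_native_latin1_string(value):
    """Hypothetical helper: exact str whose code points are all <= U+00FF."""
    if type(value) is not str:
        return False
    try:
        # Encoding to latin-1 fails for any code point above U+00FF.
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


print(is_native_latin1_string("caf\u00e9"))  # True
print(is_native_latin1_string("\u20ac"))     # False (euro sign, U+20AC)
print(is_native_latin1_string(b"bytes"))     # False
```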

gawel commented 5 years ago

I think that's true only for non-dotted keys (wsgi.errors is a stream). But yeah, I think the current implementation is correct too.
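The distinction between dotted and non-dotted keys can be seen with the standard library alone. The sketch below fills a sample environ using `wsgiref.util.setup_testing_defaults` and splits the keys by whether they contain a dot; the variable names are illustrative only.

```python
from wsgiref.util import setup_testing_defaults

environ = {}
setup_testing_defaults(environ)  # populates a minimal WSGI environ

# CGI-style keys (no dot) versus wsgi.* keys (dotted).
plain = {k: v for k, v in environ.items() if "." not in k}
dotted = {k: v for k, v in environ.items() if "." in k}

all_plain_are_str = all(type(v) is str for v in plain.values())
errors_is_str = isinstance(dotted["wsgi.errors"], str)

print(all_plain_are_str)  # True
print(errors_is_str)      # False: wsgi.errors is a file-like stream
```

A string-only assertion therefore only makes sense for the non-dotted keys; entries such as wsgi.errors and wsgi.input hold other object types by design.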