Temptationx / modwsgi

Automatically exported from code.google.com/p/modwsgi

mod_wsgi changes value of HTTPS environment variable #222

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
Set up an HTTPS server and execute the probe script [1] to dump the environment.
See the link below for other details.

What is the expected output? What do you see instead?
Expected the HTTPS value to be 'on' for all files served by Apache2. But the
Python script receives the string value '1' instead.

1. https://issues.apache.org/bugzilla/show_bug.cgi?id=50581

Original issue reported on code.google.com by techtonik@gmail.com on 14 Jan 2011 at 2:24

GoogleCodeExporter commented 9 years ago
In fact I expected all environment variables to be immutable by mod_wsgi.

Original comment by techtonik@gmail.com on 14 Jan 2011 at 2:25

GoogleCodeExporter commented 9 years ago
Apache 2.2.14, Python 2.5.2, mod_wsgi 2.8. From Debian Lenny backports.

Original comment by techtonik@gmail.com on 14 Jan 2011 at 2:26

GoogleCodeExporter commented 9 years ago
Compliant WSGI scripts should not be relying upon the HTTPS variable even being 
set in the WSGI environment dictionary. They are required instead to use the 
wsgi.url_scheme variable as described in the WSGI PEPs (333 & 3333).
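
As a minimal sketch of what that looks like in practice (the response body and the name is_secure are only illustrative), a WSGI application can decide whether the request came in over HTTPS without ever looking at the HTTPS key:

```python
def application(environ, start_response):
    # PEP 3333 requires servers to set wsgi.url_scheme to 'http' or 'https'.
    # The HTTPS key is deliberately ignored here; it may be absent or may
    # hold '1', 'on' or something else depending on the server.
    is_secure = environ['wsgi.url_scheme'] == 'https'

    body = b'secure' if is_secure else b'insecure'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

Because only wsgi.url_scheme is consulted, the same script should behave identically under mod_wsgi, a CGI/FastCGI adapter or any other compliant server.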

So, the better solution is to remove the workaround in mod_wsgi which sets the 
HTTPS variable in the first place so that some Python web applications would 
work. If that then breaks non-compliant WSGI applications, that is the WSGI 
application's problem if it hasn't been updated to use wsgi.url_scheme.

Most code checks for HTTPS, if set, being either '1' or 'on', as what it is 
set to is not consistent across WSGI servers, or didn't used to be.

Reporting this on the Apache bugzilla was wrong as it has nothing to do with 
Apache, and your description will likely just have confused people who were 
likely expecting that you were talking about CGI, which is different to WSGI.

Original comment by Graham.Dumpleton@gmail.com on 15 Jan 2011 at 1:21

GoogleCodeExporter commented 9 years ago
It should also be highlighted that mod_wsgi is NOT changing the value of an 
environment variable. I.e., it isn't touching os.environ, as it does not push 
any WSGI environment variables into the process environment; instead they are 
retained as per-request variables only. If you are using a Python web framework 
which itself pushes WSGI per-request variables into the set of process 
environment variables, it is arguably broken.

So, the title for the issue is wrong anyway as mod_wsgi doesn't touch process 
environment variables.

Original comment by Graham.Dumpleton@gmail.com on 15 Jan 2011 at 1:24

GoogleCodeExporter commented 9 years ago
FWIW, the WSGI specification also in its CGI/WSGI bridge example has:

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

Thus it recognises that different CGI hosting systems set HTTPS differently.
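
The standard library's wsgiref package takes the same tolerant approach. A small sketch, assuming only the documented wsgiref.util.guess_scheme() helper, which adapter code can use to derive wsgi.url_scheme from a CGI-style environment:

```python
from wsgiref.util import guess_scheme

# Try the values that different hosting systems are known to use for HTTPS,
# plus the case where the variable is not set at all.
for value in ('on', '1', 'off', None):
    cgi_environ = {} if value is None else {'HTTPS': value}
    print(repr(value), '->', guess_scheme(cgi_environ))
```

This is meant for code that builds the WSGI environment, not for applications, which should still read wsgi.url_scheme.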

As I said, you should be using wsgi.url_scheme instead.

Original comment by Graham.Dumpleton@gmail.com on 15 Jan 2011 at 4:17

GoogleCodeExporter commented 9 years ago
It doesn't seem right to me that WSGI reinvents things that are already 
considered to be an established practice with other web hosting approaches (CGI 
and FastCGI). This requires an additional compatibility layer from scripts that 
should be run under CGI/FastCGI and WSGI depending on what server is available. 
mod_wsgi overly complicates setup and makes it almost as complex as Java 
(you need to know way too many details just to get your script running 
properly).

As for the '1' value for HTTPS, I'd prefer to see some references naming at 
least one popular web server that returns '1' instead of 'on':
http://bugs.python.org/issue10906

And I still don't like the fact that mod_wsgi is unable to inherit the 
environment "as is" from the server it is running in. I am also unable to debug 
the mod_wsgi problems that I have with Trac running over an HTTPS connection:
https://groups.google.com/d/topic/modwsgi/c8VQBAHm39Y/discussion

You're right that my report on Apache confuses people, but this report is 
technically full and correct from all sides. I couldn't have a priori knowledge 
of how mod_wsgi works and couldn't imagine that an Apache module would make 
modifications to environment variables that belong to the domain of a different 
module. If mod_wsgi doesn't provide a security layer, it shouldn't mess with 
security variables.

I don't (want to) know the gory details of how process vs non-process 
environment variables work. Apache access.log shows that mod_wsgi DOES change 
HTTPS environment variable. Thanks for details, but as a web developer I never 
made a distinction between process, system and some other environment variable, 
and I am not happy to know that from now on I probably should.

Original comment by techtonik@gmail.com on 16 Jan 2011 at 8:05

GoogleCodeExporter commented 9 years ago
In respect of your comment:

"""It doesn't seem right to me that WSGI reinvents things that are already 
considered to be an established practice with other web hosting approaches (CGI 
and FastCGI). This requires an additional compatibility layer from scripts that 
should be run under CGI/FastCGI and WSGI depending on what server is 
available."""

A standard is a standard and if you don't like the WSGI specification then 
don't use it and use raw CGI instead.

For your reference the WSGI specification is documented at:

  http://www.python.org/dev/peps/pep-3333/

If you have problems with the specification and don't like that the only 
authoritative indicator of whether HTTPS is used is wsgi.url_scheme, and that 
this differs from CGI, then go complain on the Python WEB-SIG mailing list. The 
people who wrote the specification read the list and no doubt can tell you why 
wsgi.url_scheme is used rather than HTTPS.

Just be aware that WSGI is agnostic to the underlying web hosting mechanism and 
more often than not operates on an underlying system that is not CGI where 
HTTPS variable doesn't even exist. You are getting fixated on CGI as the only 
system when WSGI can be hosted on many differing things.

Also be aware that mod_wsgi is merely an implementation of WSGI and mod_wsgi 
didn't dictate what it was. There are many other WSGI implementations including 
WSGI adapters for CGI, SCGI, FASTCGI and AJP. In all those cases, any WSGI 
compliant application is meant to check wsgi.url_scheme. What variables any of 
those underlying protocols may define is completely irrelevant; if they are 
even supplied in the WSGI environment, they are informational only and 
shouldn't be relied upon. If you do rely upon CGI variables outside of the 
core set which overlap with those which have meaning in WSGI, then your WSGI 
application will be non-portable across WSGI servers.

As to your comment:

"""And I still don't like the fact that mod_wsgi is unable to inherit 
environment "as is" from the server it is running in."""

In mod_wsgi it does NOT inherit any environment from anything and your 
complaint that it doesn't retain that environment 'as is' has no basis.

You also say:

"""I don't (want to) know the gory details of how process vs non-process 
environment variables work."""

If you did understand and if you knew the difference between process 
environment variables and the distinct WSGI environment variable and what 
Apache actually passes to mod_wsgi, you would see why your complaint is flawed.

Go write a WSGI script which prints out 'os.environ', which is the process 
environment inherited from Apache, and you will see there are no CGI variables 
in there, and this is because it isn't CGI. The WSGI environment dictionary 
which you are complaining about is constructed by mod_wsgi to satisfy the 
requirements of the WSGI specification. It is NOT a mutation of what Apache gives it.
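
A minimal sketch of such a probe script follows. The section headings in the output are only illustrative; the point is that it dumps the process environment and the per-request WSGI dictionary side by side so the two can be compared:

```python
import os


def application(environ, start_response):
    # Process environment: inherited by the process that runs this script.
    lines = ['--- os.environ (process environment) ---']
    lines += ['%s=%r' % item for item in sorted(os.environ.items())]

    # Per-request dictionary: built by the WSGI server for this request only.
    lines.append('--- WSGI environ (per-request dictionary) ---')
    lines += ['%s=%r' % item for item in sorted(environ.items())]

    body = '\n'.join(lines).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

Variables such as REQUEST_METHOD and wsgi.url_scheme should show up only in the second section.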

As to your Trac problem, you provided inadequate information at the time. You 
didn't state which version of mod_wsgi you were using. You didn't state which 
version of Apache you were using. Some versions of Apache have subtle problems 
with SSL connections which likely could have been your problem. Specifically, 
the SSL connection is handled by Apache and not directly by mod_wsgi. So, if 
there is an issue, it is going to be Apache and/or your network. The Chrome 
browser has also over time had various problems, especially if you track head 
and don't use stable versions, and just because Chrome fails doesn't mean it is 
a server side problem. Demonstrate the problem on various browsers and then it 
might have some weight.

I was also on holidays at the time. Overall, because you provided inadequate 
information for anyone to give guidance and didn't follow up later, you didn't 
get an answer.

Original comment by Graham.Dumpleton@gmail.com on 16 Jan 2011 at 9:45

GoogleCodeExporter commented 9 years ago
Final action taken in response to this is that from mod_wsgi 4.0 the HTTPS 
variable will never be passed through to the WSGI environment even if set by 
SetEnv, SetEnvIf or a rewrite rule. Where it is set in the Apache configuration 
manually, that value will be used to override the authoritative WSGI variable 
wsgi.url_scheme. The HTTPS variable will then not be passed through to the WSGI 
environment to stop use/abuse of the HTTPS variable by non conformant WSGI 
applications.
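
The described behaviour can be modelled roughly as below. This is an illustrative Python sketch only, not mod_wsgi's actual code: build_wsgi_environ, apache_vars and connection_is_ssl are hypothetical names, and treating 'on' and '1' as the values that mean HTTPS is an assumption.

```python
def build_wsgi_environ(apache_vars, connection_is_ssl):
    """Hypothetical model of the mod_wsgi 4.0 handling of HTTPS."""
    environ = dict(apache_vars)

    # Default: the scheme follows the actual connection.
    scheme = 'https' if connection_is_ssl else 'http'

    # A manually configured HTTPS value overrides the scheme and is then
    # dropped so applications cannot depend on it.
    https = environ.pop('HTTPS', None)
    if https is not None:
        # Assumption: 'on' (any case) and '1' mean HTTPS, anything else HTTP.
        scheme = 'https' if https.lower() in ('on', '1') else 'http'

    environ['wsgi.url_scheme'] = scheme
    return environ
```

For example, build_wsgi_environ({'HTTPS': '1'}, False) yields a dictionary with wsgi.url_scheme set to 'https' and no HTTPS key.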

For more information see:

  http://code.google.com/p/modwsgi/wiki/ChangesInVersion0400

Original comment by Graham.Dumpleton@gmail.com on 16 Jan 2011 at 10:10