nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0
5.25k stars 322 forks source link

HTTP: Refactoring nxt_h1p_peer_header_send() #1214

Closed hongzhidao closed 2 months ago

hongzhidao commented 2 months ago

Previously, proxy request was constructed based on the r->target field. However, r->target will remain unchanged in the future, even in cases of URL rewriting. To accommodate this, I refactored the handling of the proxy target.

hongzhidao commented 2 months ago

Hi Andrew, For now, r->target is changed during URL rewriting. It will be constant because we need to keep $request_uri unchanged. Here's the change. https://github.com/nginx/unit/pull/1162/files#diff-8f22b5e7ac8aa80082722fd822e96917284058f7aa449aeffec406a5e0520beeL80 Let me know if it's necessary to add something like above to the commit log.

ac000 commented 2 months ago

For now, r->target is changed during URL rewriting. It will be constant because we need to keep $request_uri unchanged. Here's the change.

Yes, but I'm more asking why it remaining unchanged will be an issue for the proxy going forward.

hongzhidao commented 2 months ago

This doesn't really tell me why r->target remaining unchanged will be an issue. Yes, but I'm more asking why it remaining unchanged will be an issue for the proxy going forward.

There is no issue currently for proxy.

Ideally, the $request_uri will always correspond to r->target which happens in nxt_http_variables.c. During URL rewriting, r->target is changed, and the same is for $request_uri. We decided to make $request_uri constant, which means r->target will also be unchanged, even in case of URL rewriting. The logic of touching r->target will be removed. https://github.com/nginx/unit/pull/1162/files#diff-8f22b5e7ac8aa80082722fd822e96917284058f7aa449aeffec406a5e0520beeL80

To make it short, the r->target should be designed to be constant, but Unit needs to pass a changeable URL to upstream server in proxy module. So proxy can't depend on r->target.

About the quoted_target, take the following as an example.

client request: /blah%25blah?a=b
r->path: /blah%blah
r->target:/blah%2Fblah?a=b

quoted_target means it contains percent-encoding char like %25. If it's quoted, it means we need to encode r->path. And we also need to take into arguments account, if there is argument, we need to append it. The logic is something like this:

if (quoted_target || args->length > 0) {
    we need to generate a new URL.
}
ac000 commented 2 months ago

To make it short, the r->target should be designed to be constant, but Unit needs to pass a changeable URL to upstream server in proxy module. So proxy can't depend on r->target.

Right this is what I was after.

So, the proxy needs to see the re-written URL... gotcha.

That's a key bit of information missing from the commit message, please add it.

ac000 commented 2 months ago

About the quoted_target, take the following as an example.

client request: /blah%25blah?a=b r->path: /blah%blah r->target:/blah%2Fblah?a=b

quoted_target means it contains percent-encoding char like %25. If it's quoted, it means we need to encode r->path. And we also need to take into arguments account, if there is argument, we need to append it.

OK, so _quotedtarget means _percentencoded. It's just badly named.

ac000 commented 2 months ago

Hmm, this seems to be doing a lot more than just a refactoring. It seems to be mainly introducing new behaviour. If there is any refactoring in there, it should be done as a separate commit first.

If there isn't any actual refactoring going on, then the commit message should be updated to reflect that.

hongzhidao commented 2 months ago

I think it's just a refactoring patch, there are no functional changes introduced. When proxy was introduced, there is no rewrite feature at that time, it makes sense to use r->target which is equal to r->path + r->args. So what I did here is to use a more proper way of creating proxy sending header.

If there isn't any actual refactoring going on, then the commit message should be updated to reflect that.

Makes sense, added.

Btw, I'm thinking about separating quoted_target related code into a new commit.

ac000 commented 2 months ago

I think it's just a refactoring patch, there are no functional changes introduced.

But it's making use of the quoted_target thing which it wasn't before. Does that not equate to a change in behaviour?

hongzhidao commented 2 months ago

EDIT: This is in relation to HTTP: Introduce quoted target marker in HTTP parsing

Perhaps a bit of text in the commit message about why we need this now and we didn't need it before...

I don't know exactly why, because this part of the code was commented out, probably because it wasn't needed at the time. But I added a note about when it can be used, that's its intention.

hongzhidao commented 2 months ago

I think it's just a refactoring patch, there are no functional changes introduced.

But it's making use of the quoted_target thing which it wasn't before. Does that not equate to a change in behaviour?

I split it into two commits, for the first one, maybe it's a code change rather than refactoring. And I think it's a refactoring for the second commit.

ac000 commented 2 months ago

For me the second patch ((HTTP: Refactoring nxt_h1p_peer_header_send()](https://github.com/nginx/unit/pull/1214/commits/95727a1bae2224b1aa1a2c245583a1429eb14342)) is not a refactoring.

I would either split it up to do the refactor bit, i.e without the quoted_target stuff, then a second patch making use of the quoted_target stuff.

We didn't use quoted_target before, a refactor should not suddenly start using it I think...

Or just admit it isn't a refactor and adjust the commit message...

hongzhidao commented 2 months ago

Or just admit it isn't a refactor and adjust the commit message...

Ok, the same for the first one. Removed "No functional changes".