nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0
5.37k stars 323 forks source link

Python: bytearray body support for ASGI module. #1095

Closed andrey-zelenkov closed 7 months ago

andrey-zelenkov commented 8 months ago

Partially solves the problem: https://github.com/nginx/unit/issues/648

ac000 commented 8 months ago

Heh, Ok, so what's "Partial" about the fix? I.e what's missing?

andrey-zelenkov commented 8 months ago

Heh, Ok, so what's "Partial" about the fix? I.e what's missing?

Is it serious question? Or I need to change something in description? Sometimes I have problems with hints.

ac000 commented 8 months ago

On Fri, 26 Jan 2024 09:51:08 -0800 andrey-zelenkov @.***> wrote:

Heh, Ok, so what's "Partial" about the fix? I.e what's missing?

Is it serious question? Or I need to change something in description? Sometimes I have problems with hints.

Serious question. It says it's a partial fix, why is it only partial?

andrey-zelenkov commented 8 months ago

Serious question. It says it's a partial fix, why is it only partial?

Well, header of the metioned ticket in description is Python ASGI: support bytearray, memoryview in response body and this PR mentions only bytearray support. So, memoryview still needs to be implemented.

ac000 commented 8 months ago

Serious question. It says it's a partial fix, why is it only partial?

Well, header of the metioned ticket in description is Python ASGI: support bytearray, memoryview in response body and this

I'm only going by the commit message. That should give me the information I need without searching elsewhere, sure I can go elsewhere to get more details, but the commit message should contain all pertinent information and stand on its own.

PR mentions only bytearray support. So, memoryview still needs to be implemented.

Looks like that needs to go in the commit message.

andrey-zelenkov commented 8 months ago

I'm only going by the commit message. That should give me the information I need without searching elsewhere, sure I can go elsewhere to get more details, but the commit message should contain all pertinent information and stand on its own.

I agree. However, how does it not conflict with using expressions like "required by subsequent commits" in commit messages (as seen here https://github.com/nginx/unit/commit/52b334acd1a5678ee111d6db73dc02675c21a277 or here https://github.com/nginx/unit/pull/1094/commits/4927f5bf2f7e2a254d1962a3aef8b6134cb08244)? It explicitly requires searching for information somewhere to get more details.

Looks like that needs to go in the commit message.

Sure, will add.

ac000 commented 8 months ago

On Fri, 26 Jan 2024 15:48:45 -0800 andrey-zelenkov @.***> wrote:

I'm only going by the commit message. That should give me the information I need without searching elsewhere, sure I can go elsewhere to get more details, but the commit message should contain all pertinent information and stand on its own.

I agree. However, how does it not conflict with using expressions like "required by subsequent commits" in commit messages (as seen here https://github.com/nginx/unit/commit/52b334acd1a5678ee111d6db73dc02675c21a277 or here https://github.com/nginx/unit/pull/1094/commits/4927f5bf2f7e2a254d1962a3aef8b6134cb08244)? It explicitly requires searching for information somewhere to get more details.

I see no conflict.

Each commit should stand on its own, but all commits in a series should make sense as a whole.

In this PR's case there is only a single commit, so it needs to make sense on its own.

andrey-zelenkov commented 8 months ago

Each commit should stand on its own, but all commits in a series should make sense as a whole.

What do you call "series of commits" in a single branch repo history? And could you please provide me a git command that splits all commits within one branch into such "series of commits"?

ac000 commented 8 months ago

When you submit a pull-request, it consists of 1 or more commits, even GH clearly shows this.

But perhaps this will make it more clear. Taking my arm64 isolation fix PR as an example, if I was submitting that pull-request in an email based workflow for example, I would use the git-request-pull(1) command. E.g

$ git request-pull 02d1984c https://github.com/ac000/unit usermap-fix
The following changes since commit 02d1984c912261a1274534a24a2d94ac7c7abfa7:

  HTTP: Remove short read check in nxt_http_static_buf_completion() (2024-01-20 03:39:57 +0000)

are available in the Git repository at:

  https://github.com/ac000/unit usermap-fix

for you to fetch changes up to aa0912950e4f53c9cd570ada9fa47bc819f819d9:

  Configuration: Remove procmap validation code (2024-01-26 16:26:09 +0000)

----------------------------------------------------------------
Andrew Clayton (4):
      Isolation: Add a new nxt_cred_t type
      Isolation: Use an appropriate type for storing uid/gids
      Configuration: Use the NXT_CONF_VLDT_REQUIRED flag for procmap
      Configuration: Remove procmap validation code

 src/nxt_clone.c           |  6 ++--
 src/nxt_clone.h           |  8 +++--
 src/nxt_conf_validation.c | 76 ++++-------------------------------------------
 src/nxt_isolation.c       |  6 ++--
 4 files changed, 16 insertions(+), 80 deletions(-)

That pretty much gives you all the information you need (I would just flesh it out with some text describing it in more detail, like when you submit the PR on GH...)

But let's see. It tells us the point from which these changes start (02d1984c912261a1274534a24a2d94ac7c7abfa7), it clearly shows there are four commits in the pull-request and even shows a nice diffstat.

If you were to pull these changes to your local machine (and GH handily shows exactly how to do this) I.e

$ git checkout -b ac000-usermap-fix master
$ git pull git@github.com:ac000/unit.git usermap-fix

You can now see exactly what this looks like and use the various git commands to view it, e.g

$ git log -4 -p --reverse

However submitting a PR like that would generally only happen after the code in question has been reviewed. Before that, you would have submitted them as a patch series, again git has you covered.

$ git format-patch -4 --cover-letter
0000-cover-letter.patch
0001-Isolation-Add-a-new-nxt_cred_t-type.patch
0002-Isolation-Use-an-appropriate-type-for-storing-uid-gi.patch
0003-Configuration-Use-the-NXT_CONF_VLDT_REQUIRED-flag-fo.patch
0004-Configuration-Remove-procmap-validation-code.patch

So we have our patches and a cover letter, ready for emailing to the mailing list. I won't list all the patch contents here, but the cover letter basically has the information from the pull-request

$ cat 0000-cover-letter.patch 
From aa0912950e4f53c9cd570ada9fa47bc819f819d9 Mon Sep 17 00:00:00 2001
From: Andrew Clayton <a.clayton@nginx.com>
Date: Sat, 27 Jan 2024 13:34:35 +0000
Subject: [PATCH 0/4] *** SUBJECT HERE ***

*** BLURB HERE ***

Andrew Clayton (4):
  Isolation: Add a new nxt_cred_t type
  Isolation: Use an appropriate type for storing uid/gids
  Configuration: Use the NXT_CONF_VLDT_REQUIRED flag for procmap
  Configuration: Remove procmap validation code

 src/nxt_clone.c           |  6 ++--
 src/nxt_clone.h           |  8 +++--
 src/nxt_conf_validation.c | 76 +++------------------------------------
 src/nxt_isolation.c       |  6 ++--
 4 files changed, 16 insertions(+), 80 deletions(-)

-- 
2.43.0

Once you've edited the cover letter you are ready to email them, again git...

$ git send-email --to="Andrew Clayton <a.clayton@nginx.com>" 000*

Exact command used will vary... This will produce an email thread like

   1 O   Jan 27 Andrew Clayton  (  21) [PATCH 0/4] *** SUBJECT HERE ***
   2 O   Jan 27 Andrew Clayton  (  33) ├─>[PATCH 1/4] Isolation: Add a new nxt_cred_t type
   3 O   Jan 27 Andrew Clayton  ( 178) ├─>[PATCH 2/4] Isolation: Use an appropriate type for storing uid/gids
   4 O   Jan 27 Andrew Clayton  (  38) ├─>[PATCH 3/4] Configuration: Use the NXT_CONF_VLDT_REQUIRED flag for procmap
   5 O   Jan 27 Andrew Clayton  ( 122) └─>[PATCH 4/4] Configuration: Remove procmap validation code

Again, you can clearly see there are four patches in this series. They can be exported to an mbox file and applied to the repository with git-am(1) for fruther analysis/testing.

andrey-zelenkov commented 8 months ago

It all makes perfect sense and I appreciate your explanations. However, my question did not pertain to pull requests; it was related to the repository history:

"series of commits" in a single branch repo history

How can I determine the series to which a commit belongs in a scenario where the commit has already been pushed to the master branch (whether it occurred last week, last month, or last year)? To illustrate, consider the example I previously provided: https://github.com/nginx/unit/commit/52b334acd1a5678ee111d6db73dc02675c21a277.

Alternatively, let's assume that the pull request from your previous message was rebased and merged. How can I, in a few months after merge, reconstruct the series of commits by examining the master branch of the main repository?

ac000 commented 8 months ago

On Sun, 28 Jan 2024 04:30:29 -0800 andrey-zelenkov @.***> wrote:

It all makes perfect sense and I appreciate your explanations. However, my question did not pertain to pull requests; it was related to the repository history:

"series of commits" in a single branch repo history

How can I determine the series to which a commit belongs in a scenario where the commit has already been pushed to the master branch (whether it occurred last week, last month, or last year)? To illustrate, consider the example I previously provided: https://github.com/nginx/unit/commit/52b334acd1a5678ee111d6db73dc02675c21a277.

Alternatively, let's assume that the pull request from your previous message was rebased and merged. How can I, in a few months after merge, reconstruct the series of commits by examining the master branch of the main repository?

I think you're missing the point. For review your interested in the patch series... after it's merged you have the full history you can look through.

andrey-zelenkov commented 8 months ago

after it's merged you have the full history you can look through.

Sounds merciless towards the blame users! Which is quite strange when we're trying not to end up with a commit history mess at the same time.

ac000 commented 8 months ago

Sounds merciless towards the blame users!

I really don't know what you're on about now...

Which is quite strange when we're trying not to end up with a commit history mess at the same time.

Everything I've been saying is to this very end.

andrey-zelenkov commented 8 months ago

PR is rebased and commit message is updated:

 1:  c8a508d3 !  9:  69107e6a Python: bytearray body support for ASGI module.
    @@ Metadata
      ## Commit message ##
         Python: bytearray body support for ASGI module.

    -    Partially solves the problem:
    +    @filiphanes requested support for bytearray
    +    and memoryview in the request body here:
         https://github.com/nginx/unit/issues/648

    +    This patch implements bytearray body support only.
    +    Memoryview body still need to be implemented.
    +
      ## docs/changes.xml ##
     @@ docs/changes.xml: can be used as a unique request identifier.
      </para>
andrey-zelenkov commented 8 months ago

I really don't know what you're on about now...

Just saying that it becomes not easy to use blame when a major part of the information that comes from it is that a particular commit will be used somewhere in the future. And personally, I see a problem here.

Everything I've been saying is to this very end.

For now, my learning outcomes look like the following (correct me if I'm wrong): the concept of "commit series" with its "make sense as a whole" property only exists in the context of the pull request. Once the PR is merged (using rebase), there is no reliable way to restore the connectedness of the commit series with all its properties (including making sense as a whole) by examining the master branch of the main repository.

In my universe, this inevitably leads to a mess in the commit history, which is partially confirmed by the difficulty in using blame.

ac000 commented 8 months ago

Once it's been merged, when your looking through the commits, you'll see the thing being introduced, then within a few commits you'll see it being used... no problems there...

If you're getting hung up on that last line in the commit message, here it is again

commit 9919b50aecb196ff9e005ab3d13689bc011233a7
Author: Andrew Clayton <a.clayton@nginx.com>
Date:   Wed Jan 24 17:21:53 2024 +0000

    Isolation: Add a new nxt_cred_t type

    This is a generic type to represent a uid_t/gid_t on Linux when user
    namespaces are in use.

    Technically this only needs to be an unsigned int, but we make it an
    int64_t so we can make use of the existing NXT_CONF_MAP_INT64 type.

    This will be used in subsequent commits.

    Reviewed-by: Zhidao Hong <z.hong@f5.com>
    Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

You could remove that last line and the commit message is still completely valid and stands on its own, it's just adding a little extra context.

As for your blame scenario, again using the above commit as an example, if your blame points to that commit then you can do (as a simple thing, there of course other more advanced things you can do here)

$ git log --oneline --reverse 9919b50aecb196ff9e005ab3d13689bc011233a7~1..master
9919b50a Isolation: Add a new nxt_cred_t type
f7c9d3a8 Isolation: Use an appropriate type for storing uid/gids
eba7378d Configuration: Use the NXT_CONF_VLDT_REQUIRED flag for procmap
990fbe70 (upstream/master, master) Configuration: Remove procmap validation code

In practice you'd remove the --oneline option. In this case that is all the commits. When you have a lot more after that, it's not really a problem, you just read through the commits you're interested in anyway.

Another handy thing you can do is suppose you're interested in all the commits that either added or removed _nxt_credt

$ git log --oneline --reverse -S"nxt_cred_t" master
9919b50a Isolation: Add a new nxt_cred_t type
f7c9d3a8 Isolation: Use an appropriate type for storing uid/gids
andrey-zelenkov commented 8 months ago

Once it's been merged, when your looking through the commits, you'll see the thing being introduced, then within a few commits you'll see it being used... no problems there...

Yeah, that's more or less what I do, and that's why I used word 'reliable'. Since this approach is based on assumptions (i.e., a few commits away, it being used).

As for your blame scenario, again using the above commit as an example, if your blame points to that commit then you can do (as a simple thing, there of course other more advanced things you can do here)

Yes, I also perform similar exercises (appreciate your examples) from time to time, and every time I do, it feels like something is wrong here. Previously (for other nginx projects), I simply used blame without any additional fuss.

You could remove that last line and the commit message is still completely valid and stands on its own, it's just adding a little extra context.

It seems like we have a slightly different understanding here. The commit in this pull request makes sense to me starting from the very header line, as it contains the motivation for this change (supporting new body type in ASGI applications). And when you look through the diffs of this commit, it's even possible to find an example of how to use this feature. All other information in the commit message is extra for me (e.g., why it only partially fixes some issue, what the memoryview status is, etc) except the link to the issue requesting this feature which explaining motivation even more.

On the other hand, in your commit message above, I don't see any motivation; I can only see the adding of the new type with it's description (really good one but still just a description). The real purpose of this change only manifests itself in subsequent commits when you actually use it, so for me, the last line in this commit message is much more than just a little extra context. It's a major part of information that I am getting from this commit message.

ac000 commented 8 months ago

On Thu, 01 Feb 2024 01:39:07 -0800 andrey-zelenkov @.***> wrote:

On the other hand, in your commit message above, I don't see any motivation; I can only see the adding of the new type with it's description (really good one but still just a description). The real purpose of this change only manifests itself in subsequent commits when you actually use it, so for me, the last line in this commit message is much more than just a little extra context. It's a major part of information that I am getting from this commit message.

I could have added the new type in the same commit as it's used. But generally each commit should make one logical change.

Adding something and using it are two different things...

andrey-zelenkov commented 8 months ago

But generally each commit should make one logical change.

I'm pretty sure you and I have a different meaning to this.

Adding something and using it are two different things...

I would avoid using this argument; otherwise, even writing a function that returns the sum of two numbers would require multiple commits.

ac000 commented 7 months ago
@filiphanes requested support for bytearray
and memoryview in the request body here:
https://github.com/nginx/unit/issues/648

Generally speaking URLs should be enclosed in <>, it's just currently for the Closes: tag we don't.

andrey-zelenkov commented 7 months ago

Generally speaking URLs should be enclosed in <>, it's just currently for the Closes: tag we don't.

Adding it is no problem, I'm just curious: where do these practices come from?

andrey-zelenkov commented 7 months ago
% git range-diff 69107e6a...3688e340
-:  -------- > 1:  ecd57392 Configuration: Add nxt_conf_get_string_dup()
-:  -------- > 2:  bb376c68 Simplify, by calling nxt_conf_get_string_dup()
-:  -------- > 3:  46cef09f Configuration: Don't corrupt abstract socket names
-:  -------- > 4:  9e986704 Configuration: Fix validation of "processes"
1:  69107e6a ! 5:  3688e340 Python: bytearray body support for ASGI module.
    @@ Commit message

         @filiphanes requested support for bytearray
         and memoryview in the request body here:
    -    https://github.com/nginx/unit/issues/648
    +    <https://github.com/nginx/unit/issues/648>

         This patch implements bytearray body support only.
         Memoryview body still need to be implemented.