lpereira / lwan

Experimental, scalable, high performance HTTP server
https://lwan.ws
GNU General Public License v2.0

Implement arm64 assembly coroutine #356

Closed SPFishcool closed 1 year ago

SPFishcool commented 1 year ago

I implemented a coroutine for arm64 architecture using assembly language.

SPFishcool commented 1 year ago

This is my testing/evaluation environment:

$ neofetch --stdout
OS: Arch Linux ARM aarch64 
Host: Apple MacBook Air (13-inch, M2, 2022) 
Kernel: 6.2.0-asahi-11-1-edge-ARCH 

$ lscpu
Architecture:           aarch64
  CPU op-mode(s):       64-bit
  Byte Order:           Little Endian
CPU(s):                 8
  On-line CPU(s) list:  0-7

I compared the heap overhead of the arm64 coroutine and libucontext, using weighttp to generate load and massif to profile the heap.

These are the massif and load-testing commands:

valgrind --tool=massif ./lwan
weighttp -n 100000 -c 100 -t 4 -k localhost:8080

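This is the result with libucontext:

[image: libucontext] https://user-images.githubusercontent.com/102156500/246747331-fde3b767-c579-4844-b2e2-6d0f1e255147.png

and the result with the arm64 coroutine:

[image: arm64] https://user-images.githubusercontent.com/102156500/246747491-317301ca-4256-4edb-af8d-3517441a78f4.png
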
lpereira commented 1 year ago

Thanks a lot for the PR! I've been wondering for a while how Lwan would work on Apple Silicon, but I don't have access to one.

While I'm inclined to accept this, can you help me understand why the memory usage is lower with your patch? Did you notice throughput differences as well?

(If you noticed throughput differences it might be a good idea to send a patch to libucontext too.)

SPFishcool commented 1 year ago

I referred to the x86-64 coroutine code in lwan and to minicoro when implementing it. This approach makes the coro_context struct smaller, so the heap overhead per coroutine is smaller as well.

lpereira commented 1 year ago

Merged, thank you!