lpereira / lwan

Experimental, scalable, high performance HTTP server
https://lwan.ws
GNU General Public License v2.0
5.94k stars, 548 forks

12% of cycles spent percent-decoding URLs #351

Closed lpereira closed 1 year ago

lpereira commented 1 year ago

Replacing url_decode() with strlen() increases Lwan throughput by about 12% when servicing Hello, World responses. The current algorithm is not only very branchy, it also essentially does a byte-by-byte copy of every character that comes into the function, even if no decoding was necessary for that particular string.
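For readers unfamiliar with the pattern being described, here is a minimal sketch of a byte-by-byte in-place percent-decoder. It is not Lwan's actual `url_decode()`; the names `hex_value` and `naive_url_decode` are hypothetical, and the sketch assumes that only `%XX` escapes are decoded and that malformed escapes are rejected with -1.

```c
#include <stddef.h>

/* Hypothetical helper: numeric value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Illustrative only: decodes str in place, returns the decoded length,
 * or -1 on a malformed escape. */
static ptrdiff_t naive_url_decode(char *str)
{
    char *out = str;

    for (const char *in = str; *in; in++) {
        if (*in == '%') {
            int hi = hex_value(in[1]);
            if (hi < 0) return -1;   /* also catches a '%' at end of string */
            int lo = hex_value(in[2]);
            if (lo < 0) return -1;
            *out++ = (char)((hi << 4) | lo);
            in += 2;                 /* skip the two hex digits */
        } else {
            *out++ = *in;            /* stored even when nothing changed */
        }
    }

    *out = '\0';
    return out - str;
}
```

Note that a string without any `%` still pays for one branch and one store per byte, which is the overhead the comment above refers to.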

jvoisin commented 1 year ago

a4fd3a6aaa1a733b28013ef9fcc7e320206632b9 improved the situation, but I'm sure that the two for loops could be merged so that the string doesn't have to be iterated over twice.

lpereira commented 1 year ago

The current version trades expensive byte-wise memory writes for fast byte scans plus fast byte copies. It's a good tradeoff. Merging the two loops would increase the complexity quite a bit; it seems possible, but I don't yet know how to strike a good balance between performance and readability. If you have an idea on how to do this, please send a patch!
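The scan-then-copy idea can be sketched roughly as follows. This is an assumption-laden illustration, not the code from the commit mentioned above: `hex_digit` and `scan_copy_url_decode` are hypothetical names, and only `%XX` escapes are handled. Literal runs are located with `strchr()` and moved with `memmove()` instead of being stored one byte at a time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: numeric value of a hex digit, or -1 if c is not one. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Illustrative only: decodes str in place, returns the decoded length,
 * or -1 on a malformed escape. */
static ptrdiff_t scan_copy_url_decode(char *str)
{
    char *in = strchr(str, '%');

    if (!in)
        return (ptrdiff_t)strlen(str);  /* fast path: no stores at all */

    char *out = in;                     /* bytes before the first '%' stay put */

    for (;;) {
        /* in points at a '%' here */
        int hi = hex_digit(in[1]);
        if (hi < 0) return -1;
        int lo = hex_digit(in[2]);
        if (lo < 0) return -1;
        *out++ = (char)((hi << 4) | lo);
        in += 3;

        /* Scan for the next escape, then move the literal run in one call. */
        char *next = strchr(in, '%');
        size_t run = next ? (size_t)(next - in) : strlen(in);
        memmove(out, in, run);          /* regions may overlap, so not memcpy */
        out += run;

        if (!next)
            break;
        in = next;
    }

    *out = '\0';
    return out - str;
}
```

In this sketch the fast path still walks an escape-free string twice (`strchr()` followed by `strlen()`), which is the kind of double traversal that merging the loops would avoid.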

lpereira commented 1 year ago

It should be looping through the string just once now. I haven't measured the performance yet.
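One way to get a single traversal, shown here only as a hedged sketch and not as the code that was actually committed, is to let `strcspn()` find either the next `%` or the end of the string in the same scan. The names `hex_nibble` and `single_pass_url_decode` are hypothetical, and only `%XX` escapes are handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: numeric value of a hex digit, or -1 if c is not one. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Illustrative only: decodes str in place, returns the decoded length,
 * or -1 on a malformed escape. */
static ptrdiff_t single_pass_url_decode(char *str)
{
    char *in = str;
    char *out = str;

    for (;;) {
        /* Length of the literal run up to the next '%' or the terminator. */
        size_t run = strcspn(in, "%");

        if (out != in)
            memmove(out, in, run);  /* only needed once something was decoded */
        in += run;
        out += run;

        if (*in == '\0')
            break;

        /* in points at a '%' here */
        int hi = hex_nibble(in[1]);
        if (hi < 0) return -1;
        int lo = hex_nibble(in[2]);
        if (lo < 0) return -1;
        *out++ = (char)((hi << 4) | lo);
        in += 3;
    }

    *out = '\0';
    return out - str;
}
```

For input without escapes, the loop body runs once: `strcspn()` reaches the terminator, no bytes are moved, and the returned length falls out of the same scan.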

jvoisin commented 1 year ago

The corresponding commit is f3a4785d5a6467ea20b404c86e2d0da30e93f8f8.