axel-download-accelerator / axel

Lightweight CLI download accelerator
GNU General Public License v2.0
2.87k stars 258 forks source link

http: Fix UrlEncoding for UTF-8 #349

Closed makunterry closed 4 months ago

makunterry commented 3 years ago

A quick fix for UTF-8 url encoding support. Will fix issue: https://github.com/axel-download-accelerator/axel/issues/345

Signed-off-by: Kun Ma makunterry@163.com

makunterry commented 3 years ago

@ismaell Here is testing result, with the same URL from issue https://github.com/axel-download-accelerator/axel/issues/345.

image
ismaell commented 3 years ago

On 04/Apr/2021 06:58, kunm wrote:

@makunterry commented on this pull request.

@@ -389,18 +389,15 @@ decode_nibble(char n) return n - 'A' + 10; }

-inline static char -encode_nibble(char n) -{

  • return n > 9 ? n + 'a' - 10 : n + '0'; -}
  • +static unsigned char

A 16bytes table look up operation should be much more efficient than a ternary operation, generally.

Rarely so, in this case it's both slower and bigger.

I just compiled both for x86_64 GCC 10.2:

Not only the machine code takes about the same cache space, it adds a data load that takes substantial time; compared to that the branch is insignificant.

I doubt you can find any modern CPU where the memory is faster than the branching, and in those cases it would make more sense to produce the table from the compiler, like they do for jump tables.

makunterry commented 3 years ago

@ismaell Thanks for analysis. And I spent some time on the original code, found that if we use correct char type to do the encoding, everything will be fine. Please take a look.

ismaell commented 4 months ago

After much ruminating over the issue, I conclude there's no case where this is useful, URLs should be always pre-encoded, if anything the code is missing checks to reject malformed URLs.