refactor all handleHexChar uses

timotheecour / Nim

Nim is a compiled, garbage-collected systems programming language with a design that focuses on efficiency, expressiveness, and elegance (in that order of priority).

http://nim-lang.org/

Other

2 stars 0 forks source link

refactor all handleHexChar uses #430

Open timotheecour opened 3 years ago

timotheecour commented 3 years ago

refs https://github.com/nim-lang/Nim/pull/16209#issue-530400420

handleHexChar was moved to lib/std/private/decode_helpers.nim in https://github.com/nim-lang/Nim/pull/16209

[ ] compiler/lexer

proc handleHexChar(L: var Lexer, xi: var int; position: range[0..4])

[ ] compiler/rodutils
```
proc hexChar
```

[x] lib/pure/parsecfg

proc handleHexChar(c: var CfgParser, xi: var int)

[x] lib/pure/parsejson

proc handleHexChar(c: char, x: var int): bool

[x] lib/pure/parsesql

proc handleHexChar(c: var SqlLexer, xi: var int)

[x] lib/pure/pegs

proc handleHexChar(c: var PegLexer, xi: var int)

[x] nimsuggest/sexps

proc handleHexChar(c: char, x: var int): bool

[x] std/oids: proc hexbyte*(hex: char): int =
[ ] check if more?

/cc @xflywind

ringabout commented 3 years ago

std/parseutils

parseHex

  while i < last:
    case s[i]
    of '_': discard
    of '0'..'9':
      output = output shl 4 or T(ord(s[i]) - ord('0'))
      foundDigit = true
    of 'a'..'f':
      output = output shl 4 or T(ord(s[i]) - ord('a') + 10)
      foundDigit = true
    of 'A'..'F':
      output = output shl 4 or T(ord(s[i]) - ord('A') + 10)
      foundDigit = true
    else: break
    inc(i)

ringabout commented 3 years ago

std/parsexml

parseEntity

      while true:
        case my.buf[pos]
        of '0'..'9': r = (r shl 4) or (ord(my.buf[pos]) - ord('0'))
        of 'a'..'f': r = (r shl 4) or (ord(my.buf[pos]) - ord('a') + 10)
        of 'A'..'F': r = (r shl 4) or (ord(my.buf[pos]) - ord('A') + 10)
        else: break
        inc(pos)

timotheecour commented 3 years ago

@xflywind low priority but for handleHexChar what do you think of an approach based on precomputed LUT (lookup table) instead of case ... ?

ringabout commented 3 years ago

Yeah, it sounds interesting. If it is efficient enough, we could optimize them in the future.


case '0' .. '9', 'a' .. 'z', 'A'.. 'Z': r = (r shl 4) or LookupTable[char]
else: break

timotheecour commented 3 years ago

simpler:

const hexChars =  {'0' .. '9', 'a' .. 'z', 'A'.. 'Z'}
if c in hexChars: r = (r shl 4) or lookupTable[c]

the question is whether lookupTable and hexChars should be const or let, because const gets pasted, whereas let is addressable; we have to look at codegen / asm and benchmark (refs: https://github.com/nim-lang/RFCs/issues/257 {.addressable.})