nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org

`echo -0x80'i8` prints 128 (which is out of range) and other bugs #18422

Open timotheecour opened 3 years ago

timotheecour commented 3 years ago

Example

when defined case5d:# gitissue D20210703T101319:here
  echo -128'i8
    # GOOD: prints -128 in 1.5.1; parsed as: int8(-128)
    # was CT error in 1.4 which was parsed as: - (128'i8)
  echo 0x80'i8 # -128
  echo "bugs:"
  echo -0x80'i8 # BUG: prints 128 which is out of range for int8
  const a1 = -0x80'i8
  echo a1
  echo ($a1, a1, $type(a1))

  echo "ok:"
  let a2 = a1
  echo a2
  echo ($a2, a2, $type(a2))

  echo "ditto with other types:"
  const a = -0x8000'i16
  echo (a, $a, int16.low, int16.high)

ditto with -0o200'i8

Current Output

-128
-128
bugs:
128
128
("128", -128, "int8")
ok:
-128
("-128", -128, "int8")
ditto with other types:
(-32768, "32768", -32768, 32767)

Expected Output

-128
-128
bugs:
CT error for -0x80'i8 and similar literals

Possible Solution

Tested on Nim 1.5.1 (commit 3ceaf5c1309ac8d4d68e6f39c13b021bcc1b15f4).

proposal

Make the parser report a compile-time (CT) error for hex, octal, and binary literals that carry a minus sign but evaluate to a number >= 0. Such literals are error-prone and serve no purpose. The following would all become CT errors:

  echo -0x80'i8
  echo -0x81'i8
  echo -0x8000'i16
  echo -0x8001'i16
  echo -0o200'i8
  echo -0o201'i8
  # etc
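
The rule in the proposal can be sketched outside the compiler. The following is a minimal illustration in Rust for the 8-bit case only, assuming the digits of a hex/octal/binary literal are read as a two's-complement bit pattern (so 0x80'i8 means -128, as in the example output). The function name `minus_literal_is_error` is hypothetical and does not correspond to anything in the Nim parser.

```rust
/// Hypothetical check for the 8-bit case: `bits` is the raw value of the
/// hex/octal/binary digits, read as a two's-complement pattern
/// (so 0x80 means -128). Returns true if `-<literal>'i8` would be rejected
/// under the proposal because negating it does not give a negative number.
fn minus_literal_is_error(bits: u8) -> bool {
    let value = bits as i8; // reinterpret the bit pattern as a signed value
    match value.checked_neg() {
        None => true,                  // -(-128) does not fit in i8
        Some(negated) => negated >= 0, // e.g. -(0x81'i8) == 127
    }
}

fn main() {
    for bits in [0x80u8, 0x81, 0o200, 0o201, 0x7f] {
        println!("-{:#04x}'i8 rejected: {}", bits, minus_literal_is_error(bits));
    }
}
```

Under this reading, -0x7f'i8 stays legal because its negation is -127, while every literal in the list above is rejected.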
ringabout commented 3 years ago

Slightly related: https://github.com/nim-lang/Nim/issues/4350 ("The range of hexadecimal number")

Go and Rust behave like the strtol function in C:

package main

import (
    "fmt"
    "strconv"
)

func main() {
    j, err := strconv.ParseInt("-80", 16, 8)
    fmt.Println(j)
    fmt.Println(err)

    fmt.Println(int8(-0x80)) // -128
    fmt.Println(int8(0x80))  // overflow

    fmt.Println(-0x8000000000000000)
    fmt.Println(0x8000000000000000) // constant 9223372036854775808 overflows int
    fmt.Println(0x8000000000000001) // ./prog.go:18:14: constant 9223372036854775809 overflows int

}

Rust

pub fn main() {
    println!("num: {}", 0x80i8);
}

error message:

error: literal out of range for `i8`
 --> src/main.rs:2:25
  |
2 |     println!("num: {}", 0x80i8);
  |                         ^^^^^^ help: consider using the type `u8` instead: `0x80u8`
  |
  = note: `#[deny(overflowing_literals)]` on by default
  = note: the literal `0x80i8` (decimal `128`) does not fit into the type `i8` and will become `-128i8`

Maybe the lexer could follow the convention of strtol in the C standard library. Instead of making -0x80'i8 a CT error, the lexer would not interpret the high bit of a hexadecimal literal as the sign bit. Hexadecimal literals would then be lexed consistently with plain decimal literals.

Expected

0x7f'i8 => 127
-0x80'i8 => -128
0x80'i8 => overflow
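
These three expected results match what a strtol-style parser already does elsewhere. As a small illustration, Rust's standard `i8::from_str_radix` treats the minus sign separately from the digits and reads the digits as a magnitude. The wrapper name `parse_hex_i8` is hypothetical; this shows the convention only and says nothing about how the Nim lexer would implement it.

```rust
/// Hypothetical helper: parse a hex string (optional leading '-') into an i8,
/// treating the digits as a magnitude, not as a two's-complement bit pattern.
fn parse_hex_i8(s: &str) -> Option<i8> {
    i8::from_str_radix(s, 16).ok()
}

fn main() {
    println!("{:?}", parse_hex_i8("7f"));  // Some(127)
    println!("{:?}", parse_hex_i8("-80")); // Some(-128)
    println!("{:?}", parse_hex_i8("80"));  // None: 128 does not fit in i8
}
```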

Pros

Related

https://github.com/fmtlib/fmt/issues/235

strformat.fmt doesn't show the sign bit either.

>>> import strformat
>>> fmt"{-128:x}"
-80
timotheecour commented 3 years ago

Here's how we can solve this without breaking code: {.unsignedLiterals: on.}

as suggested here: https://github.com/nim-lang/RFCs/issues/364#issuecomment-872773666