nim-lang / RFCs

A repository for your Nim proposals.
135 stars 26 forks source link

Refinement types - Compile time (and real time) restrictions for values #541

Open ASVIEST opened 6 months ago

ASVIEST commented 6 months ago

Abstract

Add a refinement type that like concepts, but for values

type
  Even* = refinement x
    x mod 2 == 0

Motivation

it makes the type system more strict. This can prevent you from making stupid mistakes in your code. Also we can make value based overloading.

Description

For implementing it maybe can be need NIR.

In refinement type must be use small nim subset (using nim VM)

var ct {.compileTime.} = 10
type
  Type = refinement x
    const anyCode = ct == 10 # can be used any code at compileTime
    anyCode
    let pred = x > 4
    for i in 0..3:
      true
    pred

Refinement type body must be also checked for sat, if is sat, then it must gen Error, like: Refinement type Name dont constraint the type It not possible to prove all nim syntaxes at compileTime, using the SMT. Not proved at CT refinement it must generate hint like refinement not proved at compile time. sample proposal implementation

graph TD;
  TM[types match]
  IsCall{"is call ?"}
  HasCandidates{"has more 1 candidates (more then one  one refinement types can be matched)?"}
  Assert{"real time assert"}

  A[runtime checks off] --> TM
  B[runtime checks on] --> IsCall
  IsCall -->|true| HasCandidates
  IsCall -->|false| Assert
  HasCandidates --> |true| E["generate if's; when more then one refinement matches: real time error. That checks candidates expressions call. No matches also raises real time error"]
  HasCandidates --> |false| Assert --> TM

Compile time proving should use SMT solver like Z3.

Code Examples

type
  Even* = refinement x
    x mod 2 == 0

var x = 3
inc x
raiseAssert x is Even #it's must compiles because we know that x = 4

# =========================================
type
  Even* = refinement x
    x mod 2 == 0

var x = 3
for i in 0..0:
  inc x
raiseAssert x is Even #it can't be proved because loop unrolling is very hard, but is true

# =========================================
import std/algorithm
type
  Sorted = refinement x: openArray[T]
    isSorted(x)
proc binarySearch[T](a: Sorted[T]; key: T): int =
  impl

# =========================================
type
  AtomicNode= refinement x: Node
    x.kind in AtomicNodes

proc getParent(node: not AtomicNode): Node =
  impl
# if you try to getParent of not AtomicNode, you will have an error 
# (if it can't be known at CT, realtime overhead will same that using assert, also you will know that it have overhead via hint).

# =========================================
import std/strutils
type
  Password = refinement x: string
    x.len < 100
    UppercaseLetters in x

# =========================================
import pkg/regex
type
  EmailString = refinement x: string
    const pat = re2"[\w-\.]+@([\w-]+\.)+[\w-]{2,4}"
    x.match(pat)

Backwards Compatibility

No response

omentic commented 6 months ago

See also: #172 and specifically https://github.com/nim-lang/RFCs/issues/172#issuecomment-1604745482

ASVIEST commented 6 months ago

Now i think that for it we need special SMT Immediate representation, that generates from NIR, this IR have all that need on SMT analysis of code. For refirement type it in scope where it used and only before refirement type. Example of this IR

var x = "val" # x = StrVal("val")
var y = 4 # y = IntVal(4)
var z = 2.0 # z = FloatVal(2.0) #FP in z3

import std/rationals
var w = 5//7 # w = Rational(5, 7) # RealVal in z3
var f = 5'u32 // 8 # f = Rational(5, 8); f >= 0

var c = {5, 8} # c = Set({5, 8})

for i in 0..<5:
  y = 7

# Loop{
#  Range { # max and min iter count, can be Unknown, when min == max, loop can be unrolled. when min > 0 we can check invariance of loop body (it preferred than unrolling, because invariant loops becomes 1 expr). 
#    min: 5
#    max: 5
#  }
#  i = IntConst
#  i >= 0
#  i < 5
#  y = IntVal(7)
# }

var s = @["a", "b", "c"] # SeqVal(@["a", "b", "c"])
if (let x = readline(stdin); x) == "test":
  s = @[x]

# If { # can map into z3 if
#  entry {
#   x = StrVal("test")
#   s = SeqVal(@[x])
#  }
#  else {
#   # yes, all elif branches as new if here
#   x = StrConst
#   s = SeqVal(@["a", "b", "c"]) # when some var was modified in some branch, it's must be modified in other branches
#  }

try:
 raise newException(CatchableError, "err")
 y = 5 
except CatchableError:
  discard

# try body analysis for raise algorithm:
# make the smt predictions about try body
# haveErr = BoolVal(false)
# any raise just changed haveErr to BoolVal(true)
# if haveErr can be predicted, just select one predicted branch of try,

Result smtir:

# smtir
x = StrVal("val")
y = IntVal(4)
z = FloatVal(2.0)
w = Rational(5, 7)
f = Rational(5, 8)
f >= 0
var c = SetVal({5, 8})

Loop{
 Range { # max and min iter count, can be Unknown, when min == max, loop can be unrolled
   min: 5
   max: 5
 }
 i = IntConst
 i >= 0
 i < 5
 y = IntVal(7)
}

cond__0 = StrConst
If { # some if's can map into z3 if
cond_ 0 == "test"
entry {
  x = StrVal("test")
  s = SeqVal(@[x])
 }
 else {
  # yes, all elif branches as new if here
  x = StrConst
  s = SeqVal(@["a", "b", "c"]) # when some var was modified in some branch, it's must be modified in other branches
 }
}

haveErr = BoolConst
Try {
  haveErr # is defined after entry
  entry {
   haveErr = BoolVal(true) # goto else
  }
  else {
  }

Then when in refirement type it will simplified because save only need variables as example we have predicate (in module scope) that y > 0 (about y) smtir become

  1. y = IntVal(4)
    Loop{
    Range { # max and min iter count, can be Unknown, when min == max, loop can be unrolled
    min: 5
    max: 5
    }
    i = IntConst
    i >= 0
    i < 5
    y = IntVal(7)
    } # Loop invariant
  2. y = IntVal(4)
    y = IntVal(7)
  3. y = IntVal(7)

    For so simple cases we can do without SMT at all, but it rarely happens. y > 0 => true