nim-lang / RFCs


Standardize all stringification APIs around an allocation-free binary $ operator + variadic strAppend #191

Open timotheecour opened 4 years ago

timotheecour commented 4 years ago

APIs like $(a: MyType) (which are what we use almost exclusively in the stdlib, nimble, etc.) do not compose efficiently because they create temporaries and allocate. Instead, we should build APIs around appending. This gives the best of both worlds.

note

The outplace operator is not relevant to this discussion; we don't want to rely on it for something as common as stringification.

proposal part 1: binary $ operator

the API to stringify MyType shall be:

proc `$`(result: var string, a: MyType)

example

import system except `$`

## overloads of binary `$` for all types we need to support
proc `$`*(result: var string, a: float) = result.addFloat a
proc `$`*(result: var string, a: int) = result.addInt a
template `$`*(result: var string, a: string) = result.add a

template `$`*[T](a: T): string =
  ## the only unary `$` that's ever needed
  var result: string
  result$a
  result
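To illustrate how the appending form composes, here is a minimal sketch for user-defined types. `Point` and `Segment` are hypothetical types invented for this example, and the destination parameter is named `dst` rather than `result`; nothing here is part of the stdlib. The nested stringifier writes straight into the caller's buffer, so no intermediate string is built for the inner values.

```nim
# Hypothetical types, used only for illustration.
type
  Point = object
    x, y: int
  Segment = object
    a, b: Point

# Binary `$` in the proposed style: append to `dst`, return nothing.
proc `$`(dst: var string, p: Point) =
  dst.add "("
  dst.addInt p.x
  dst.add ", "
  dst.addInt p.y
  dst.add ")"

# Composition: reuse the Point overload, appending into the same buffer.
proc `$`(dst: var string, s: Segment) =
  dst $ s.a
  dst.add " -> "
  dst $ s.b

var buf = "seg: "
buf $ Segment(a: Point(x: 1, y: 2), b: Point(x: 3, y: 4))
doAssert buf == "seg: (1, 2) -> (3, 4)"
```

With the unary form, each `Point` would first be rendered into its own string and then copied into the result.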

proposal part 2: variadic strAppend

The following variadic strAppend is always preferable to & combined with $, and often preferable to varargs[string, $].

The strAppend macro below replaces many such patterns while appending every argument directly into a single destination string. In particular, it allows writing a more efficient echo(a: varargs[string, $]) and write(a: File, args: varargs[string, $]) without introducing a bunch of temporaries.

import macros

macro strAppend*(dst: var string, args: varargs[untyped]): void =
  result = newStmtList()
  for ai in args:
    result.add quote do:
      `dst`$`ai`

macro strAppend*(args: varargs[untyped]): string =
  result = newStmtList()
  let dst = genSym(nskVar, "dst")
  result.add quote do:
    var `dst`: string
  for ai in args:
    result.add quote do:
      `dst`$`ai`
  result.add quote do:
    `dst`

These can replace the varargs[string, $] versions of system.echo and system.write:

template echo*(args: varargs[untyped]) =
  ## more efficient than system.echo: no temporaries
  ## could be further optimized using a fixed size buffer when short enough or even via a single threadvar string to do all the appending
  system.write(stdout, strAppend(args, "\n"))

template write*(f: File, a: varargs[untyped]) =
  ## more efficient than `write*(f: File, a: varargs[string, `$`])` in system/io
  system.write(f, strAppend(a))

usage

proc main()=
  var x = 0.3
  var ret: string
  ret$x
  ret$" bar "
  ret$12
  doAssert ret == "0.3 bar 12"
  doAssert $0.3 == "0.3"

  var ret2: string
  ret2.strAppend 0.1, " foo ", 12
  doAssert ret2 == "0.1 foo 12"

  doAssert strAppend(0.1, " foo ", 12) == "0.1 foo 12"
  echo 0.1, " foo ", 12 # prints `0.1 foo 12`

  stdout.write 0.1, " to_stdout ", 12, "\n" # prints 0.1 to_stdout 12

main()

note

I have a WIP that would allow writing strAppend as a template instead of a macro, to avoid depending on macros.nim. In particular, that means strAppend and echo as defined here could live in system.nim. Alternatively, for system.nim minimalists, strAppend could be hidden there, used to redefine echo, and then redefined in some other module. IMO it just belongs in system since it's ubiquitous: it replaces "foo" & $bar both performance-wise and syntactically, with less operator noise than using both & and $.

It works by allowing templates to iterate over a varargs.

krux02 commented 4 years ago

Yes, I agree. I want this. And we almost already have it. Please take a look at strformat.formatValue. It even allows an extendable format specifier and does not have intermediate strings (unless of course the inevitable $ fallback triggers secretly).
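For reference, a small sketch of the existing append-style API mentioned above. It calls the `formatValue` overloads from `std/strformat` directly on one buffer; the variable name `buf` and the values are made up for the example.

```nim
import std/strformat

var buf = ""
# Each call appends to `buf`; the last argument is the format specifier.
buf.formatValue(42, "")          # empty specifier: default formatting
buf.formatValue(" items, ", "")
buf.formatValue(3.14159, ".2f")  # two digits after the decimal point
doAssert buf == "42 items, 3.14"
```

This is the hook that `fmt"..."` and `&"..."` expand to, so a user type gains format-string support by defining its own `formatValue` overload.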

Araq commented 4 years ago

Super nice RFC, one minor correction, strAppend should be called concat IMHO.

Clyybber commented 4 years ago

@disruptek had the awesome idea to transform x = a & b into a.add b; x = a, or equivalently a &= b; x = a. EDIT: A bit simpler: x = (a &= b; a). We can do that whenever a isn't used afterwards (even better when b isn't used afterwards either, since then we can sink it). Maybe this can be accomplished using term-rewriting macros if we give them a way to use last-read/last-use information. EDIT: Unlike this RFC, this wouldn't get rid of all temporaries. One theoretical disadvantage of this RFC (aside from the API complications) is that in-place calls can't be parallelized.
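Written out by hand, the rewrite described in this comment looks roughly like the sketch below. The variable names are illustrative only, and this is just the manual equivalent of the idea, not a compiler transformation. Whether the final assignment copies or moves depends on the memory management mode and on `a` not being used afterwards.

```nim
var a = "foo"
let b = "bar"

# Allocating form: `a & b` builds a fresh string for the result.
let viaConcat = a & b

# Rewritten form: append into `a`, then hand `a` over to `x`.
a.add b
let x = a

doAssert x == "foobar"
doAssert x == viaConcat
```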

Varriount commented 4 years ago

I generally agree with this. I'm a bit hesitant about the new usage of the $ operator, but it makes sense. EDIT: See comment below.

Araq commented 4 years ago

It was noticed on IRC that we already have the binary append operator, and it's written as &=. It doesn't do custom to-string conversions though. In general, people dislike new operators, so maybe we should stick to proc toString(res: var string; x: CustomType), which is currently unused and so doesn't conflict with anything.
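A minimal sketch of what that signature could look like for a user type. `Version` is a hypothetical type made up for this example, and `toString` here is only the name suggested in this comment, not an existing stdlib proc.

```nim
# Hypothetical custom type, for illustration only.
type Version = object
  major, minor: int

# Appending stringifier using the suggested name and signature.
proc toString(res: var string; x: Version) =
  res.addInt x.major
  res.add '.'
  res.addInt x.minor

var s = "nim "
s &= "v"                                # `&=` appends strings only
s.toString(Version(major: 1, minor: 4)) # custom type appended in place
doAssert s == "nim v1.4"
```

A named proc avoids overloading `$` with a second meaning, at the cost of being more verbose at the call site.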

Varriount commented 4 years ago

Perhaps "write" or "writeString" would be better?

The problem with using $ as an in-place binary operator is that it isn't consistent with other in-place operators (&=, +=, etc.). Even using $= isn't perfect, because all existing in-place operators are derived from binary operators or functions (add and +, respectively).

krux02 commented 4 years ago

> I have a WIP that would allow writing strAppend as a template instead of a macro, to avoid depending on macros.nim

Well, what I have had in mind for quite some time now is the ability to forward varargs in a template without importing macros. That would also be really helpful for implementing bitops without macros. And bitops without macros would mean it could be used in sets.nim.

> ## more efficient than system.echo: no temporaries

That is not correct. There is still one temporary per call to echo. In https://github.com/nim-lang/Nim/pull/13277 I have a concept that really has no temporary.

github-actions[bot] commented 10 months ago

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 30 days.