aaaaaa123456789 commented 7 years ago

I'm sure that this has been mentioned before, but I've been coming up with a full feature proposal and I'd like to know how feasible it is. What follows is a proposed way of implementing functions, at least in a way I'd consider usable.

The idea is to allow the user to define functions that would be evaluated in-line when called, similar to the various built-ins. Functions can have one or more arguments, which define variables that are local to the function; they are substituted in the expression. An example of the syntax I have in mind would be:

;definition
addnums(first, second) = ({first}) + ({second})

;usage
  ld hl, wTwoPlusTwo
  ld [hl], addnums(2, 2)

Functions can be defined with the same name as long as they have a different number of arguments; the proper version of the function would be used when called. For simplicity, all arguments are required.

average(one) = {one}
average(one, two) = (({one}) + ({two})) / 2
average(one, two, three) = (({one}) + ({two}) + ({three})) / 3

Declaring functions with the same name and number of arguments is not allowed; such declarations would either replace previous ones or cause an error (either works).

Calling built-in functions should obviously be allowed from user-defined functions:

linearaddress(label) = (BANK({label}) - 1) * $4000 + ({label})

In order to simplify functions like the average one above, varargs functions could be defined. Such functions would be used when there is no suitable non-varargs function (i.e., with the correct number of arguments) to call. For instance:

countargs() = 0
countargs(arg, ...) = 1 + countargs({...})

sum() = 0
sum(arg, ...) = ({arg}) + sum({...})

average(...) = sum({...}) / countargs({...})
roundedavg(...) = (2 * sum({...}) + countargs({...})) / (2 * countargs({...}))

The special symbol {...} is replaced by the list of variable arguments. Defining multiple varargs functions with the same name is not allowed; however, they can coexist with non-varargs functions.
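
To make the recursion concrete, here is a rough Python model of the varargs definitions above. It is only a sketch of the intended semantics, not anything rgbasm does today: Python's `*rest` unpacking stands in for `{...}`, and `total` is a stand-in name for `sum` (renamed only to avoid shadowing the Python built-in).

```python
# Python stand-ins for the proposed varargs functions; *rest plays the role of {...}.

def countargs(*args):
    # countargs() = 0
    # countargs(arg, ...) = 1 + countargs({...})
    if not args:
        return 0
    _first, *rest = args
    return 1 + countargs(*rest)


def total(*args):
    # sum() = 0
    # sum(arg, ...) = ({arg}) + sum({...})
    if not args:
        return 0
    first, *rest = args
    return first + total(*rest)


def average(*args):
    # average(...) = sum({...}) / countargs({...}), with integer division
    return total(*args) // countargs(*args)
```

Each recursive call peels off one argument until the zero-argument version ends the recursion, which is why the non-varargs `countargs()` and `sum()` definitions are needed.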

As a silly example just to show how they would coexist:

fn(a) = {a}
fn(a, b, c) = {c}
fn(a, b, ...) = {b}

  db fn(4) ;4
  db fn(5, 9) ;9
  db fn(2, 6, 3) ;3
  db fn(1, 8, 0, 7) ;8
  ;db fn() is an error because there's no version that takes 0 arguments
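
The lookup rule described above (an exact argument count wins, otherwise fall back to the varargs version) could be modelled roughly like this in Python. The `Overloads` class and its method names are made up for illustration, and rejecting duplicate definitions is just one of the two options mentioned earlier.

```python
class Overloads:
    """Hypothetical registry for one function name: one body per exact
    argument count, plus at most one varargs body."""

    def __init__(self):
        self.fixed = {}        # exact arity -> callable
        self.variadic = None   # (minimum arity, callable) or None

    def define(self, arity, body, varargs=False):
        if varargs:
            if self.variadic is not None:
                raise ValueError("only one varargs definition per name")
            self.variadic = (arity, body)
        else:
            if arity in self.fixed:
                # Assumption: duplicates are an error rather than a replacement.
                raise ValueError("duplicate definition for this arity")
            self.fixed[arity] = body

    def call(self, *args):
        # An exact match on the number of arguments takes priority.
        body = self.fixed.get(len(args))
        if body is not None:
            return body(*args)
        # Otherwise fall back to the varargs version, if enough arguments were given.
        if self.variadic is not None and len(args) >= self.variadic[0]:
            return self.variadic[1](*args)
        raise TypeError(f"no version takes {len(args)} argument(s)")


fn = Overloads()
fn.define(1, lambda a: a)                          # fn(a) = {a}
fn.define(3, lambda a, b, c: c)                    # fn(a, b, c) = {c}
fn.define(2, lambda a, b, *rest: b, varargs=True)  # fn(a, b, ...) = {b}
```

With these definitions, `fn.call(4)`, `fn.call(5, 9)`, `fn.call(2, 6, 3)` and `fn.call(1, 8, 0, 7)` give 4, 9, 3 and 8, matching the `db` lines above, and `fn.call()` raises an error.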

I hope that this is doable, as it would certainly reduce some repetition and extremely macro-heavy code in some cases. As a simple realistic example, take this macro from Prism (might be slightly different because I'm typing it from memory):

;definition
coord: MACRO
  if _NARG > 3
    ld \1, (\4) + (\2) + ((\3) * SCREEN_WIDTH)
  else
    ld \1, wTilemap + (\2) + ((\3) * SCREEN_WIDTH)
  endc
ENDM

hlcoord EQUS "coord hl, "

;usage
  hlcoord 2, 5 ;sets hl to the corresponding tilemap position

Which using functions would be a lot cleaner:

;definition
coord(x, y, base) = ({base}) + ({x}) + ({y}) * SCREEN_WIDTH
coord(x, y) = coord({x}, {y}, wTilemap)

;usage
  ld hl, coord(2, 5)

I await your responses; thanks for reading.

BenHetherington commented 7 years ago

Interesting! So, if I'm right in saying, the main things that distinguish this from existing macros are:

- The ability to use them as part of an expression (like EQU/EQUS/SETs all can be)
- Named parameters
- Easier overloading (based on number of parameters)

I do like that you're going for a functional-style feel, but it bothers me that it's a very different syntax for something that's intuitively similar to macros.

Since I think the latter two features would be also good for macros to have, we could potentially unify their syntaxes, perhaps in a manner similar to this:

MyFunction: FUNCTION(foo, bar)   ; Reflects usage as MyFunction(foo, bar)
    \foo + \bar
    ENDF

MyMacro: MACRO foo, bar          ; Reflects usage as MyMacro foo, bar
    ld a, \foo + \bar            ; Similar to \1, \2
    ENDM

The syntax to use them would probably still have to differ, due to the existing difference between macros and built-in functions.

This approach makes functions look procedural, so I don't know if we should do so (and add a RETURN statement, and possibly make that implicit for the last line), or only allow a single expression (which would make this rather verbose for a one-liner).

Alternately, you could think of it as something more similar to EQU or SET. Then the MyFunction() = ... syntax could be a nicer way of doing MyFunction() FUNC ... (similar to SET).

But, in order to add a little consistency, I'd consider using \foo to access parameters, since it's closer to how you already access parameters in macros:

MyFunction(foo, bar) = \foo + \bar

You can probably tell that I'm just thinking around the issue, and haven't really made any decisions on what I like best. However we implement it though, I do think this would be a useful feature.

Thoughts?

AntonioND commented 7 years ago

I prefer the {} version because it is less ambiguous: "\foo1" vs "{foo}1" (see https://github.com/rednex/rgbds/issues/63).

In principle it doesn't sound bad. In practice, this is a lot of work. The code that handles macros is quite annoying because it simply cannot be "just parsed", you have to inject the expanded code of the macro...

EDIT: Also, more things like macros, equs, etc would make this even worse: https://github.com/rednex/rgbds/issues/64

aaaaaa123456789 commented 7 years ago

I'd say the functional style is simpler, although I can imagine a few cases in which procedural would be more powerful. That being said, procedural-style functions would require you to actually execute them, so that might increase the burden on your side.

I'm also assuming lazy evaluation here (see the linearaddress example above, which wouldn't work without it because BANK($4444) is invalid), which might be harder to achieve in a procedural way.
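
A small Python model of why lazy evaluation matters for that example. The `Label` class, the `value` helper and the `wExample` symbol (with its bank and address) are all invented for the sketch; the only thing taken from the discussion is that `BANK` needs a symbol and cannot work on a plain number like `$4444`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    # Hypothetical stand-in for an assembler symbol.
    name: str
    bank: int
    address: int


def BANK(operand):
    # Only meaningful for a symbol; a bare number carries no bank information.
    if not isinstance(operand, Label):
        raise TypeError("BANK() needs a label, not a number")
    return operand.bank


def value(operand):
    # Numeric value of an operand: a label evaluates to its address.
    return operand.address if isinstance(operand, Label) else operand


def linearaddress(label):
    # linearaddress(label) = (BANK({label}) - 1) * $4000 + ({label})
    return (BANK(label) - 1) * 0x4000 + value(label)


wExample = Label("wExample", bank=3, address=0x4444)  # made-up symbol

# Lazy: the argument is substituted as-is, so BANK still sees the label.
lazy_result = linearaddress(wExample)

# Eager: the argument is reduced to a number first, so BANK has nothing to work with.
try:
    linearaddress(value(wExample))
    eager_failed = False
except TypeError:
    eager_failed = True
```

Here `lazy_result` is `$C444`, while the eager call fails in the same way `BANK($4444)` would.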

That being said, I agree that named parameters and overloads (and even varargs) could be useful for macros, so a unified syntax sounds like a good idea. The function/endf syntax looks a bit long for functional-style one-liners, but the "= replaces func" syntax looks good. I'm not really biased towards any particular syntax, though.

#64 and the numerical lexer abuse in #63 are downright silly, and the assembler should probably just fail and exit if it detects weird behavior like that with functions. I can't see something like that appearing in real code.

By the way, I'm well aware that this is a big feature, so I don't expect it to be finished soon.

yenatch commented 6 years ago

Can named args be used on call as well? And can it be multiline?

    MyFunction (
        foo = 2,
        bar = 1
    )

ISSOtm commented 4 years ago

One thing is for sure: such functions will be "pure". No side effects; they only produce an expression. I would like to avoid the approach macros currently take (re-lexing and re-parsing a buffer), since that's pretty slow. This brings up a problem we need to solve: how do captures work?

CONSTANT = 0
DEF foo() => CONSTANT
CONSTANT = 1
db foo()

Should this:

- Output 0
- Output 1
- Be an error?
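
For illustration, the first two options can be modelled in Python with a plain dictionary as the symbol table. The helper names are hypothetical, and this says nothing about how rgbasm would implement either behaviour.

```python
symbols = {"CONSTANT": 0}  # toy symbol table


def define_by_value(name):
    # "Output 0": the symbol's value is captured when the function is defined.
    captured = symbols[name]
    return lambda: captured


def define_by_reference(name):
    # "Output 1": the symbol is looked up again each time the function is called.
    return lambda: symbols[name]


foo_early = define_by_value("CONSTANT")
foo_late = define_by_reference("CONSTANT")

symbols["CONSTANT"] = 1  # CONSTANT = 1, after the function definitions
```

After the reassignment, `foo_early()` still returns 0 while `foo_late()` returns 1.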

aaaaaa123456789 commented 4 years ago

I don't think it should be an error. As for 0 or 1, both approaches are valid: 0 is more useful, while 1 is more natural to people used to macros. I'd prefer 0. ~(or make it configurable)~

ISSOtm commented 4 years ago

If so, I'll go with 0 for a first implementation, and maybe extend with a "capture" syntax later so 1 can be produced.

meithecatte commented 4 years ago

If I'm understanding correctly, the behavior "output 1" could create dependency cycles...

ISSOtm commented 4 years ago

No, because the right-hand side is always evaluated before the left-hand side. The behavior that's been decided on is "output 0", and a syntax akin to C++'s captures may be added in the future to allow for "output 1".

pinobatch commented 4 years ago

For comparison, .define macros in ca65 behave much like #define macros in the C preprocessor, except they don't use parentheses when called.

ISSOtm commented 3 years ago

Making these work in a useful way would require #619, if only because recursive functions would require the same kind of expression system rewrite that short-circuiting operators would. Therefore, and because I think 0.4.2 has been delayed enough, this feature will be postponed (again) to 0.4.3.

ISSOtm commented 3 years ago

Lazy evaluation of symbols is important. Eager evaluation can be forced using naked braces (since #634), but lazy evaluation cannot otherwise be done.

Rangi42 commented 3 years ago

Sjasm has something similar: "text macros with arguments". (Although these are expanded as statements, not expressions, so more like macros with named parameters.)

Rangi42 commented 3 years ago

Some implementations for common math functions that aren't built into rgbasm:

def abs(x) = x < 0 ? -x : x
def abs(x) = x * (1 - 2 * (x < 0))
def abs(x) = x * sgn(x)

def sgn(x) = x < 0 ? -1 : x > 0 ? 1 : 0
def sgn(x) = (x > 0) - (x < 0)
def sgn(x) = x ? x / abs(x) : 0

def sqrt(x) = pow(x, 0.5)

Rangi42 commented 3 years ago

Functions could allow specifying their arguments by name, in case the given order doesn't match the declaration order.

def coord(x, y) = y * SCREEN_WIDTH + x

    ld hl, wTilemap + coord(y=5, x=10)
    ld hl, wTilemap + coord(10, 5)
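
Python's keyword arguments behave the same way, so they make a quick model of the idea. The `SCREEN_WIDTH` value of 20 and the row-major formula are assumptions for the sketch.

```python
SCREEN_WIDTH = 20  # assumed value, for illustration only


def coord(x, y):
    # Row-major offset into a tilemap: whole rows first, then the column.
    return y * SCREEN_WIDTH + x


positional = coord(10, 5)    # relies on the declaration order (x, then y)
by_name = coord(y=5, x=10)   # order at the call site no longer matters
swapped = coord(5, 10)       # the mistake that naming the arguments avoids
```

The first two calls give the same offset, while the swapped positional call silently gives a different one.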

ISSOtm commented 3 years ago

This would be especially helpful to clarify intent, as it's neither obvious nor widely agreed upon whether X or Y goes first (typical usage is X, Y, but e.g. OAM uses Y, X).

Rangi42 commented 2 years ago

Various useful functions:

;;; Mathematical functions
; https://en.cppreference.com/w/c/numeric/math

DEF abs(x) := x < 0 ? -x : x
DEF abs(x) := (x ^ -(x < 0)) + (x < 0) ; masking off the sign bit is not enough in two's complement
DEF abs(x) := x * sgn(x)

DEF sgn(x) := x > 0 ? 1 : x < 0 ? -1 : 0
DEF sgn(x) := (x > 0) - (x < 0)
DEF sgn(x) := (x > 0) | -(x < 0)
DEF sgn(x) := x ? x / abs(x) : 0
DEF sgn(x) := x ? abs(x) / x : 0

DEF min(x, y) := x < y ? x : y
DEF min(x, y) := y ^ ((x ^ y) & -(x < y))
DEF max(x, y) := x > y ? x : y
DEF max(x, y) := x ^ ((x ^ y) & -(x < y))

DEF clamp(v, x, y) := v < x ? x : v > y ? y : v
DEF clamp(v, x, y) := min(max(v, x), y)

DEF dim(x, y) := x > y ? x - y : y - x
DEF dim(x, y) := max(x, y) - min(x, y)

DEF square(x) := x * x
DEF cube(x) := x * x * x

;;; Fixed-point functions

DEF fsquare(x) := MUL(x, x)
DEF fcube(x) := MUL(MUL(x, x), x)
DEF sqrt(x) := POW(x, 0.5)

DEF log2(x) := LOG(x, 2.0)
DEF log10(x) := LOG(x, 10.0)

; works for any `opt Q` fixed-point precision
DEF trunc(x) := x & -1.0

;;; Macro replacement functions

; `db byte(X, Y)` instead of `dn X, Y`
DEF byte(hi, lo) := ((hi & $f) << 4) | (lo & $f)
; `ld bc, word(X, Y)` instead of `lb bc, X, Y`
DEF word(hi, lo) := ((hi & $ff) << 8) | (lo & $ff)

; `dw bigw(X)` instead of `bigdw X`
DEF bigw(x) := ((x & $ff) << 8) | ((x & $ff00) >> 8)

; rgb(31, 16, 0) == rgbhex($ff8000)
DEF rgb(rr, gg, bb) := rr | (gg << 5) | (bb << 10)
DEF rgbhex(hex) := ((hex & $f80000) >> 19) | ((hex & $f800) >> 6) | ((hex & $f8) << 7)
DEF rgbhex(hex) := rgb((hex & $ff0000) >> 19, (hex & $ff00) >> 11, (hex & $ff) >> 3)

DEF tilecoord(x, y) := wTilemap + y * SCRN_X_B + x
DEF attrcoord(x, y) := wAttrmap + y * SCRN_X_B + x
DEF bg0coord(x, y) := _SCRN0 + y * SCRN_VX_B + x
DEF bg1coord(x, y) := _SCRN1 + y * SCRN_VX_B + x

;;; Bit twiddling functions
; http://graphics.stanford.edu/~seander/bithacks.html
; https://en.wikipedia.org/wiki/Find_first_set

DEF ispowof2(x) := x && !(x & (x - 1))

DEF nextpow2(x) := POW(2.0, CEIL(LOG(x, 2.0)))

DEF parity(x) := popcount(x) & 1

DEF ilog2(x) := LOG(x * 1.0, 2.0) / 1.0 ; fails for x >= 32768
DEF ilog2(x) := bsr(x) ; bsr(x) = index of the highest set bit, i.e. 31 - clz(x)

; these require recursion support
DEF popcount(x) := x ? (1 + popcount(x & (x - 1))) : 0
DEF ctz(x) := x ? (x & 1) ? 0 : (1 + ctz(x >> 1)) : 0
DEF clz(x) := x ? ((x & $8000_0000) ? 0 : (1 + clz(x << 1))) : 0

DEF ffs(x) := 32 - clz(x & -x)
DEF ffs(x) := popcount(x ^ ~-x)
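
A few of these can be spot-checked in Python, as a sanity check rather than a proof. Since rgbasm expressions are 32-bit and Python integers are unbounded, the sketch masks to 32 bits where it matters; the helper names are just Python spellings of the definitions above.

```python
MASK32 = 0xFFFF_FFFF  # emulate 32-bit values on top of Python's unbounded ints


def min_branchless(x, y):
    # min(x, y) := y ^ ((x ^ y) & -(x < y))
    return y ^ ((x ^ y) & -(x < y))


def max_branchless(x, y):
    # max(x, y) := x ^ ((x ^ y) & -(x < y))
    return x ^ ((x ^ y) & -(x < y))


def popcount(x):
    # Clears the lowest set bit on every step of the recursion.
    x &= MASK32
    return 1 + popcount(x & (x - 1)) if x else 0


def clz(x):
    x &= MASK32
    if x == 0:
        return 0  # follows the definition above (the usual convention would be 32)
    return 0 if x & 0x8000_0000 else 1 + clz(x << 1)


def ffs_via_clz(x):
    # ffs(x) := 32 - clz(x & -x)
    return 32 - clz(x & -x)


def ffs_via_popcount(x):
    # ffs(x) := popcount(x ^ ~-x), where ~-x == x - 1
    return popcount(x ^ (x - 1))


def rgb(rr, gg, bb):
    return rr | (gg << 5) | (bb << 10)


def rgbhex(h):
    return ((h & 0xF80000) >> 19) | ((h & 0xF800) >> 6) | ((h & 0xF8) << 7)
```

Both `ffs` variants agree on every value tried here, the branchless `min`/`max` match the ternary versions for mixed-sign inputs, and `rgb(31, 16, 0)` equals `rgbhex(0xFF8000)` as the comment in the list claims.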

gbdev / rgbds

[Feature request] User-defined functions #201
