apply? - Githubissues

technomancy commented 9 years ago

I noticed apply is missing; how much work would it be to add this?

ryansuchocki commented 9 years ago

If we ignore tail-call-elimination, it is quite feasible.

First, we will need list-ref. (This should probably go into stdlib anyway?)

(define (list-ref lst n)
    (if (zero? n)
        (car lst)
        (list-ref (cdr lst) (- n 1))))

Now we may define apply as something like:

(define (apply fnc lst)
    (let ((a (arity fnc)))
        (if (= a (length lst))
            (begin
                (if (= a 0) (fnc))
                (if (= a 1) (fnc (list-ref 0 lst)))
                (if (= a 2) (fnc (list-ref 0 lst) (list-ref 1 lst)))

                ;; ... all the way up to 255 arguments!
            )
            (asm "JMP error_numargs"))))

Of course, this is going to be hideous once we've written out all 255 possible applications. (255 is the maximum arity of a (ms) program because a single byte is used to store arity in a closure.)

What we really want is for apply to build an activation frame on the stack in exactly the same way as the code generator. The assembly subroutine 'proc_call', defined in preamble.s will do the rest of the work for us.

In other words, translate lines 175-190 of codegen.c into microscheme. It should translate very directly, but there are a few conceptual hurdles to jump. I'll leave this as a challenge to you!

(hint: fprintf "some assembly" will become (asm "some assembly")...

technomancy commented 9 years ago

I'm not sure that it's as simple as translating ProcCall to Scheme--the C has a good deal of code that needs to run at compile time if I'm reading this correctly, which currently can't happen from MicroScheme. Unless you mean that all of codegen_emit should be translated to Scheme? I don't think it's feasible to call that function using the FFI since parent_numArgs and outputFile are unknown at runtime.

Maybe it would be easier to factor out the activation frame creation into its own C function and call that from the FFI? I don't know enough about the type representation to say whether there would need to be translation when values cross the FFI boundaries.

ryansuchocki commented 9 years ago

;; Activation frame layout:
;; |----------------| <- new SP
;; | ARG n          | <- new AFP
;; | ARG n-1        | 
;; |   ...          |
;; | ARG 1          |
;; | Return Address |
;; | Saved CCP      |
;; | Saved AFP      |
;; |----------------| <- Saved AFP

;; This is how the AF is represented in memory. 
;; The stack grows 'backwards' in memory, i.e. from the bottom to the top of this diagram. 
;; Using (ASM "PUSH ..."), we must build the AF from the bottom up, iterating through the given
;; list of arguments...

(define lsttmp)

(define (apply fnc lst)
    ;; Save the current AFP:
    (asm "PUSH AFPh")
    (asm "PUSH AFPl")

    ;; Save the current CCP:
    (asm "PUSH CCPh")
    (asm "PUSH CCPl")

    ;; Push the return address. (The sp_apply_ret label is defined below)
    (asm "LDI GP1, hi8(pm(sp_apply_ret))")
    (asm "PUSH GP1")
    (asm "LDI GP1, lo8(pm(sp_apply_ret))")
    (asm "PUSH GP1")

    ;; Now it would be tempting to just do (for-each push-it lst), but the for-each call would
    ;; (temporarily) put its own AF on the stack, so we have to iterate through the list
    ;; manually. (i.e., with an assembly-level loop). We can still use primitives like (null? ...)

    ;; Work with the first list element:
    (set! lsttmp lst)
    (asm "sp_loop_begin:")

        ;; Fetch the current list element, and push it
        (car lsttmp)
        (asm "PUSH CRSl")
        (asm "PUSH CRSh")

        ;; Work with the next list element:
        (set! lsttmp (cdr lsttmp))

        ;; If it's not null, continue:
        (if (not (null? lsttmp))
            (asm "RJMP sp_loop_begin"))

    ;; Evaluate the length of the list (this moves the length into CRS)
    (length lst)
    ;; Move it into the PCR. (i.e., tell the proc_call routine how many arguments were given)
    ;;   (so that it can check it against the closure, and raise an exception if different)
    (asm "MOV PCR, CRSl")

    ;; Now we evaluate the function expression itself, (moves fnc into CRS)
    fnc

    ;; Too complicated for comments :)
    (asm "IN AFPl, SPl")
    (asm "IN AFPh, SPh")
    (asm "MOVW GP5, AFPl")

    ;; Here we call proc_call, which does type/argument number checks then performs the context switch
    (asm "JMP proc_call")

    ;; Since we set the return address to sp_apply_ret, control will eventually return here:
    ;; (CRS will now contain the result of the application. We simply leave it in place.)
    (asm "sp_apply_ret:")

    ;; Restore the old CCP
    (asm "POP CCPl")
    (asm "POP CCPh")

    ;; Restore the old AFP, and return to the caller!
    (asm "POP AFPl")
    (asm "POP AFPh"))

:)

ryansuchocki commented 9 years ago

This seems to flash (pin 13) 9 times, as expected:

(define (flash pin times)
    (if (zero? times)
        #t
        (begin
            (high pin)
            (pause 250)
            (low pin)
            (pause 250)
            (flash pin (- times 1)))))

(define (+++ x y z)
    (+ (+ x y) z))

(apply flash (list 13 (apply +++ (list 2 4 3))))

technomancy commented 9 years ago

Aha; so it appears asm is enough to do what we need at compile time. This isn't really "writing apply in Scheme" so much as "writing apply in ASM in Scheme", so I feel a bit better about not being able to figure it out on my own. =)

Thanks!

technomancy commented 9 years ago

Just took a look at this and while it works with regular functions it appears not to work with call-c-func, presumably because it is a special form.

I saw the compiler refuses to use list and vector as values because they're variadic-primitive. Should call-c-func be added to that blacklist? If so does that mean we'll need a separate apply-c-func primitive?

ryansuchocki commented 9 years ago

The following are all 'fundemental forms', and have no value: define lambda let if set! begin and or free! include list vector call-c-func

Since the FFI only supports functions with up to 10 arguments (at the moment), apply-c-func could be expressed like so: (untested)

(define (apply-c-func fnc lst)
  (let ((a (length lst)))
    (if (= a 0) (call-c-func fnc))
    (if (= a 1) (call-c-func fnc (list-ref 0 lst)))
    (if (= a 2) (call-c-func fnc (list-ref 0 lst) (list-ref 1 lst)))
    (if (= a 3) (call-c-func fnc (list-ref 0 lst) (list-ref 1 lst) (list-ref 2 lst)))
    (if (= a 4) (call-c-func fnc (list-ref 0 lst) (list-ref 1 lst) (list-ref 2 lst) 
        (list-ref 3 lst)))
    ;; And so on, up to 10
    (if (> a 10) (error))))

You might be able to get something like that working, but it's too fragile to provide it in the standard lib.

One neat solution would be to introduce proper variadic procedures into microscheme. Variadic functions are really syntactic sugar for functions accepting (boundless) lists of arguments. They were originally left out because they add quite a lot of complexity to the compiler, and you can achieve the same computation by explicitly passing around lists of things... (This is one of the key places where (ms) is a subset of scheme.)

Another solution would be to stick with the (ms) philosophy of passing around lists, even across the FFI, and write C wrappers that expect lists. This would not be very difficult. (listsum in ffitest.c shows how to recurse through a Scheme list from the C side...)

ryansuchocki commented 9 years ago

In fact, passing vectors across the FFI would be even better. (Because iteration is more natural than recursion in C.) See vectsum in ffitest.c.

technomancy commented 9 years ago

In this case modifying my C function to take a list or vector seems like the way to go, yeah.

The following are all 'fundemental forms', and have no value:

Seems like the error reporting could be improved on this. In parser.c there's that error message when you try to take the value of list and vector--should this be expanded to check for all fundamental forms?

Apart from that I think we can close this issue out. Do you want to add the apply implementation in the snippet above? I could submit it as a PR if you like too if you're busy but I thought it'd be good to have proper attribution.

ryansuchocki commented 9 years ago

(apply) pushed now...

You're right, the error reporting is inconsistent. The real cause is that list, vector and call-c-func are implemented, within the compiler, as primitives rather than fundamentals.

Once that's changed, a new case "case Keyword:" should be added to the outermost switch in the parser, giving a helpful error saying "fundemental form has no value". This should gracefully catch all the appropriate cases...

Feel free to give it a shot, as I'm tied up for a couple of days.

technomancy commented 9 years ago

You're right, the error reporting is inconsistent. The real cause is that list, vector and call-c-func are implemented, within the compiler, as primitives rather than fundamentals.

Can you elaborate on the difference? I thought "fundamental" was a synonym for "special form" and "primitive", meaning simply "implemented in C rather than Scheme". But it seems there's a further distinction?

ryansuchocki commented 9 years ago

Ah ok. This is a bit rushed, but I hope it helps:

Procedures (including primitive ones) are always applied with the form (procedure args...), and the arguments are always call-by-value, evaluated left-to-right. Every argument is evaluated, even if it's not used. Importantly, every procedure, whether built-in or user defined) is or has a value, i.e. it's a `first-class citizen'.

Primitive procedures are generally built-in, and implemented in C/assembly. Built-in procedures implemented in Scheme are sometimes called 'library procedures', but this distinction isn't really important.

Fundamental forms are not procedures, and their syntax and semantics are much more flexible. Consider the form (if x a b). The expression a is only evaluated if the condition x is true. If if were a procedure, then x a b would be its arguments, and they would all be evaluated before the if code even begins... In fact, if is a keyword which appears in the (if x a b) form, and it has no value. If you think about lambda: its semantics are even further from that of a procedure taking some arguments...

Note that this is true in all Schemes. If you try to take the value of if in mit-scheme, you get "Syntactic keyword may not be used as an expression". Racket just says "bad syntax". Guile gives the rather cryptic "unknown location: source expression failed to match any pattern in form if".

For Microscheme in particular, we can have variadic fundamental forms such as list, vector and call-c-func, because the code generator can generate different code depending on the number of "arguments". But we can't have variadic procedures because the runtime system is built around every procedure expecting a fixed number of arguments... This is why they ought to be fundamental forms.

This distinction is why Scheme is sometimes considered a "small language", and why it's particularly easy to parse... Once you've implemented around 10 fundamental forms, everything else (even arithmetic and logic) is a special case of the 'procedure' form. Contrast this with C/Java (which have hundreds of keywords) or Haskell (which has many complicated syntax variations).

technomancy commented 9 years ago

I see--I'm just used to different terminology. I call them "special forms" instead of "fundamentals". Also I'm used to special forms only being used to implement macro-like things, but I guess if you don't have varargs for procedures it does make sense to implement list and vector that way. Thanks for the clarification.

I think we can close this out now that apply is in master. Thanks!

ryansuchocki / microscheme

apply? #11