plum-umd / the-838e-compiler

Compiler for CMSC 838E
2 stars 0 forks source link

Draft of support for variable arity functions #29

Closed sroy4899 closed 3 years ago

sroy4899 commented 3 years ago

Creating this PR kind of as a check-in to the work I've been doing on supporting variable arity functions. (still haven't assigned myself to the issue, because still unsure of if I'll be able to pull it off amidst my other work before Tuesday)

It's obviously failing some tests now, but I thought I'd open this PR to have a discussion about some implementation details.

I did some thinking about how to modify compile-define for the new Defn* variant, and from what we discussed in lecture on Tuesday (?), I think the right idea is to pop things off the stack one by one until you reach the number of extra arguments (args_total - args_necessary), populate a list, and then push that list onto the stack such that it occupies the same location in the stack that the first extra argument would have originally occupied, so when the function call happens, it'll associate the list parameter as the last parameter. (I hope that makes sense and is the right idea)

This PR includes a rough, initial implementation of what I just said above. I think it's mostly right, but I can't seem to get it to work because I think I'm getting lost in the sauce somewhere with how the stack is populated, so I currently get a bus error on some of the local tests I've written. I figured it'd probably be more efficient to create a PR so I could either clear up some misunderstandings and give it another go or give someone else a (wrong, but hopefully workable) baseline.

dvanhorn commented 3 years ago

I think your proposal makes sense although it will do a lot of churning on the stack to re-arrange things.

It might be worth exploring a different strategy where arguments are pushed on the stack in reverse order. So for

(define (f x y z) ...)

the value of z will be the top frame, y will be second, x third. Making this change should just require flipping the variable order in the static environment and changing the definition of compile-app to put things on the stack in a different order. (There are two ways to accomplish this with compile-app; evaluate the args in reverse order and push them as you go, or keep the current evaluation order and compute offsets of rsp for where they should go. Since we should be consistent with left-to-right order Racket uses, we should do the latter.)

With this set up, I think variable arity will be easier and more efficient.

Let's say you have a function

(define (f x y . z) ...)

then the body is compiled in a static environment with 3 variables (z, y, x). Just before returning the function should pop three elements of the stack.

When the function is called, the function should check the actual number of arguments is >= 2 and then fix up the stack to be consistent with the static view of 3 elements.

If it's actual arguments = 2, push an empty list on the stack and proceed.

If it's more than 2, pop all but 2 off the stack, make a list, push the pointer and proceed. Since the elements are in reverse order, it's actually set up pretty well to construct the list.

dvanhorn commented 3 years ago

I took another look at the code and realized that we're already pushing arguments on in reverse order (in the sense that the first argument is the furthest up the stack).

sroy4899 commented 3 years ago

Yeah, in 430 we did arguments in reverse order (at least when I took it). I guess I assumed in 838 we reverted that for some reason, but I haven't had a chance to return to this since this morning (and probably still won't until 6) so couldn't say for certain. So I guess without that needing to be done, I'll give your other comments a try and see if I can get things to work out. 🤷

sroy4899 commented 3 years ago

OK. I spent most of my time this weekend that I reserved for this issue delta debugging, putting in random print statements, etc, but I honestly can't seem to find anything wrong with the code, other than the fact that it of course segfaults.
The motivating example I've been using is va.rkt:

#lang racket
(begin  
(define (lst x y . xs) x) 
(lst 1 2 3))

for which the code on this branch generates the following assembly va.s (to which I then added some comments):

        global _entry
        default rel
        section .text
        extern _raise_error
_entry:
        mov rbx, rdi
        mov rax, 16                 ; first arg: 1
        push rax
        mov rax, 32                 ; second arg: 2
        push rax
        mov rax, 48                 ; third arg: 3 
        push rax          
        mov rcx, 48                ; arity checking - 3 args were provided

        call _label_lst_3d7029495b8c04

        add rsp, 24
        mov rdx, rbx
        ret
_label_lst_3d7029495b8c04:
        cmp rcx, 32                 ; arity checking, at least 2 args needed
        jl _raise_error
        mov r8, rcx                  ; r8 holds (imm->bits 3)
        sub r8, 32                    ; r8 now holds (imm->bits 1) i.e. pop 1 arg off stack and plop into list
        mov rax, 152               ; rax holds the empty list
        mov r9, 0                     ; r9 helps us loop
_loop1235:
        cmp r9, r8                   
        je _end1236
        mov [rbx + 0], rax       ; recursively put cdr(lst) into [rbx]
        pop rax                        ; pop off stack and put it at [rbx + 8]
        mov [rbx + 8], rax      
        mov rax, rbx
        or rax, 2                        ; indicate rax is a cons-type
        add rbx, 16 
        add r9, 16
        jmp _loop1235
_end1236:
        push rax                         ; push our new list onto the stack
        mov rax, [rsp + 24]       ; locate the first argument by going 3 words up the stack
        ret

I've done many iterations of generating a86 for the above code, and running it through asm-interp after taking out bits and pieces of it, and I think (though this may not be true), that pushing to the stack after compiling this list is one of the things that causes it to segfault without fail, leading me to believe that the problem is with my list generation, which in turn leads me to believe I must have some skewed concept of how the memory is arranged / how stack frames are aligned.

dvanhorn commented 3 years ago

Thanks for all your efforts Shilpa. I know how hard it can be to debug these stack changing operations. Let me have a look and see if I can get to the bottom of it.

sroy4899 commented 3 years ago

Sorry, just got to this now - wasn't trying to give you extra work! I just wanted to get a feel for why the logic in the compiler wasn't working (like in the example I posted above). Although now that I see the commits, maybe the merging from main and fixing the syntax error in interp did the trick?

dvanhorn commented 3 years ago

It's fine -- I don't mind working on it.

Here's what's going on: compared to 430 the calling convention has changed a bit. When you're doing a call, you push a bunch of things on the stack, then you Call. The Callee sees the arguments followed by the return pointer; hence (Offset rsp 0) is the return address, but you're thinking it's the last argument, which is actually at (Offset rsp 8).

I am thinking of changing the calling convention so that the return point goes before the arguments; this makes compiling calls a little more involved, but makes compiling function definitions a little simpler.

In the meantime, I believe this will fix your issue: at the outset of the stack fixing, pop the top frame to a temporary register (this will hold the return address. Now do your code to package things up in a list, popping off elements. When done, push the return address back on.

sroy4899 commented 3 years ago

Wow. Yes, that absolutely did the trick. Just doing an extra pop/push combo fixed it. Thank you so much! If you're planning on doing this refactor, should I add tests and actually push these changes up or sort of hold off?

BTW, I think other than this misunderstanding, the current calling convention makes sense to me, and forgetting the return address was an oversight on my end, so I hope this refactor you have in mind isn't a result of this discussion.

dvanhorn commented 3 years ago

Yay! Yes, please go ahead and add tests; that way if I do make a change I can be sure not to break anything.

Thinking about the change is a result of thinking about how tail calls will work in the current set up. One thing that will need to happen is have callees pop arguments off before returning (currently the caller pops after the function returns). That will be a little more natural if the rp is above the arguments.

Actually now that I think about it, caller-popping is probably inconsistent with variable arity functions because the callee pops some arguments off to make this list, which screws up the caller... Anyway, push some test and I bet I can cook up one that breaks and will require moving to to callee-pops.

sroy4899 commented 3 years ago

Alright, so when I said that one extra pop/push worked, I kind of lied. Turns out, it appeared to work because my test was basic, and adding new tests introduced a vulnerability. Namely, that if I have 4 "extra" arguments, I pop 4 of them off of the stack, but only push 1. In my initial dummy example, I popped one extra argument and pushed one (the list), and so the error didn't present itself, but when you increase the number of varargs, 'rsp is now all out of wack because of the additional pops without a corresponding push.

Due to that, to make things work out, you need this really ugly secondary loop that will subtract from rsp so that rsp still points to where it should. With this, all the tests work. The only problem is somehow due to that change test/compile.rkt is failing (i.e. one test fail). Instead of modifying rsp (which I thought might be illegal), I also tried simply pushing garbage onto the stack to make up for the lost places that we popped, but test/compile.rkt still had a non-zero exit, so decided to push and ask if you know why it might be failing since I couldn't really parse exactly what was happening in test/compile.rkt.

dvanhorn commented 3 years ago

I changed the way compile-app and compile-define work so that the callee is responsible for popping arguments off the stack before returning. After that change, fix up what you had written was pretty easy.

Now that callees pop arguments, variable arity is pretty simple. If the function takes n mandatory arguments and is given m arguments, check that m >= n and pop m-n args off, constructing a list, then push the list back on, so the list has n+1 args on the stack. Before returning, pop n+1 args off.

I think this is ready to merge in to main now.