Closed sroy4899 closed 3 years ago
I think your proposal makes sense although it will do a lot of churning on the stack to re-arrange things.
It might be worth exploring a different strategy where arguments are pushed on the stack in reverse order. So for
(define (f x y z) ...)
the value of z
will be the top frame, y
will be second, x
third. Making this change should just require flipping the variable order in the static environment and changing the definition of compile-app
to put things on the stack in a different order. (There are two ways to accomplish this with compile-app
; evaluate the args in reverse order and push them as you go, or keep the current evaluation order and compute offsets of rsp for where they should go. Since we should be consistent with left-to-right order Racket uses, we should do the latter.)
With this set up, I think variable arity will be easier and more efficient.
Let's say you have a function
(define (f x y . z) ...)
then the body is compiled in a static environment with 3 variables (z, y, x). Just before returning the function should pop three elements of the stack.
When the function is called, the function should check the actual number of arguments is >= 2 and then fix up the stack to be consistent with the static view of 3 elements.
If it's actual arguments = 2, push an empty list on the stack and proceed.
If it's more than 2, pop all but 2 off the stack, make a list, push the pointer and proceed. Since the elements are in reverse order, it's actually set up pretty well to construct the list.
I took another look at the code and realized that we're already pushing arguments on in reverse order (in the sense that the first argument is the furthest up the stack).
Yeah, in 430 we did arguments in reverse order (at least when I took it). I guess I assumed in 838 we reverted that for some reason, but I haven't had a chance to return to this since this morning (and probably still won't until 6) so couldn't say for certain. So I guess without that needing to be done, I'll give your other comments a try and see if I can get things to work out. 🤷
OK. I spent most of my time this weekend that I reserved for this issue delta debugging, putting in random print statements, etc, but I honestly can't seem to find anything wrong with the code, other than the fact that it of course segfaults.
The motivating example I've been using is va.rkt
:
#lang racket
(begin
(define (lst x y . xs) x)
(lst 1 2 3))
for which the code on this branch generates the following assembly va.s
(to which I then added some comments):
global _entry
default rel
section .text
extern _raise_error
_entry:
mov rbx, rdi
mov rax, 16 ; first arg: 1
push rax
mov rax, 32 ; second arg: 2
push rax
mov rax, 48 ; third arg: 3
push rax
mov rcx, 48 ; arity checking - 3 args were provided
call _label_lst_3d7029495b8c04
add rsp, 24
mov rdx, rbx
ret
_label_lst_3d7029495b8c04:
cmp rcx, 32 ; arity checking, at least 2 args needed
jl _raise_error
mov r8, rcx ; r8 holds (imm->bits 3)
sub r8, 32 ; r8 now holds (imm->bits 1) i.e. pop 1 arg off stack and plop into list
mov rax, 152 ; rax holds the empty list
mov r9, 0 ; r9 helps us loop
_loop1235:
cmp r9, r8
je _end1236
mov [rbx + 0], rax ; recursively put cdr(lst) into [rbx]
pop rax ; pop off stack and put it at [rbx + 8]
mov [rbx + 8], rax
mov rax, rbx
or rax, 2 ; indicate rax is a cons-type
add rbx, 16
add r9, 16
jmp _loop1235
_end1236:
push rax ; push our new list onto the stack
mov rax, [rsp + 24] ; locate the first argument by going 3 words up the stack
ret
I've done many iterations of generating a86 for the above code, and running it through asm-interp
after taking out bits and pieces of it, and I think (though this may not be true), that pushing to the stack after compiling this list is one of the things that causes it to segfault without fail, leading me to believe that the problem is with my list generation, which in turn leads me to believe I must have some skewed concept of how the memory is arranged / how stack frames are aligned.
Thanks for all your efforts Shilpa. I know how hard it can be to debug these stack changing operations. Let me have a look and see if I can get to the bottom of it.
Sorry, just got to this now - wasn't trying to give you extra work! I just wanted to get a feel for why the logic in the compiler wasn't working (like in the example I posted above). Although now that I see the commits, maybe the merging from main and fixing the syntax error in interp did the trick?
It's fine -- I don't mind working on it.
Here's what's going on: compared to 430 the calling convention has changed a bit. When you're doing a call, you push a bunch of things on the stack, then you Call. The Callee sees the arguments followed by the return pointer; hence (Offset rsp 0)
is the return address, but you're thinking it's the last argument, which is actually at (Offset rsp 8)
.
I am thinking of changing the calling convention so that the return point goes before the arguments; this makes compiling calls a little more involved, but makes compiling function definitions a little simpler.
In the meantime, I believe this will fix your issue: at the outset of the stack fixing, pop the top frame to a temporary register (this will hold the return address. Now do your code to package things up in a list, popping off elements. When done, push the return address back on.
Wow. Yes, that absolutely did the trick. Just doing an extra pop/push combo fixed it. Thank you so much! If you're planning on doing this refactor, should I add tests and actually push these changes up or sort of hold off?
BTW, I think other than this misunderstanding, the current calling convention makes sense to me, and forgetting the return address was an oversight on my end, so I hope this refactor you have in mind isn't a result of this discussion.
Yay! Yes, please go ahead and add tests; that way if I do make a change I can be sure not to break anything.
Thinking about the change is a result of thinking about how tail calls will work in the current set up. One thing that will need to happen is have callees pop arguments off before returning (currently the caller pops after the function returns). That will be a little more natural if the rp is above the arguments.
Actually now that I think about it, caller-popping is probably inconsistent with variable arity functions because the callee pops some arguments off to make this list, which screws up the caller... Anyway, push some test and I bet I can cook up one that breaks and will require moving to to callee-pops.
Alright, so when I said that one extra pop/push worked, I kind of lied. Turns out, it appeared to work because my test was basic, and adding new tests introduced a vulnerability. Namely, that if I have 4 "extra" arguments, I pop 4 of them off of the stack, but only push 1. In my initial dummy example, I popped one extra argument and pushed one (the list), and so the error didn't present itself, but when you increase the number of varargs, 'rsp
is now all out of wack because of the additional pops without a corresponding push.
Due to that, to make things work out, you need this really ugly secondary loop that will subtract from rsp so that rsp still points to where it should. With this, all the tests work. The only problem is somehow due to that change test/compile.rkt
is failing (i.e. one test fail). Instead of modifying rsp
(which I thought might be illegal), I also tried simply pushing garbage onto the stack to make up for the lost places that we popped, but test/compile.rkt
still had a non-zero exit, so decided to push and ask if you know why it might be failing since I couldn't really parse exactly what was happening in test/compile.rkt
.
I changed the way compile-app
and compile-define
work so that the callee is responsible for popping arguments off the stack before returning. After that change, fix up what you had written was pretty easy.
Now that callees pop arguments, variable arity is pretty simple. If the function takes n mandatory arguments and is given m arguments, check that m >= n and pop m-n args off, constructing a list, then push the list back on, so the list has n+1 args on the stack. Before returning, pop n+1 args off.
I think this is ready to merge in to main now.
Creating this PR kind of as a check-in to the work I've been doing on supporting variable arity functions. (still haven't assigned myself to the issue, because still unsure of if I'll be able to pull it off amidst my other work before Tuesday)
It's obviously failing some tests now, but I thought I'd open this PR to have a discussion about some implementation details.
I did some thinking about how to modify
compile-define
for the newDefn*
variant, and from what we discussed in lecture on Tuesday (?), I think the right idea is to pop things off the stack one by one until you reach the number of extra arguments (args_total - args_necessary
), populate a list, and then push that list onto the stack such that it occupies the same location in the stack that the first extra argument would have originally occupied, so when the function call happens, it'll associate the list parameter as the last parameter. (I hope that makes sense and is the right idea)This PR includes a rough, initial implementation of what I just said above. I think it's mostly right, but I can't seem to get it to work because I think I'm getting lost in the sauce somewhere with how the stack is populated, so I currently get a bus error on some of the local tests I've written. I figured it'd probably be more efficient to create a PR so I could either clear up some misunderstandings and give it another go or give someone else a (wrong, but hopefully workable) baseline.