pyston / pyston_v1

The previous version of Pyston, a faster implementation of the Python programming language. Please use this link for the new repository:
https://github.com/pyston/pyston/

Problem with default args. #94

Closed kkszysiu closed 10 years ago

kkszysiu commented 10 years ago

I'm currently implementing count for strings: the full version, including the optional start and end arguments.

So I prepared this method:

Box* strCount(BoxedString* self, Box* elt, Box* startbox, Box* endbox) {
    assert(self->cls == str_cls);

    printf("startbox: %li\n", static_cast<BoxedInt*>(startbox)->n);
    printf("endbox: %li\n", static_cast<BoxedInt*>(endbox)->n);
}

And defined it this way:

    str_cls->giveAttr("count", new BoxedFunction(boxRTFunction((void*)strCount, BOXED_INT, 4, 2, true, false), { boxInt(0), boxInt(0) }));

Then I'm calling it from the REPL:

var = 'aaa'
assert var.count('a') == 3

Sadly, output is always:

startbox: 0
endbox: 140734048053600

For:

var = 'aaa'
assert var.count('a', 10, 1) == 1

Output is:

startbox: 10
endbox: 140734557438864

It seems there's a problem when a function takes more than 3 arguments. Any idea what could be wrong?

undingen commented 10 years ago

I think the fourth argument has to be a Box**. It's an array of additional arguments. Try:

Box* strCount(BoxedString* self, Box* elt, Box* startbox, Box** endbox) {
    assert(self->cls == str_cls);

    printf("startbox: %li\n", static_cast<BoxedInt*>(startbox)->n);
    printf("endbox: %li\n", static_cast<BoxedInt*>(endbox[0])->n);
}

kkszysiu commented 10 years ago

Ok, implemented it using Box**.

kmod commented 10 years ago

Ah, you've stumbled into the wonderful world of calling conventions in Pyston. Currently, the C-level calling convention for Python functions is

Box* func(Box* arg0, Box* arg1, Box* arg2, Box** other_args);

i.e., the first three arguments are passed as normal C arguments, which on x86_64 map to registers, and the rest of the arguments are passed in an array (typically located on the stack, but it's up to the caller), which is passed as the fourth argument.
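To make the convention concrete, here is a minimal sketch of a four-argument function written against it, together with a caller that packs the fourth argument into an array. The Box, BoxedInt and BoxedString types below are simplified stand-ins, and strCountSketch, clampIndex and callStrCountSketch are hypothetical names chosen for illustration; none of this is Pyston's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for Pyston's boxed types; the real classes carry more state.
struct Box {
    virtual ~Box() = default;
};
struct BoxedInt : Box {
    long n;
    explicit BoxedInt(long v) : n(v) {}
};
struct BoxedString : Box {
    std::string s;
    explicit BoxedString(const std::string& v) : s(v) {}
};

// Clamp an index into [0, size]. Python's negative-index handling is omitted.
static std::size_t clampIndex(long i, std::size_t size) {
    if (i < 0) return 0;
    std::size_t u = static_cast<std::size_t>(i);
    return u > size ? size : u;
}

// Python-level signature: count(self, elt, start, end).
// The first three arguments arrive as ordinary C arguments;
// the fourth Python-level argument is other_args[0].
Box* strCountSketch(Box* self, Box* elt, Box* start, Box** other_args) {
    const std::string& s = static_cast<BoxedString*>(self)->s;
    const std::string& needle = static_cast<BoxedString*>(elt)->s;
    assert(!needle.empty());  // empty needles are out of scope for this sketch
    std::size_t pos = clampIndex(static_cast<BoxedInt*>(start)->n, s.size());
    std::size_t end = clampIndex(static_cast<BoxedInt*>(other_args[0])->n, s.size());

    long count = 0;
    // Count non-overlapping matches that end at or before `end`.
    while ((pos = s.find(needle, pos)) != std::string::npos && pos + needle.size() <= end) {
        ++count;
        pos += needle.size();
    }
    return new BoxedInt(count);  // the caller owns the result in this sketch
}

// Caller side: pack the extra argument into an array and pass it fourth.
long callStrCountSketch(const std::string& text, const std::string& elt, long start, long end) {
    BoxedString self(text), needle(elt);
    BoxedInt startBox(start), endBox(end);
    Box* extra[] = { &endBox };
    Box* result = strCountSketch(&self, &needle, &startBox, extra);
    long n = static_cast<BoxedInt*>(result)->n;
    delete result;
    return n;
}
```

Declaring the fourth parameter as Box* instead of Box** would make the callee treat the address of the array as if it were a boxed integer, which would explain the large garbage values printed for endbox earlier in the thread.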

Why does it work this way? One potential alternative is to simply pass all Python-level arguments as C-level arguments; the C calling convention already specifies that some number of arguments are passed in registers and then the rest are passed on the stack, which is pretty much what we're doing. I don't remember the exact reason that I started off doing it this way, but one benefit is that it makes it possible for C code to call Python functions with arbitrary signatures. That is, in C there's no way to call another C function that takes N arguments if N isn't known at compile time. If you put another calling convention on top, then it becomes feasible; another option is to use assembly, which I think should be able to handle it.
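The "arbitrary signatures" benefit can be sketched as a small dispatcher: because every function shares one four-parameter C signature, a caller that only learns the argument count at runtime can still make the call. The Box struct, the BoxedFn alias, and the callWithRuntimeArgs, sumFive and demoSumFive functions are all hypothetical names for this illustration, not Pyston APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Box { long n; };  // hypothetical minimal box holding an integer

// The single C-level signature shared by all functions under this convention.
using BoxedFn = Box* (*)(Box*, Box*, Box*, Box**);

// Call f with a number of arguments known only at runtime:
// the first three go in ordinary parameters, the rest in an array.
Box* callWithRuntimeArgs(BoxedFn f, std::vector<Box*>& args) {
    Box* a0 = args.size() > 0 ? args[0] : nullptr;
    Box* a1 = args.size() > 1 ? args[1] : nullptr;
    Box* a2 = args.size() > 2 ? args[2] : nullptr;
    Box** rest = args.size() > 3 ? args.data() + 3 : nullptr;
    return f(a0, a1, a2, rest);
}

// A callee that takes five Python-level arguments.
Box* sumFive(Box* a, Box* b, Box* c, Box** rest) {
    return new Box{ a->n + b->n + c->n + rest[0]->n + rest[1]->n };
}

// Builds five boxed arguments and dispatches through the generic caller.
long demoSumFive() {
    Box boxes[] = { {1}, {2}, {3}, {4}, {5} };
    std::vector<Box*> args;
    for (Box& b : boxes) args.push_back(&b);
    Box* result = callWithRuntimeArgs(sumFive, args);
    long n = result->n;
    delete result;  // the caller owns the result in this sketch
    return n;
}
```

The dispatcher never needs to know how many parameters the callee declares, which is the property that plain C function calls lack.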

As a point of comparison, I think both CPython and MicroPython pass all arguments as a single PyTuple / C array, which is conceptually simpler. I wanted to use a calling convention that would allow some arguments to be passed in registers, which makes things more complicated, but it means that functions with at most 3 parameters should be much faster.

undingen commented 10 years ago

I think theoretically we could make all functions have a "foo(int countArgs, ...)" signature and use the normal varargs calling convention. I think that 64bit x86 linux and windows both pass the first 4 arguments in registers and then the rest on the stack. (But I would have to read up on it; I'm not 100% sure anymore.) But I'm also not sure if this would give us any advantage.

undingen commented 10 years ago

It probably only has disadvantages for us (difficult access to the arguments). I just wanted to mention it :-D

kmod commented 10 years ago

Well, the issue is that although in C/C++ you can write a function that takes a variable number of arguments, you can't write a function call that passes a variable number of arguments. One typical way to get around this limitation is to pass a va_list (ex: vprintf), which I would guess is just a pointer; I think this ends up being similar to our current approach.
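The va_list workaround mentioned here follows the same pattern as printf and vprintf: a variadic front end starts the argument list and hands it to a "v" variant that does the real work. A minimal sketch, assuming integer arguments only; sumInts and vsumInts are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstdarg>

// The "v" variant takes an already-started va_list, like vprintf does.
long vsumInts(int count, va_list ap) {
    long total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);
    return total;
}

// The variadic front end: the number of arguments is still fixed
// at each call site when the code is compiled.
long sumInts(int count, ...) {
    va_list ap;
    va_start(ap, count);
    long total = vsumInts(count, ap);
    va_end(ap);
    return total;
}
```

Note that standard C and C++ offer no portable way to build a va_list from a runtime-sized collection, so code that assembles arguments dynamically still ends up passing an explicit array, as the Box** parameter does.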

undingen commented 10 years ago

Ah, you are totally right! I only thought about accessing the arguments in the called function but not how one can create the arguments at runtime.