dibyendumajumdar / ravi

Ravi is a dialect of Lua, featuring limited optional static typing, JIT and AOT compilers
http://ravilang.github.io/
Other
1.16k stars 60 forks source link

Ravi evolution proposal #223

Open snoopcatt opened 3 years ago

snoopcatt commented 3 years ago

Hello! We have some proposals about Ravi development & support.

Our team wish to use Ravi in production. So, if you need some our help & contributing (patches, bounties, etc.) we need to talk with you somewhere ☺

If interested, please, share some your contact.

dibyendumajumdar commented 3 years ago

Hi - thank you for your interest. I can't accept bounties but if you would like to contribute patches/improvements, will be happy to consider them.

snoopcatt commented 3 years ago

Understood, thank you for your reply. Do you have any roadmap and/or future release plans? Planned/desired features maybe? And in general, what are your plans for this project?

I've researched your project within a few weeks, and I have some questions and proposals.


MIR JIT

Why do you choose MIR JIT? It lacks MIPS architecture support. Running on ARM64 is a problem too. It does not look ready at all.

Also, why we have to compile functions manually? Why not to do it at start, as LuaJIT does?

Performance

Ravi is 3-5 times slower than vanilla Lua 5.3, and about 10 times slower than LuaJIT. I'm talking about real use, not synthetic benchmarks. Networking, in my case (HTTP server, PostgreSQL client). Of course, code was the same for all interpreters and static typing was not used. JIT was enabled too.

Yes, Ravi still leaves far, far behind all smoothie technologies such as Python (with asyncio) and V8 (NodeJS). But boost at least to vanilla Lua 5.3 level will be delightful ☺

Static typing

Static typing is great thing, and you have done great job implementing this. However, there is a lot to improve here.

Some types (such as string or closure) allows variable to be nil.

This leads to the fact that we need additional checks/asserts in a function body.

There is no boolean type, which is very useful.

There was, as I can see within commented code, but you dropped this out? Why? Is there a some caveat here?

Function arguments can not be optional.

It means that if we need type checking on function arguments, they can not be optional (or nil). Moreover, string, closure and userdata can be optional, but integer, number and table — not. Confusing a lot.

Useful (but not mandatory) features for typing system


I'm looking forward to your reply. I almost finished my work on typing system misbehaviour, but I'm not sure that we have same vision how it should work. Also I'm not sure in my code and implementation quality (I had to add dozens opcodes) and it of course will break backward compability.

Accordingly to that, I wouldn't open PR now, but in next my post I will attach a patch with explainations and some code examples with new typing system behaviour.

Thank you for your time! ♥

snoopcatt commented 3 years ago

As I said before, I've done some work to change some confusing behaviour of typing system (and improve it a bit).

Function arguments with types string, closure, table, userdata can not be nil.

Example:

local function hello(a: string)
    print(a, ravitype(a))
end

hello("world")
hello()

Current behaviour: hello string nil nil

New behaviour: world string ravi: test_string.ravi:0: string expected

Function arguments now may be optional.

Each argument type (except integer[] and number[]) now may be optional. Just append ? (question mark) to type name.
Example:

local function sum(a: integer, b: integer, c: integer?)
    return a + b + (c or 0)
end

print(sum(1, 2, 3))
print(sum(1,2))
print(sum(1))

Output: 6 3 ravi: test_optional.ravi:0: TOINT: integer expected

Added boolean type.

Example:

local function hello(b: boolean)
        print(b, ravitype((b))
end

hello(true) -- `true, boolean`
hello("world") -- error: ravi: test_boolean.ravi:0: boolean expected

So, here is the patch

5359ca1d5ff50204e29ff594c271d55973f9f646.patch.txt

Implementation may be not correct at all, I'm sure that it can be better. But before rigorous testing and trying to optimize it some way, I need to know your opinion about this things.

Of course, it will break backward compatibility (because currently function accepts arg: string even if it is nil), but IMO modified behaviour is more predictable and understandable. TypedLua also works the same way.


Will look forward for your response! Thank you for attention.

dibyendumajumdar commented 3 years ago

Hi thank you for above. I will respond later when I have time.

In general, please keep PRs separate by each feature - and also ensure you have run tests/added tests.

One thing worth noting is we cannot allow NIL for integer/number types or table/integer[]/number[] types because that will mess up the JIT

snoopcatt commented 3 years ago

One thing worth noting is we cannot allow NIL for integer/number types or table/integer[]/number[] types because that will mess up the JIT

Yea, agree with you about table/integer[]/number[]. But what's wrong with integer/number types?

I made some quick tests, all looks good.

ravi.jit(true)

function sum(a: integer, b: integer, c: integer?)
        return a + b + (c or 0)
end

ravi.compile(sum)

for i=1,10000000 do
        assert(sum(10, 20), 30)
        assert(sum(10, 20, 70), 100)
end

print 'done'

What do you mean about messing up the JIT?

dibyendumajumdar commented 3 years ago

ravi generates special bytecodes for integer/number types when annotated You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

dibyendumajumdar commented 3 years ago

Also new compiler will allocate on stack

snoopcatt commented 3 years ago

ravi generates special bytecodes for integer/number types when annotated You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

Still not understanding how it theoretically can affect runtime when using optional types. By the way, there are extra opcodes for optional types with nil-checking , like

// optional types
vmcase(OP_RAVI_TOINT_NIL) {
lua_Integer j;
if (RAVI_LIKELY(tointeger(ra, &j))) { setivalue(ra, j); }
else if(!ttisnil(ra))
  luaG_runerror(L, "TOINT: integer expected");
vmbreak;
}

Can you, please, explain me the main danger with some example code? Or it is just about performance?

Optional typed arguments is very important thing, without it it would be hard to implement some kind of applications.

dibyendumajumdar commented 3 years ago

There are two distinct goals for type annotations.

Ravi's implementation is a hybrid. Some types are for performance. Others are just for argument / return type checking - but they do not help performance

To support optional integer/number would need the compiler to revert to unoptimized type as NIL is not a valid value

snoopcatt commented 3 years ago

To support optional integer/number would need the compiler to revert to unoptimized type as NIL is not a valid value

Can you clarify where to look? Commit number/whatever.

I also still do not understand what exactly wrong with my implementation. Optional integers/numbers are working well, performance is the same as with mandatory types (as in current upstream revision).

Are there any potential critical bugs may hide? Such as crashes, memory corruption, et cetera.

XmiliaH commented 3 years ago

ravi generates special bytecodes for integer/number types when annotated You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

Note that this is currently handled correctly.

The following code produces an int|nil value with (n and a or nil). The following ADD opcode is not optimized as the value can be nil.

ravi.jit(true)

function sum(n, a: integer, b: integer)
        return (n and a or nil) + b
end

ravi.dumplua(sum)
snoopcatt commented 3 years ago
ravi.jit(true)

function sum(n, a: integer, b: integer)
        return (n and a or nil) + b
end

ravi.dumplua(sum)

I think this code is not correct at all. If n is not specified, you will try to sum b and nil.

Slightly modified code:

ravi.jit(true)

function sum(n: integer?, a: integer, b: integer)
    return a + b + (n or 0)
end

-- ravi.dumplua(sum)

print(sum(nil, 1, 2))
print(sum(1, 2, 3))
print(sum(1, 2))

works and returns 3 6 ravi: jit.lua:0: TOINT: integer expected

But I have no idea on bytecode, I don't know how to read it.

XmiliaH commented 3 years ago

If n is not specified, you will try to sum b and nil.

That was intended to show that in such cases the unoptimized ADD instruction is used.

It should show that

function sum(a: integer?, b: integer)
        return a + b
end

is not using the RAVI_ADD_II but the generic ADD.

snoopcatt commented 3 years ago

I tried, it seems my code producing ADD_II but your code producing ADD.

ravi.jit(true)

function sum(a: integer, b: integer)
    return a+b
end

function sum2(n: integer?, a: integer, b: integer)
    return a + b + (n or 0)
end

function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end

print('--------------------------------------------')
print [[function sum(a: integer, b: integer)
    return a+b
end]]
ravi.dumplua(sum)

print('--------------------------------------------')
print [[function sum2(n: integer?, a: integer, b: integer)
        return a + b + (n or 0)
end]]
ravi.dumplua(sum2)

print('--------------------------------------------')
print [[function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end]]
ravi.dumplua(sum3)

result:

--------------------------------------------
function sum(a: integer, b: integer)
    return a+b
end

function <jit.lua:3,5> (5 instructions at 0x7f3742c79610)
2 params, 3 slots, 0 upvalues, 2 locals, 0 constants, 0 functions
    1   [3] TOINT       0
    2   [3] TOINT       1
    3   [4] ADDII       2 0 1
    4   [4] RETURN      2 2
    5   [5] RETURN      0 1
constants (0) for 0x7f3742c79610:
locals (2) for 0x7f3742c79610:
    0   a   1   6
    1   b   1   6
upvalues (0) for 0x7f3742c79610:
--------------------------------------------
function sum2(n: integer?, a: integer, b: integer)
        return a + b + (n or 0)
end

function <jit.lua:7,9> (10 instructions at 0x7f3742c793f0)
3 params, 5 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
    1   [7] TOIARRAY    0
    2   [7] TOINT       1
    3   [7] TOINT       2
    4   [8] ADDII       3 1 2
    5   [8] TESTSET     4 0 1
    6   [8] JMP         0 1 ; to 8
    7   [8] LOADK       4 -1    ; 0
    8   [8] ADDII       3 3 4
    9   [8] RETURN      3 2
    10  [9] RETURN      0 1
constants (1) for 0x7f3742c793f0:
    1   0
locals (3) for 0x7f3742c793f0:
    0   n   1   11
    1   a   1   11
    2   b   1   11
upvalues (0) for 0x7f3742c793f0:
--------------------------------------------
function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end

function <jit.lua:11,13> (10 instructions at 0x7f3742c7e760)
3 params, 4 slots, 0 upvalues, 3 locals, 0 constants, 0 functions
    1   [11]    TOINT       1
    2   [11]    TOINT       2
    3   [12]    TEST        0 0
    4   [12]    JMP         0 2 ; to 7
    5   [12]    TESTSET     3 1 1
    6   [12]    JMP         0 1 ; to 8
    7   [12]    LOADNIL     3 0
    8   [12]    ADD         3 3 2
    9   [12]    RETURN      3 2
    10  [13]    RETURN      0 1
constants (0) for 0x7f3742c7e760:
locals (3) for 0x7f3742c7e760:
    0   n   1   11
    1   a   1   11
    2   b   1   11
upvalues (0) for 0x7f3742c7e760:
XmiliaH commented 3 years ago

All of the code seems to be correct. For the second case ADD_II can be used since n or 0 will be an integer and never nil. However, the TOIARRAY seems out of place, but this is caused by missing to add all the new opcodes to luaP_opnames.

OP_RAVI_TOINT_NIL
OP_RAVI_TOFLT_NIL
OP_RAVI_TOTAB_NIL
OP_RAVI_TOSTRING_NIL
OP_RAVI_TOBOOLEAN_NIL
OP_RAVI_TOCLOSURE_NIL
OP_RAVI_TOTYPE_NIL

are missing. It would be interesting to see the opcodes of

function sum(a: integer?, b: integer)
        return a + b
end

I didn't apply your patch so far.

dibyendumajumdar commented 3 years ago

Also these opcodes might be redundant - as the the existing ones already allow NIL where applicable

dibyendumajumdar commented 3 years ago

Re reading bytecodes - you may want to checkout https://the-ravi-programming-language.readthedocs.io/en/latest/lua_bytecode_reference.html. It doesn't cover the extended Ravi opcodes but these are straight forward once you can follow the Lua 5.3 ones

XmiliaH commented 3 years ago

The existing ones are patched to not allow nil any more. See the patch supplied in https://github.com/dibyendumajumdar/ravi/issues/223#issuecomment-844675753.

snoopcatt commented 3 years ago

All of the code seems to be correct. For the second case ADD_II can be used since n or 0 will be an integer and never nil. However, the TOIARRAY seems out of place, but this is caused by missing to add all the new opcodes to luaP_opnames.

yeah, missed it, thanks.

function sum(a: integer?, b: integer)
        return a + b
end

Of course it produces ADD, but as I mentioned before, code is incorrect itself. You should avoid using add + (and concat .. for strings too) without being sure that both values are correct type. Or it'll just crash.

Another correct variant (also produces ADDII)

ravi.jit(true)

function sum(a: integer?, b: integer)
        return @integer(a) + b
end

ravi.dumplua(sum)

but it is almost the same as (a or 0).

By the way, main purpose of optional typed arguments is to ensure that external variable (if supplied) is correct type. And if variable is not supplied, you should like always assign some default value to it.

I mean,

function hello(s: string?)
  s = s or 'world'
  return "hello, " .. s
end

Without support of optional arguments, we have to do something like that:

function hello(s)
  s = s or 'world'
  assert(type(s), 'string')
  return "hello, " .. s
end
dibyendumajumdar commented 3 years ago

For optional integer/number you will then need to set the value to 0 rather than NIL - and ensure type is set. Then I think it might be okay.

XmiliaH commented 3 years ago

Note that

function hello(s: string?)
  s = s or 'world'
  return "hello, " .. s
end

is not optimal since s still may hold nil in the eyes of the optimizer at the point of concatenation. Better would be

function hello(s: string?)
  local rs: string = s or 'world'
  return "hello, " .. rs
end
snoopcatt commented 3 years ago

Also these opcodes might be redundant - as the the existing ones already allow NIL where applicable

That's what I said in original post. Try to run this code:

function hello(s: string)
  return "hello, " .. s
end
print(hello("world")) -- works as intended
print(hello(false)) -- works as intended, throws `ravi: r.lua:0: string expected`
print(hello()) -- crash -- `ravi: r.lua:2: attempt to concatenate a nil value (local 's')`

For some reason, string type allows nil as correct argument type, that may lead to serious errors and crashes.

dibyendumajumdar commented 3 years ago

Well - actually a few of the types are union of type & NIL. @XmiliaH made this more explicit in the code.

#define RAVI_TM_FALSISH (RAVI_TM_NIL | RAVI_TM_FALSE)
#define RAVI_TM_TRUISH (~RAVI_TM_FALSISH)
#define RAVI_TM_BOOLEAN (RAVI_TM_FALSE | RAVI_TM_TRUE)
#define RAVI_TM_NUMBER (RAVI_TM_INTEGER | RAVI_TM_FLOAT)
#define RAVI_TM_INDEXABLE (RAVI_TM_INTEGER_ARRAY | RAVI_TM_FLOAT_ARRAY | RAVI_TM_TABLE)
#define RAVI_TM_STRING_OR_NIL (RAVI_TM_STRING | RAVI_TM_NIL)
#define RAVI_TM_FUNCTION_OR_NIL (RAVI_TM_FUNCTION | RAVI_TM_NIL)
#define RAVI_TM_BOOLEAN_OR_NIL (RAVI_TM_BOOLEAN | RAVI_TM_NIL)
#define RAVI_TM_USERDATA_OR_NIL (RAVI_TM_USERDATA | RAVI_TM_NIL)
#define RAVI_TM_ANY (~0)

typedef enum {
  RAVI_TNIL = RAVI_TM_NIL,        /* NIL */
  RAVI_TNUMINT = RAVI_TM_INTEGER, /* integer number */
  RAVI_TNUMFLT = RAVI_TM_FLOAT,   /* floating point number */
  RAVI_TNUMBER = RAVI_TM_NUMBER,
  RAVI_TARRAYINT = RAVI_TM_INTEGER_ARRAY,   /* array of ints */
  RAVI_TARRAYFLT = RAVI_TM_FLOAT_ARRAY,     /* array of doubles */
  RAVI_TTABLE = RAVI_TM_TABLE,              /* Lua table */
  RAVI_TSTRING = RAVI_TM_STRING_OR_NIL,     /* string */
  RAVI_TFUNCTION = RAVI_TM_FUNCTION_OR_NIL, /* Lua or C Function */
  RAVI_TBOOLEAN = RAVI_TM_BOOLEAN_OR_NIL,   /* boolean */
  RAVI_TTRUE = RAVI_TM_TRUE,
  RAVI_TFALSE = RAVI_TM_FALSE,
  RAVI_TUSERDATA = RAVI_TM_USERDATA_OR_NIL, /* userdata or lightuserdata */
  RAVI_TANY = RAVI_TM_ANY,                  /* Lua dynamic type */
} ravitype_t;
snoopcatt commented 3 years ago

For optional integer/number you will then need to set the value to 0 rather than NIL - and ensure type is set. Then I think it might be okay.

Also a solution. You won't be able to set default variables with n = n or 1024, you have to use n = (n ~= 0) and n or 1024. But the problem is, that 0 is also valid number, and we may expect it sometimes as correct value.

dibyendumajumdar commented 3 years ago

I had a todo https://github.com/dibyendumajumdar/ravi/issues/172 Because one option is TOINT/TOFLT should convert to integer/float where possible (maybe NIL to 0)

snoopcatt commented 3 years ago

Well - actually a few of the types are union of type & NIL. @XmiliaH made this more explicit in the code.

#define RAVI_TM_FALSISH (RAVI_TM_NIL | RAVI_TM_FALSE)
#define RAVI_TM_TRUISH (~RAVI_TM_FALSISH)
#define RAVI_TM_BOOLEAN (RAVI_TM_FALSE | RAVI_TM_TRUE)
#define RAVI_TM_NUMBER (RAVI_TM_INTEGER | RAVI_TM_FLOAT)
#define RAVI_TM_INDEXABLE (RAVI_TM_INTEGER_ARRAY | RAVI_TM_FLOAT_ARRAY | RAVI_TM_TABLE)
#define RAVI_TM_STRING_OR_NIL (RAVI_TM_STRING | RAVI_TM_NIL)
#define RAVI_TM_FUNCTION_OR_NIL (RAVI_TM_FUNCTION | RAVI_TM_NIL)
#define RAVI_TM_BOOLEAN_OR_NIL (RAVI_TM_BOOLEAN | RAVI_TM_NIL)
#define RAVI_TM_USERDATA_OR_NIL (RAVI_TM_USERDATA | RAVI_TM_NIL)
#define RAVI_TM_ANY (~0)

typedef enum {
  RAVI_TNIL = RAVI_TM_NIL,        /* NIL */
  RAVI_TNUMINT = RAVI_TM_INTEGER, /* integer number */
  RAVI_TNUMFLT = RAVI_TM_FLOAT,   /* floating point number */
  RAVI_TNUMBER = RAVI_TM_NUMBER,
  RAVI_TARRAYINT = RAVI_TM_INTEGER_ARRAY,   /* array of ints */
  RAVI_TARRAYFLT = RAVI_TM_FLOAT_ARRAY,     /* array of doubles */
  RAVI_TTABLE = RAVI_TM_TABLE,              /* Lua table */
  RAVI_TSTRING = RAVI_TM_STRING_OR_NIL,     /* string */
  RAVI_TFUNCTION = RAVI_TM_FUNCTION_OR_NIL, /* Lua or C Function */
  RAVI_TBOOLEAN = RAVI_TM_BOOLEAN_OR_NIL,   /* boolean */
  RAVI_TTRUE = RAVI_TM_TRUE,
  RAVI_TFALSE = RAVI_TM_FALSE,
  RAVI_TUSERDATA = RAVI_TM_USERDATA_OR_NIL, /* userdata or lightuserdata */
  RAVI_TANY = RAVI_TM_ANY,                  /* Lua dynamic type */
} ravitype_t;

I also confused with that. RAVI_TNUMBER is RAVI_TM_NUMBER, but RAVI_TFUNCTION is RAVI_TM_FUNCTION_OR_NIL. We may expect closure as input value in some function, but still may get nil instead.
So we need to perform additional checks, despite that we designated expected type in arguments list.

dibyendumajumdar commented 3 years ago

We can tighten the check for annotated types to disallow NIL. I didn't do it originally as I took the view that NIL is a valid value for these (i.e. absent)

snoopcatt commented 3 years ago

Also, compromise solution may be to implement union types. It is already implemented on C-side, we just need to implement Lua interface to create custom unions.

They we may explicitly write

function sum(a: integer, b: integer|nil)
        return @integer(a) + b
end

when needed (and when we know what we doing).

And also If it's possible not for a big cost for parser, maybe add possibility to specify literals directly, i.e.

function sum(a: integer, b: integer|666)
        return a + b
end
function hello(s: string|"World")
       return 'hello ' .. s
end 

I'm not sure that it is possible without lot of workarounds, but I might be wrong.

XmiliaH commented 3 years ago

What should

function hello(s: string|"World")
       return 'hello ' .. s
end

mean? That only the string "World" can be supplied?

The unions work well, but there are issued with union of more than one custom type.

snoopcatt commented 3 years ago

That only the string "World" can be supplied?

It should mean default value, if none specified. Like in Python def student(firstname, lastname ='Mark', standard ='Fifth'): (and many other languages)

But I'm not sure that it must look like that, it doesn't look pretty ☺

dibyendumajumdar commented 3 years ago

A sophisticated type system is difficult to do in the Lua parser which is designed for memory efficiency/speed. I haven't been able to figure out a way to do more sophisticated type system without breaking with the spirit of Lua.

I am working on a separate compiler that constructs AST etc and can be made to do more sophisticated types. But problem is this cant easily be translated to the default Lua parser

XmiliaH commented 3 years ago

It should mean default value, if none specified.

The I would recommend

function hello(s: string = "world")
   return 'hello ' .. s
end
snoopcatt commented 3 years ago

A sophisticated type system is difficult to do in the Lua parser which is designed for memory efficiency/speed. I haven't been able to figure out a way to do more sophisticated type system without breaking with the spirit of Lua.

Totally agree. I personally like Typedlua, it looks pretty and fully follows the Lua spirit:

local is_even, is_odd

function is_even (n: integer):boolean
  if (n == 0) then
    return true
  else
    return is_odd(n - 1)
  end
end

function is_odd (n: integer):boolean
  if (n == 0) then
    return false
  else
    return is_even(n - 1)
  end
end

print(is_even(8))
print(is_odd(8))
XmiliaH commented 3 years ago

@snoopcatt maybe change the type parsing to:

if (strcmp(str, "integer") == 0)
  tm = RAVI_TM_INTEGER;
else if (strcmp(str, "number") == 0)
  tm = RAVI_TM_FLOAT;
else if (strcmp(str, "closure") == 0)
  tm = RAVI_TM_FUNCTION_OR_NIL;
else if (strcmp(str, "table") == 0)
  tm = RAVI_TM_TABLE;
else if (strcmp(str, "string") == 0)
  tm = RAVI_TM_STRING_OR_NIL;
else if (strcmp(str, "boolean") == 0)
  tm = RAVI_TM_BOOLEAN_OR_NIL;
else if (strcmp(str, "any") == 0)
  tm = RAVI_TM_ANY;
else {
  /* default is a userdata type */
  tm = RAVI_TM_USERDATA_OR_NIL;
  typename = user_defined_type_name(ls, typename);
  str = getstr(typename);
  *pusertype = typename;
}
if (tm == RAVI_TM_FLOAT || tm == RAVI_TM_INTEGER) {
  /* if we see [] then it is an array type */
  if (testnext(ls, '[')) {
    checknext(ls, ']');
    tm = (tm == RAVI_TM_FLOAT) ? RAVI_TM_FLOAT_ARRAY : RAVI_TM_INTEGER_ARRAY;
  }
}
if (testnext(ls, '?')) {
  tm |= RAVI_TM_NIL;
} else if (testnext(ls, '!')) {
  tm &= ~RAVI_TM_NIL;
}

this would still have the old behavior to not break compatibility but allow to remove the nil type with !.

snoopcatt commented 3 years ago

It should mean default value, if none specified.

The I would recommend

function hello(s: string = "world")
   return 'hello ' .. s
end

All we need is to enable types annotation inside tables the same way as for local variables: t = {a: integer = 20}. Then we will be able to simply call hello{s: string = "World"}.

Looks like simple and elegant solution, but I have no idea what about technical side and why only local variables can be annotated for now

dibyendumajumdar commented 3 years ago

A sophisticated type system is difficult to do in the Lua parser which is designed for memory efficiency/speed. I haven't been able to figure out a way to do more sophisticated type system without breaking with the spirit of Lua.

Totally agree. I personally like Typedlua, it looks pretty and fully follows the Lua spirit:

I would say it does not - depending on what we mean by Lua spirit. I am using the term as it is used by C standard - spirit of C.

Lua's implementation is carefully designed to avoid heap allocation in the parser. It uses recursion and stack to create temp expression nodes.

Typed Lua is an add-on - so immediately add a lot of overhead and breaks the spirit of Lua in that sense.

You can look at new project https://github.com/teal-language/tl or Pallene - they may be what you are looking for.

XmiliaH commented 3 years ago

Looks like simple and elegant solution, but I have no idea what about technical side and why only local variables can be annotated for now

Annotating tables is quite hard and checking it at runtime too.

snoopcatt commented 3 years ago

I would say it does not - depending on what we mean by Lua spirit. I am using the term as it is used by C standard - spirit of C.

I meant only how it looks for the "end customer". Simple and elegant. Of course implementation looks like a hell, that's why I don't use it. Same for Teal.

snoopcatt commented 3 years ago

Annotating tables is quite hard and checking it at runtime too.

local a: integer = 1
local b: string = 'hello'

local t = {a=a, b=b}

why can't we say local t = {a: integer=1, b: string = 'hello'}? it shouldn't be much more expensive

XmiliaH commented 3 years ago

First: following the current type system it should be more like:

local a: integer = 1
local b: string = 'hello'
local t: {a: integer, b: string} = {a = 1, b = 'hello'}

Second: every access would need to check the types since you could pass it to a function with violates this constraints and writes an string to the key a. Third: a function parameter annotated with such a table expression would require to check every key to be valid. And you could have something like:

function f(t: {a: {b: {c: {d: {e: integer}}}}})
   return t
end
snoopcatt commented 3 years ago

First: following the current type system it should be more like:

local a: integer = 1
local b: string = 'hello'
local t: {a: integer, b: string} = {a = 1, b = 'hello'}

Second: every access would need to check the types since you could pass it to a function with violates this constraints and writes an string to the key a. Third: a function parameter annotated with such a table expression would require to check every key to be valid. And you could have something like:

function f(t: {a: {b: {c: {d: {e: integer}}}}})
   return t
end

I mean, why it is possible to annotate only local variables? Why can't we add do the same for "global" variables by its reference?

Or it is Lua design limitations?

XmiliaH commented 3 years ago

Globales are stored in the global table accessible with _G. If you could annotate the global variables how should this work over multiple ravi/lua files and direct access to the table. If you could annotate globales what would happen and when in the following example:

global a: integer; -- just assuming to annotate the global `a`
_G.a = "not an integer"
print(a)

should _G.a throw since the key a is annotated as integer or should print(a) throw since a is not an integer? In either of the cases additional runtime checks are required and no compile time errors could be generated. There is not much to gain from these annotations.

XmiliaH commented 3 years ago

@snoopcatt I just noticed that you forgot to change the JIT compiler in src/ravi_jitshared.c to handle your added and the changed opcodes correctly.

snoopcatt commented 3 years ago

should _G.a throw

this, same as for local variables:

➜  ~ cat t.lua 
local a: integer = 1
local b: string = 'hello'
a = "not integer"

ravi: t.lua:4: Invalid assignment: integer expected near <eof>

In either of the cases additional runtime checks are required and no compile time errors could be generated. There is not much to gain from these annotations.

yes, but we already have some runtime checks, on function calls for example. that's what about my question about cost was.

I didn't dive very deep into Lua, I only know basics, i.e. how to create C module etc. so I don't know much how process of re-assigning value is going at low-level.

I thought it is possible to bring some runtime checks at for example

static void assignment (LexState *ls, struct LHS_assign *lh, int nvars) 

to ensure new value is still correct type for that variable (we storing type of variable somewhere, aren't we?)

XmiliaH commented 3 years ago

(we storing type of variable somewhere, aren't we?)

The type for local variables is stored, however, the types of table values are not.

snoopcatt commented 3 years ago

The type for local variables is stored, however, the types of table values are not.

No more questions ☺

@snoopcatt I just noticed that you forgot to change the JIT compiler in src/ravi_jitshared.c to handle your added and the changed opcodes correctly.

Thanks. Just noticed that in jit_shared.c in TO_STRING opcode nil does not allowed:

static void emit_op_tostring(struct function *fn, int A, int pc) {
  (void)pc;
  emit_reg(fn, "ra", A);
  membuff_add_string(&fn->body, "if (!ttisstring(ra)) {\n");
#if GOTO_ON_ERROR
  membuff_add_fstring(&fn->body, " error_code = %d;\n", Error_string_expected);
  membuff_add_string(&fn->body, " goto Lraise_error;\n");
#else
  membuff_add_fstring(&fn->body, " raviV_raise_error(L, %d);\n", Error_string_expected);
#endif
  membuff_add_string(&fn->body, "}\n");
}

in oppose to lvm.c:

vmcase(OP_RAVI_TOSTRING) {
    if (!ttisnil(ra) && RAVI_UNLIKELY(!ttisstring(ra)))
        luaG_runerror(L, "string expected");        
    vmbreak;
}

Wut? Why? Why even if JIT enabled -- 1) on "patched" Ravi my tests working (even with ravi_jitshared.c unchanged) 2) on vanilla Ravi function(s: string) behaviour unchanged and allows to pass nil?

I think I'm totally lost here

XmiliaH commented 3 years ago

The same behavior can be seen for closure. I did open a separat issue https://github.com/dibyendumajumdar/ravi/issues/225 for this.

snoopcatt commented 3 years ago

Oh, I'm not (fully) lost my mind diving into it, it just a bug.

this would still have the old behavior to not break compatibility but allow to remove the nil type with !.

So, now we have to change behaviour of that anyway.

XmiliaH commented 3 years ago

In that case I would use

if (strcmp(str, "integer") == 0)
  tm = RAVI_TM_INTEGER;
else if (strcmp(str, "number") == 0)
  tm = RAVI_TM_FLOAT;
else if (strcmp(str, "closure") == 0)
  tm = RAVI_TM_FUNCTION;
else if (strcmp(str, "table") == 0)
  tm = RAVI_TM_TABLE;
else if (strcmp(str, "string") == 0)
  tm = RAVI_TM_STRING;
else if (strcmp(str, "boolean") == 0)
  tm = RAVI_TM_BOOLEAN;
else if (strcmp(str, "any") == 0)
  tm = RAVI_TM_ANY;
else {
  /* default is a userdata type */
  tm = RAVI_TM_USERDATA;
  typename = user_defined_type_name(ls, typename);
  str = getstr(typename);
  *pusertype = typename;
}
if (tm == RAVI_TM_FLOAT || tm == RAVI_TM_INTEGER) {
  /* if we see [] then it is an array type */
  if (testnext(ls, '[')) {
    checknext(ls, ']');
    tm = (tm == RAVI_TM_FLOAT) ? RAVI_TM_FLOAT_ARRAY : RAVI_TM_INTEGER_ARRAY;
  }
}
if (testnext(ls, '?')) {
  tm |= RAVI_TM_NIL;
}

This allows a consistent use of ?. However, this would likely require the type checks for userdata and arrays to be adjusted too.

snoopcatt commented 3 years ago

I'm afraid it won't work. Because we also need to adjust type checks in lvm.c and ravi_jitshared.c, and functions there must know some way, is nil acceptable or should we throw an error.

I've done it by just copy-pasting opcodes and appending _NIL to them, it works, but it's dirty solution, I'm sure there are more graceful way to do that.

I've imported my working repos to sh*thub (my god, it still can't mirror git repos, in 2021 year) Its here, at least you may see what files are changed.