pandorabox-io / in-game

Random code and stuff for in-game things
MIT License

Experimental async_controller (luac) #371

Closed by TheEt1234 3 months ago

TheEt1234 commented 6 months ago

Repo: https://github.com/TheEt1234/async_controller/tree/main

Problems:

Ratelimiting

basically solved, just need to figure out what to do if someone spams like 50 of them in a mapblock... ~i think mesecons_debug can figure that one out~ it can't

And also needs more testing

Discussion (has some important context)

starts from https://discord.com/channels/513329453741637637/719950700485935145/1214324327185649684

For those without access to Discord, the discussion mentioned above has been added to this thread.

birdlover32767 commented 5 months ago

so here is how you add worldpath trimming to the formspec:

local _,wend = errmsg:find(":%d*:")
if wend then -- erase worldpath
    errmsg = "..."..errmsg:sub(math.max(0,wend-15),#errmsg)
end

currently it trims down to .../sandbox.lua:80: because of the 15-letter trim, but you can change the 15 if you wanna

WARNING: putting 2 colons with only numbers in between (like :12345:) may cause issues

TheEt1234 commented 5 months ago

so here is how you add worldpath trimming to the formspec:

local _,wend = errmsg:find(":%d*:")
if wend then -- erase worldpath
  errmsg = "..."..errmsg:sub(math.max(0,wend-15),#errmsg)
end

currently it trims down to .../sandbox.lua:80: because of the 15-letter trim, but you can change the 15 if you wanna

WARNING: putting 2 colons with only numbers in between (like :12345:) may cause issues

Well it's a little too late because i have made a custom traceback that only offers the relevant info

TheEt1234 commented 5 months ago

Soo uhh, things to do tomorrow i guess:

  1. in game documentation and examples, make it friendly towards new users, even new users to programming (basic lua guide :p?)

Actually, i think docs.md honestly works just fine

  1. terminal input

~i am lazy~ touchscreens are for that

SwissalpS commented 5 months ago

3. terminal input

~i am lazy~ touchscreens are for that

also keyboards.

You could copy from mooncontroller, that sounds like a lazy solution ;)

TheEt1234 commented 5 months ago

fair, i should stop being lazy

also i've realised that pcall can theoretically catch errors from modify_self, i don't know why i've made it error in the first place

should i just remove it in favour of env_plus's safe loadstring

oh right for that i would need to actually test env_plus properly

TheEt1234 commented 5 months ago

The performance of

local f = loadstring(code)
pcall(f)

is a little concerning (takes 50 ms, i wonder how actually)

Should i care about this (like maybe idk make pcall throw an error if things go this badly)

TheEt1234 commented 5 months ago

But besides that case, the time limit does its job well when the env_plus functions are being pushed to their limit... except uhh... the minetest.compress/decompress functions might have to go, because an async_controller can spam the terminal with them, and even the chat

TheEt1234 commented 5 months ago

well, i've committed the update to the repo

TheEt1234 commented 5 months ago

also i forgot to mention, minetest.sha256 is going away too as i think that's a 5.9.0 thing(?)

S-S-X commented 5 months ago

is a little concerning (takes 50 ms, i wonder how actually)

Is pcall hard requirement in async environment. Like I've no idea but it is possible it might not be... of course not having it would also mean dropping some good features.

TheEt1234 commented 5 months ago

is a little concerning (takes 50 ms, i wonder how actually)

Is pcall hard requirement in async environment. Like I've no idea but it is possible it might not be... of course not having it would also mean dropping some good features.

what do you mean by hard requirement and async environment

i just thought pcall would be a cool feature and i saw a mesecons pull request that added a safe version of it (the problem with unsafe pcall was that you could evade the "Code Timed Out" errors and the timeout hook would no longer work once that error got caught)

also now i am pretty sure that 50ms is caused by all of the safe pcalls error'ing because code timed out

and i am pretty sure it's fixable by not allowing you to basically chain pcalls infinitely
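
To illustrate the "safe pcall" idea described above, here is a minimal sketch in plain Lua. It is not the code from the mesecons pull request or from async_controller; the names `safe_pcall` and `TIMEOUT_MSG` are made up for this example, and the exact timeout message is an assumption. The point is only that a pcall wrapper can re-raise the timeout error, so chaining pcalls cannot be used to outlive the time limit.

```lua
-- Assumption: the sandbox's timeout hook raises an error containing this text.
local TIMEOUT_MSG = "Code timed out!"
local unpack = table.unpack or unpack -- Lua 5.2+ / Lua 5.1 and LuaJIT

-- Collect varargs together with their count so nil results survive.
local function pack(...)
    return { n = select("#", ...), ... }
end

-- pcall variant that refuses to swallow the timeout error.
local function safe_pcall(f, ...)
    local r = pack(pcall(f, ...))
    local ok, err = r[1], r[2]
    if not ok and type(err) == "string" and err:find(TIMEOUT_MSG, 1, true) then
        error(err, 0) -- re-raise so the timeout reaches the sandbox itself
    end
    return unpack(r, 1, r.n)
end

-- Usage: nested safe_pcalls cannot hide a timeout from the outer pcall.
local ok, err = pcall(function()
    return safe_pcall(function()
        return safe_pcall(error, TIMEOUT_MSG, 0)
    end)
end)
print(ok, err)
```

Ordinary errors are still returned as `false, message`, so user code keeps the useful part of pcall. This sketch does not cap how deeply pcalls can be chained, which is the other fix mentioned above.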

S-S-X commented 5 months ago

what do you mean by hard requirement and async environment

I mean main reason for pcall for luac programs is to not crash whole Lua VM (to bring down minetest server) with luac program error. Async environment is completely separate and runs under its own VM so pcall might not be that hard requirement. Unless it somehow propagates errors to main Lua environment.

I was thinking would it be possible to disable all pcalls when code is about to get executed in async env / separate vm.

TheEt1234 commented 5 months ago

what do you mean by hard requirement and async environment

I mean main reason for pcall for luac programs is to not crash whole Lua VM (to bring down minetest server) with luac program error.

The sandbox needs to use pcall for the user code

Async environment is completely separate and runs under its own VM so pcall might not be that hard requirement. Unless it somehow propagates errors to main Lua environment.

I was thinking would it be possible to disable all pcalls when code is about to get executed in async env / separate vm.

No, minetest would crash if there's an error in the async environment

TheEt1234 commented 5 months ago

also, running 200 async_controllers (yes literally 200) with the code:

interrupt(0)

<uhh this code would actually make the normal luacontroller OOM or freeze the server, but async_controller has a proper time limit alongside an instruction limit so it won't happen, i don't want to leak it here, if you want to know this see one of the mesecon issues i've linked>

My fps started jittering like crazy and cpu usage was like at 100% but max lag was still at 0.2~0.3

But the real problem could arise when serializing memory or doing massive amounts of interrupts/digiline messages because that gets done after the execution and isn't ratelimited

So the serializer abuse bug [that i cannot reproduce] could probably be used in the async_controller

S-S-X commented 5 months ago

Any interrupt stuff does not have to be fixed because there's configuration for its behavior.

Anything you can just push to queue can create issues, for interrupt you can disable queue so for 200 luacs it is always at most 200 active interrupt any given time.
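
As a rough illustration of the "no queue" behaviour described above, the sketch below keeps at most one pending interrupt per controller and drops further requests. This is not how mesecons implements its setting; `request_interrupt`, `pending`, `schedule` and `on_fire` are hypothetical names, and `schedule` stands in for whatever defers the work (a timer, an action queue, and so on).

```lua
-- Hypothetical bookkeeping: controller id -> true while an interrupt is pending.
local pending = {}

-- Request an interrupt for controller `id`.
-- `schedule(fn)` defers `fn`; `on_fire(id)` runs when the interrupt fires.
local function request_interrupt(id, schedule, on_fire)
    if pending[id] then
        return false -- one is already in flight: drop instead of queueing
    end
    pending[id] = true
    schedule(function()
        pending[id] = nil
        on_fire(id)
    end)
    return true
end
```

With this shape, 200 controllers can have at most 200 interrupts outstanding, no matter how often each one asks.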

S-S-X commented 5 months ago

For serialization issues make sure you validate asynchronously and only return verified good results that have been already serialized. That takes overhead away from main thread and amount of data can be counted toward possible per luac execution limits.

TheEt1234 commented 5 months ago

Any interrupt stuff does not have to be fixed because there's configuration for its behavior.

Anything you can just push to queue can create issues, for interrupt you can disable queue so for 200 luacs it is always at most 200 active interrupt any given time.

the issue wasn't with interrupts but the fact that i have interrupts + laggy code

this issue can be solved by setting the execution time limit 5ms instead of 10ms for example, or like tracking down the lag machine (i have made a basic tool to track them down)

TheEt1234 commented 5 months ago

For serialization issues make sure you validate asynchronously and only return verified good results that have been already serialized. That takes overhead away from main thread and amount of data can be counted toward possible per luac execution limits.

that's what the luacontroller does in the remove_functions function (functions cannot be serialized and will crash minetest)

the current problem is that the serialization can simply just take too much time, this problem can actually be solved by imposing a time limit on that as well

wait i have done that before, i just threw the solution out because i thought it was un-needed, oops :p

i don't know how to reproduce it because the steps to reproduce that issue are private
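
A minimal sketch of the two ideas above (dropping functions before serialization and putting a time limit on that step) could look like this. It is not the luacontroller's `remove_functions` and not the code that was thrown out; `strip_functions` and its `deadline` parameter are hypothetical, and `os.clock` is used only so the example runs in plain Lua.

```lua
-- Hypothetical helper: returns a copy of `value` without functions and
-- raises an error once `deadline` (an os.clock() value) has passed.
local function strip_functions(value, deadline, seen)
    if os.clock() > deadline then
        error("serialization took too long", 0)
    end
    local t = type(value)
    if t == "function" then
        return nil
    elseif t ~= "table" then
        return value
    end
    seen = seen or {}
    if seen[value] then
        return nil -- drop repeated/cyclic references instead of recursing forever
    end
    seen[value] = true
    local copy = {}
    for k, v in pairs(value) do
        local kt = type(k)
        -- keep only plain keys; function and table keys are dropped
        if kt == "string" or kt == "number" or kt == "boolean" then
            copy[k] = strip_functions(v, deadline, seen)
        end
    end
    return copy
end
```

Run inside the async job with something like `strip_functions(mem, os.clock() + 0.005)`, the main thread would only ever receive data that was already checked within the budget.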

S-S-X commented 5 months ago

i have made a basic tool to track them down

Additionally it would be good to integrate reporting for mesecons_debug mod

TheEt1234 commented 5 months ago

i have made a basic tool to track them down

Additionally it would be good to integrate reporting for mesecons_debug mod

I can't figure out how this mod works, but the only thing that my mod (and same with all the other controllers i think) seems to lack support for is their node timer override

also my "basic tool" is just logging where an event happened and how many ms it took

SwissalpS commented 5 months ago

Have you tested/considered how this node works when e.g. jumped with jd?

S-S-X commented 5 months ago

I can't figure out how this mod works, but the only thing that my mod (and same with all the other controllers i think) seems to lack support for is their node timer override

It actually collects data about any mesecons/digilines activity everywhere and adjusts delays for queues based on usage per mapblock. For this to work it somehow needs information about how long program execution actually took but it does not take into account that no main thread time was used.

For other synchronous devices / wires / whatever stuff it is easy for debug mod to gain this information, anything asynchronous however will hide this information so it can't really know anything about async luac unless there's some kind of reporting from async luac to debug mod.
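
The kind of reporting described above could be sketched roughly as follows: the async job measures its own run time, and the main-thread callback adds it to a per-mapblock total. None of this is mesecons_debug's actual API; `mapblock_key`, `report_usage` and `usage_us` are made-up names. The only fixed fact used is that a mapblock is 16x16x16 nodes.

```lua
local MAPBLOCK_SIZE = 16 -- a Minetest mapblock is 16x16x16 nodes

-- Hypothetical: turn a node position into a key for its mapblock.
local function mapblock_key(pos)
    return string.format("%d,%d,%d",
        math.floor(pos.x / MAPBLOCK_SIZE),
        math.floor(pos.y / MAPBLOCK_SIZE),
        math.floor(pos.z / MAPBLOCK_SIZE))
end

-- Hypothetical accumulator: mapblock key -> microseconds used so far.
local usage_us = {}

-- Called on the main thread with the time the async job reported.
local function report_usage(pos, elapsed_us)
    local key = mapblock_key(pos)
    usage_us[key] = (usage_us[key] or 0) + elapsed_us
    return usage_us[key]
end
```

A debug mod could then read totals like these the same way it reads usage from synchronous devices.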

TheEt1234 commented 5 months ago

I can't figure out how this mod works, but the only thing that my mod (and same with all the other controllers i think) seems to lack support for is their node timer override

It actually collects data about any mesecons/digilines activity everywhere and adjusts delays for queues based on usage per mapblock. For this to work it somehow needs information about how long program execution actually took but it does not take into account that no main thread time was used.

For other synchronous devices / wires / whatever stuff it is easy for debug mod to gain this information, anything asynchronous however will hide this information so it can't really know anything about async luac unless there's some kind of reporting from async luac to debug mod.

image

When activated through the "execute" button with the mystery code (which i am not gonna show), it seems something has been done about it(?), since it says load:4: string length overflow. It still caused a big lag spike, but mesecons_debug didn't seem to care (at least judging from /mesecons_hud)

BUT when activated by another luacontroller with the code:

if event.type == "program" then
    digiline_send("", "")
end

It noticed something was fishy and registered a usage of 110 440 us/s

So maybe it only monitors digilines and digiline/mesecons/luac related node timers

So... should i just add the mesecons_debug's node timer override if mesecons_debug is available? (mooncontroller might have the same issue)

birdlover32767 commented 5 months ago

image

please trim the worldpath!!!!!!

TheEt1234 commented 5 months ago

image

please trim the worldpath!!!!!!

that image is blank for me, but i think you were referring to timeout errors... well... the mod path can be useful for identifying what exactly caused the timeout

TheEt1234 commented 5 months ago

I kinda want to make something where a user can make a library, and an async_controller can import it

should i do this and how should i go about doing it, and where should the libraries be stored

SwissalpS commented 5 months ago

look at mooncontroller and the open issues. Basically the libraries are mods that register themselves with the controller.

Re: worldpath: everything before the mod-dir can be trimmed: /some/path/to/mods/mod/file.lua -> mod/file.lua

SwissalpS commented 5 months ago

or if you mean libraries like advtrain supports (user generated) modstorage would be a modern option for storing them. That could potentially allow all users to use each others code. Depending on the complexity you want to add, you could have a system with privs for reading and writing global code and you could also allow users to mark their code as private.

birdlover32767 commented 5 months ago

look at mooncontroller and the open issues. Basically the libraries are mods that register themselves with the controller.

Re: worldpath: everything before the mod-dir can be trimmed: /some/path/to/mods/mod/file.lua -> mod/file.lua

so basically trim everything before the 2nd-to-last /; that requires that you don't use slashes in error messages

TheEt1234 commented 5 months ago

so basically trim everything before the 2nd-to-last /; that requires that you don't use slashes in error messages

how do i do that and make it so that like... uhh.. when you do error("/funny/heh") it doesn't get edited

SwissalpS commented 5 months ago

one way is to use local mp = core.get_modpath(<yourmodname>) then local base = mp:sub(1, <total length minus length of your modname>) then replace occurrences of base with ""
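
Spelled out, the approach above might look like the sketch below. `trim_modpath` is a hypothetical helper name; in a mod, `modpath` would come from `core.get_modpath(modname)`, but it is passed in here so the example runs in plain Lua. A plain-text search is used instead of `gsub` so that characters like `-` or `.` in the path are not treated as pattern magic.

```lua
-- Hypothetical helper: removes everything before the mod directory from `msg`.
-- `modpath` is what core.get_modpath(modname) would return inside Minetest.
local function trim_modpath(msg, modpath, modname)
    -- Everything up to (but not including) the mod directory name.
    local base = modpath:sub(1, #modpath - #modname)
    if base == "" then
        return msg -- nothing to trim; also avoids matching the empty string
    end
    local out, pos = {}, 1
    while true do
        -- 4th argument `true` = plain find, no pattern matching
        local s, e = msg:find(base, pos, true)
        if not s then break end
        out[#out + 1] = msg:sub(pos, s - 1)
        pos = e + 1
    end
    out[#out + 1] = msg:sub(pos)
    return table.concat(out)
end

print(trim_modpath("/some/path/to/mods/mod/file.lua:80: oops",
    "/some/path/to/mods/mod", "mod"))
```

Because only the exact base path is removed, a message such as error("/funny/heh") is left untouched.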

TheEt1234 commented 5 months ago

one way is to use local mp = core.get_modpath(<yourmodname>) then local base = mp:sub(1, <total length minus length of your modname>) then replace occurrences of base with ""

oh cool, that works

TheEt1234 commented 5 months ago

also, now that i think about it adding user libraries might be a bad idea (because i feel it would be too complex to implement)

so i guess now i should focus on like... security and improving the source code

TheEt1234 commented 5 months ago

So i've rewired/rewritten the async_controller mod a bit, namely to use minetest.register_async_dofile instead of passing around an async_env variable

Also breaking changes:

Debugging improvements: you can now see how much the callback took (aka the part that processes all the digiline_sends, interrupts and saving memory) if you have enabled it in the source code

~uhh hold on, saving memory appears to crash (?) maybe i broke something recently again~ nevermind, just a really really dumb mistake, fixed

TheEt1234 commented 5 months ago

(Need to fix a really silly crash bug, eta: tmrw)

fixed lol, why did i wait 3 weeks to post that update

TheEt1234 commented 4 months ago

should i rewrite it to use my sandboxing mod

TheEt1234 commented 3 months ago

looking back at this - i should probably rewrite it to use my sandboxing library but i am feeling lazy - so maybe later