naev / naev

Naev is a 2d action/rpg space game that combines elements from the action, rpg and simulation genres.
https://naev.org
Other
850 stars 201 forks source link

[proposal] static checking for Lua #1566

Closed bobbens closed 2 years ago

bobbens commented 3 years ago

One of the issues we get is lots of silly typos and mistakes in missions. As the number of missions grows, it's not feasible to check everyone manually all the time. For this purpose it would be nice to try to set up luacheck to do some static Lua analysis and could be possible integrated into

ProjectSynchro commented 3 years ago

We could implement this as a test in Meson maybe, should be reasonably easy to do. That or it could be integrated into CI externally

bobbens commented 3 years ago

Apparently we would have to either generate or create a file like https://github.com/mpeterv/luacheck/blob/master/src/luacheck/builtin_standards/love.lua . Maybe it would be possible to generate it from the luadocs stuff? That way would be minimum overhead.

ProjectSynchro commented 3 years ago

luacheck is added as a basic Meson test as of 07c466119ececcbe17e1b8aa490eeef0d0b487a1

As for defining what's what in the configuration file.. I'm not fully sure about the what/how. See: here for the documentation about defining stuff.

atm there are only some basic built-in stds defined. (which I think are correct but I could be wrong..)

bobbens commented 3 years ago

Ideally we would want to generate all the functions and stuff from our documentation (so we don't duplicate efforts). Most of the time we also use @luatparam and @luatreturn, so hopefully it could be made to handle it all elegantly. When does this check get run?

UncombedCoconut commented 3 years ago

At this point, the check is more working than before. It runs anytime you run meson test -- the test/meson.build defines a suite of 3 tests and calls this one lua_lint, so meson test lua_lint is the command to run this individual test. However, this is all bypassed unless your system has luacheck.

Running `cd ${NAEV_SRC}/dat; luacheck . gives the same results as of today.

The configuration lives in dat/.luacheckrc. It doesn't work like the doc scraping idea above. We could get closer to doing things that way, but there are warning signs...

What I did and a doc-scrape-like approach look like two local optima, actually, as a few of these concerns are tied together. If we want function-level granularity (var.peek/var.pop/var.push instead of just var), a manual config is unsustainable. But then we need to pass it a dynamic config, so we can't browbeat luacheck into making naev/dat into an "anchor dir", so we can't use glob patterns (because the stupid tool will test <srcdir>/dat/<scriptpath> against <cwd>/<globpattern>), so we need one set of standards/globals for all Lua scripts.

All of this might be a symptom that Lua env management has gotten overcomplicated over time.

BTW, as far as I can tell, the @luatparam/@luatreturn directives can't be used. luacheck has no understanding of function signatures. It just validates you've looked up a name/field that could plausibly be a function.

UncombedCoconut commented 3 years ago

bit of a side issue: turning on luacheck's knowledge of the love2d API seems to bite us in the ass. Anytime we extend the love namespace (usually with some helper function or state we're using to implement love2d in the first place), it screams bloody murdrer. It might be better to turn it off. Without the rules, luacheck sees a statement like local love = require 'love'and says, "this love variable is a table, the way it was initialized looks valid, and so any attempt to look up a field and call it should pass validation too." All analysis is local. It has no idea what the operand of require refers to or what's defined inside. Have I mentioned that this tool isn't very smart? On the bright side, it's better than me at catching syntax errors.

UncombedCoconut commented 3 years ago

Here are some of the lints that are hard in principle to fix. (Most aren't hard in principle, just extremely numerous.)

Pasting from Discord, here are common cases where we probably agree with the linter:

We can probably close this issue when we're ready to use the linter in CI. I'm really not sure how that's going to work.

bobbens commented 3 years ago

Pro: although it's an error to access gfx functions from missions (for instance), we probably don't care about the linter catching that and would be happier with a simpler rule list.

This is unrelated, but I do think we might as well just give up on the sandboxing I was trying to do originally (not really respected and half-implemented), and just let everyone use all the API, leaving it up to the script developer to not be a bad person. That said, we should make it harder to break things and impossible to rm -rf /, but it's probably easier to do.

Pro: if we can auto-detect the nlua_var.c API as {read_globals={var={fields={"peek", "pop", "push"}}}} (whereas I currently have {read_globals={"var"}}), the linter can catch misspelled function names.

I do think this is fairly fundamental.

Con: It turns out there's a lot of shit wired up directly via lua_register, nlua_getenv, nlua_setenv, and nlua_refenv*. These things are critically important and totally invisible to the doc system.

I guess more comment metadata so that luadoc generates this stuff would make sense.

It's a hard-coded rule that unused variables are supposed to be named . It's a beautiful convention, except we want to refer to gettext and never be shadowed. This leaves us with hundreds of unused loop variable warnings that are tough to fix. This happens in other places where we call a multi-return function and don't want all the values.

Might make sense to just disable this warning.

Right now I think you're running it directly on the files, but maybe it would be easier to just generate some sort of build directory with the generated documentation files + all the dat on some sort of normalized paths so that luacheck plays well with it? Probably wouldn't even need translating filenames.

For issues like _ being hardcoded and stuff, either opening an issue or providing a patch to the luacheck project should work.

UncombedCoconut commented 3 years ago

We're mostly on the same page, but this comment might be a sign that some clarification is needed.

Right now I think you're running it directly on the files, but maybe it would be easier to just generate some sort of build directory with the generated documentation files + all the dat on some sort of normalized paths so that luacheck plays well with it? Probably wouldn't even need translating filenames.

This might solve the generated-config problem, and help us split the lua codebase between dat/ and artwork/ if we want, but the generated .lua files are probably useless to luacheck. It's a local analyzer, and it may even be limited to one pass through each file. If it sees a require statement, it doesn't know or care what package.path should be and what the target should contain. It's more like:

Because the analysis is always local, it almost doesn't matter if luacheck handles file paths correctly. It's directives like files["ai/**/*.lua"].std = ... that break. If we did away with all sandboxing, one global std might suffice. The challenge is getting the luacheckrc to populate a gigantic stds table with all globally-scoped symbols that Naev might import or export. The doc generation sees a very large slice of the required data (though not in the form luacheck needs) because each @luamod/@luafunc pair identifies a global/field. Then we detect that mem might already be a global, that function create() is not evil without a local keyword, etc.

bobbens commented 3 years ago

I see. That does seem fairly bad. So at the end of the day it's going to be some sort of glorified spellchecker or something like that?

UncombedCoconut commented 3 years ago

Update: We're down to 3975 Luacheck warnings. Luacheck has successfully found bugs, though it can't check types or arguments or even most names below package level.

Meta TODO:

Lint TODO:

UncombedCoconut commented 3 years ago

The checklist of fussy issues can actually be decoupled from the issue of enforcing Luacheck-clean commits, because there's a way to annotate exceptions to the rules: https://luacheck.readthedocs.io/en/stable/inline.html In some of these cases, a -- luacheck: globals directive may be the solution.

Adding formulaic comments as follows:

-- luacheck: globals cargo_land commchoices (from base mission neutral.commodity_run)
-- luacheck: globals enter land leave pay_text success_text (shared with derived missions flf_dvk01, flf_dvk04)
-- luacheck: globals enemies enter land (Hook functions passed by name)
-- luacheck: globals approach_ceo approach_kex (NPC functions passed by name)
-- luacheck: globals equip_generic (From "factions.equip.generic")
bobbens commented 2 years ago

This does seem like it's mainly solved, although some of the checklist stuff hasn't been done. Do we close and create new issue for remaining cases or just shove it under the table?

UncombedCoconut commented 2 years ago

Probably a mix -- I'm not really expecting every unchecked item to be worth doing. Are there remaining checklist items that make you think, "that sounds even worse than the luacheck: directives?" Let's sweep those under the rug and split out the rest. :)

bobbens commented 2 years ago

I'm guessing this is probably worth closing and maybe open issues for the remaining things if we feel motivated to do them.

bobbens commented 2 years ago

I'll close and let #2149 supersede this.