Hammerspoon / hammerspoon

Staggeringly powerful macOS desktop automation with Lua
http://www.hammerspoon.org
MIT License
11.99k stars 582 forks source link

LuaSkin and throwing a lua error #698

Closed asmagill closed 7 years ago

asmagill commented 8 years ago

Please someone tell me I'm wrong, but I think we're going to have to rethink/refactor a large chunk of LuaSkin. I know it's a little long, but please bear with me...

Lua's error functions (any of luaL_check*, luaL_error, luaL_argerror, and lua_error) do not return. In the context of the console or when running any lua code in the Lua state, this isn't a problem, as the whole environment is written around knowing this.

But we also use LuaSkin in Objective-C delegates, Objective-C blocks, etc.

To give an example of what I am fearing this means, imagine the delegate for webview... the user clicks on a link, and for whatever reason it fails... this triggers a call to the WKWebView delegate for the webView:didFailNavigation:withError: message, which, if defined, calls a lua callback function. Since I wrapped that in protectedCallAndTraceback:nresults: an error in the callback doesn't matter, but what does matter is that the delegate method looks at the results returned by that callback, and if it is a string, assumes it's HTML and displays it as the error page...

Actually, this is a bad example because I don't do what I'm about to describe, but it was the first delegate I could think of in enough detail without pulling up the source code...

If I were to verify that this was a string with luaL_checktype(L, -1, LUA_TSTRING) or if for example, I was expecting a rect table instead of a string and used [skin tableToRectAtIndex:] and the table was malformed, luaL_check* functions in LuaSkin would, if I understand how this works correctly, never return to the delegate method so whatever else the delegate has to do to cleanup after itself wouldn't happen... I don't know if this will lead to a crash, necessarily, but I fear it could in some cases.

Sometimes in our C code, we call a lua c-function, which currently LuaSkin can't wrap, so if I don't use lua_pcall for these (and we don't in a couple of places I came across recently) then an error thrown by [skin checkArgs:] could also cause this...

Basically, if I'm understanding the problem correctly, any use of a lua error function in LuaSkin is potentially vulnerable to causing this kind of situation.

Please tell me I'm missing something or mis-understanding something.

Modifying the tableTo* methods and toNSObjectAtIndex: and luaObjectAtIndex:toClass: methods to silently return zeroed structures or NSNull is one option to fix those, but I like that they throw errors when used within the console or in the evaluation/execution of lua files.

Another option is to add a flag to every LuaSkin method that indicates whether terminating errors can be thrown or not.

It still doesn't address checkArgs: or other places I may be not thinking of right now...

And I'd really rather not have to have two separate methods for everything, one which throws errors and one that doesn't... use the wrong one in the wrong place and it will work most of the time (when data is correct or at least coercible) and be a real pain to debug.

Am I worrying about nothing? Or missing a simpler fix?

cmsj commented 8 years ago

Part if the point of checkArgs was to enable us to slowly move away from the module functions having a spray of luaL_check calls in the middle of their work. Get the data validated and fail immediately.

I think the same sort of thing applies here. You're right that the _check calls would not return into a block/callback. In the case of type checking, lua_type() is generally fine (userdata metadata typing is about the only thing it can't do afaik), it's switch()able and lua_to...() exist.

Perhaps checkArgs should have a sibling checkStack that verifies types from the top of the stack down, taking care to return an error instead of longjmp()ing into an error message?

asmagill commented 8 years ago

Thinking on it some more, I will clarify that if you push a c-function with lua_pushcfunction, then it can be wrapped with the LuaSkin protectCall method.

So I guess the approach I'm favoring at the moment is as follows (not necessarily in order of importance):

  1. look at modules for use of lua_call and calling lua c-functions directly (i.e. without using lua_pushcfunction), generally anywhere, but most importantly in portions of the code which are invoked by the operating system directly (delegates, etc.). Using a bare lua_call isn't necessarily wrong, but should only be used when you can be absolutely positive that it will never be executed by code that isn't wrapped either by the Lua interpreter itself or an outer lua_pcall (e.g. code entered into the Hammerspoon console).
  2. update all conversion functions within LuaSkin to either return a zeroed structure or nil when the data is malformed rather than use an error function.
  3. verify that the same practice as point 2 is used in all new conversion functions added to LuaSkin via the register* methods of LuaSkin... this will have to be an ongoing "best practices" thing, since these are not a static part of LuaSkin but are adjusted/added by modules as people use/need them.

Item 2 means moving some of the data validation back to the module/code using the conversion functions which is why I favored putting it into LuaSkin to begin with, but it will make LuaSkin significantly safer to use when invoked directly by the OS. It also means that a poorly written delegate that doesn't check to see if a result is nil/Null might trigger an exception rather than print an error to the Hammerspoon console... but at least the exception will be in a specific place in a specific module right where it happened, rather than perhaps crashing somewhere potentially unpredictable because a delegate didn't finish or return when expected and something got left in an unpredictable state.

Should a warning at least be printed to the console? I go back and forth on this...

Have I missed anything?

cmsj commented 8 years ago

+1 for items 2 and 3. I think 1 is going to be something of a case-by-case discussion, but yeah, we should look over those parts of the code and check for dangerous calls.

I think a console warning probably is appropriate - we're informing a user that some of their code is returning junk in a place where it is not supposed to do that. If they don't care, they can ignore the warning ;)

cmsj commented 8 years ago

(btw, thanks for this issue, it hadn't occurred to me at all, but it's an important step in the directions of quality and stability. I hope we can live up to it in future code reviews!)

asmagill commented 8 years ago

Ok, I'm currently making some progress with an axuielement query and browse tool that I've wanted for a while, and I don't want to lose my inertia so to speak, but I'll start tackling this later in the week since the biggest areas of change are in the conversion methods and I originally added most of that code.

I may also take a look at checkArgs while I'm at it; on the surface, I don't think we need LS_TNONE at all (it's not really a "parameter" -- LUA_TNONE is a way of testing if your index is out of bounds (i.e. a parameter is not at that index), and we allow that with LS_TOPTIONAL and delineate the absolute max with LS_TBREAK, so I don't think the parallel LS_TNONE is actually usable with checkArgs in any sense, but I want to think about it some more. And I have some ideas on tweaking the userdata checks that might let us be a little more flexible and specify multiple types or even no specific type at all without falling back on LS_TANY.

asmagill commented 8 years ago

Re: LS_TNONE, I'm tired of mistakenly doing checkArgs:LS_TNONE, LS_TBREAK, having it compile, but then refuse to work because it expects 1 parameter of type none and having to recompile for such a stupid mistake 😁 If there is a specific use case you can think of that I'm missing, let me know and I'll make sure it works with whatever changes I do end up proposing.

asmagill commented 8 years ago

Starting work on this... some thoughts in case anyone has ideas or comments about things I've missed or better solutions/approaches...

  1. The purpose of checkArgs: is to validate parameters being passed into to a c-function. As such, a function which uses this method is expected to be invoked through lua_call(k) or lua_pcall(k) either explicitly or from the lua interpreter/state/whatever-terminology-is-more-correct. So, it is my suggestion that this continue to use the _error methods and decide that "proper Hammerspoon coding practices" requires invoking all c-functions within a delegate/call-block either through LuaSkin's protectedCallAndTraceback or lua_pcall. Failure to do so should mean holding off on adding a module to the core release until it is addressed -- you can ignore the error in your delegate/call-block/etc. if you choose, but at least the delegate, etc. will complete cleanly rather than be interrupted. This does imply the need for a review of the existing modules for correctness, something I'll start while making the necessary changes to other areas of LuaSkin.
  2. The registered functions for registerPushNSHelper:forClass: and registerLuaObjectHelper:toClass: must be unique for a given objective-c class name. For this reason, they throw an error if the specified class already has a registered function. Since the registration is expected to only occur within the luaopen_* function when a module loads the first time (and only the first time -- subsequent require's do not re-run this function), this should generally be fine as well...
  3. all other error throwing in LuaSkin should be removed and should return nil or a zeroed structure, whichever is more appropriate.

This is more or less my intended approach to the cleanup.

I do see one possible gotcha with point 2 because I use require (or more accurately LuaSkin's requireModule) in a couple of places to push the "table" for a module onto the stack and then select a function to call from that table, rather than expect to find the module at a specific path off of global (e.g. _G.hs.module.function)... This could potentially be an issue if the module has truly never been loaded before, but with the regular practice of loading all required modules at least once in the parent init.lua, while triggering this issue is possible, it is highly unlikely.

I'm still mulling over the implications of possible changes to requireModule (e.g. silently ignore (well, complain to the console but otherwise ignore) attempts to multiply register functions for a class -- this is only one reason a require could fail, though; push nil or an empty table, if the "require" fails... this would push back the need to verify the results to the caller of requireModule so it could handle the error as it sees fit).

In the end, we can't completely erase the possibility of bad behavior, we can just protect against most of it and insist on best-practices, at least in core, to (hopefully) make this particular type of problem one you have to go out of your way to create, rather than an easy unexpected consequence.

asmagill commented 8 years ago

FYI, I'd also like to lay the groundwork for getting rid of hammerspoon.h completely, so I'm adding methods to LuaSkin for printing to the console and invoking showError.

The only other things still defined in hammerspoon.h are:

 extern lua_State* MJGetActiveLuaState();
 #define lua_to_nsstring(L, idx) [NSString stringWithUTF8String:luaL_checkstring(L, idx)]

The lua_to_nsstring can actually be replaced already, and my reasons for initially wanting MJGetActiveLuaState have gone away... does anyone else see a reason for it? At any rate, there are a number of places where the existing functions are still used, so I won't remove it yet, but I want to be able to at some point in the future...

cmsj commented 8 years ago

@Asmagill I'd like to keep Hammerspoon specific stuff out of LuaSkin, which I think we've managed to do pretty well so far.

I do agree that hammerspoon.h is horrible though. My plan was to eventually get around to a Hammerspoon.framework for extensions to get an API bundle

cmsj commented 8 years ago

Sorry, I didn't address your actual questions. I'm happy to kill the other two things, both of which LuaSkin covers nicely

asmagill commented 8 years ago

Well, right now that leaves CLSLog, printToConsole and showError...

And I'm adding what could be a simple replacement for printToConsole as a replacement for throwing errors in LuaSkin.

If you're serious about creating the Hammerspoon skin (the idea has been around for a while 😀 ), then I'll hold off on the rest, and we can maybe move the print stuff out and into the Hammerspoon skin later (or duplicate the code, whichever ends up being cleaner).

asmagill commented 8 years ago

(Hammerspoon framework, not skin... sorry!)

cmsj commented 8 years ago

@asmagill it does seem a bit silly to have a framework for a #define and a couple of functions. Perhaps the answer is to slightly change the names so they are app-agnostic and then have them be implemented as a delegate protocol in LuaSkin, so the actual code would still live somewhere on Hammerspoon?

cmsj commented 8 years ago

So maybe CLS_NSLOG becomes logDebug, printToCondole becomes logWarn and showError becomes logError. IMHO those are pretty much what they mean anyway, and it hands off the decision of what to do with those things, to somewhere else

cmsj commented 8 years ago

Then MJLua.m can add itself as the delegate for LuaSkin and carry the actual implementations.

asmagill commented 8 years ago

FWIW, these are the proposed print additions (I'd likely want something similar for showError)

- (void)printNSStringToConsole:(NSString *)theMessage ;
- (void)printNSStringToConsole:(NSString *)theMessage fromLevel:(int)theLevel ;
- (void)printStringToConsole:(const char *)theMessage ;
- (void)printStringToConsole:(const char *)theMessage fromLevel:(int)theLevel ;
- (void)printToConsoleAtIndex:(int)idx ;

The fromLevel versions use luaL_where to attempt to get a location to print with the message. The String, NSString, and AtIndex versions are just to simplify usage to what is easiest wherever they are needed.

asmagill commented 8 years ago

The renaming works for me... but see my (made before seeing your delegate method name suggestions) previous post for the versions of each that might make sense.

cmsj commented 8 years ago

I don't mind if there are multiple variants, so your example could be something like:

- (void)logWarnWithNSString:(NSString *)theMessage;
- (void)logWarnWithCString:(const char *)theMessage;
etc.

I would not hate that at all, I always half-forget that the current functions take C strings anyway.

asmagill commented 8 years ago

Yeah, that's bit me a couple of times as well, especially with the register methods... at first I thought it would be simpler since Lua deals in C-strings, but when using obj-c methods, I get on a role and...

And I like sticking the C in there instead of just leaving out the NS... makes it clearer.

Ok, for iteration 1, I'm going to stick with the methods as part of LuaSkin so I can at least start my local testing today... I've never added delegate support to a class before, so I'm going to have to do some reading tonight; but I do agree your ideas are a better solution overall.

cmsj commented 8 years ago

:)

delegates are pretty easy - conceptually you're just storing a reference to an object inside LuaSkin instances, that it will make method calls on, so LuaSkin's logFooBarBaz: would just contain [self.delegate logFoobarBaz:theMessage]. It can be dressed up in fancy clothes with a formal @protocol, which is probably worth doing, but really it's just forwarding methods to a different object than the one they are called on.

asmagill commented 8 years ago

Another thought while I working on this...

[LuaSkin toNSObjectAtIndex:i] looks at a lua string and determines if it can be represented as a UTF8 string or not... if so, it returns an NSString; otherwise it returns an NSData to preserve the lua string contents.

This makes sense in things like hs.settings which can seamlessly handle both types transparently but I have notice it cause some problems with other things that can't handle NSData if the return type isn't first checked with [obj isKindOfClass:[NSString class]].

Do we want to add a toNSStringAtIndex: which will only return an NSString? Should it return an empty string when it's not valid UTF8, or run it through hs.cleanUTF8forConsole and return as close a representation as possible with invalid UTF8 sequences converted to the Unicode Invalid character?

It would be easy enough to add along with these changes.

cmsj commented 8 years ago

I almost wonder if we want a toNSObjectWithOptions, but I'm fine with toNSString and I vote for it filtering out invalid Unicode.

asmagill commented 8 years ago

Off the top of my head (I'm heading to sleep shortly, but I'll look closer tomorrow), I can think of 2 options to toNSObjectAtIndex: string only (as proposed here), and allow-self-referential tables

For pushNSObject, I think the only option would be whether unsigned long long should preserve the "number" (i.e. treat as a lua number) or preserve bits (treat as a lua integer, ignoring the unsignedness)

Am I missing anything?

I'm fine with adding "withOptions:" and defining some bitflags which can be or'd together... if we come up with many more variants, it's about the only way that doesn't lead to method explosion.

I think you and I are the primary users of these methods, so the re-coding ramifications should be pretty small/tractable, and it will easily scale without breakage later if, as suggested above, we come up with more options in the future.

And when you say "filter out", do you mean remove completely, or convert invalid sequences into the valid sequence for the Unicode Invalid character? I prefer the latter myself -- it makes it obvious where I'm either truncating a string too early or that I have binary data in the string.

cmsj commented 8 years ago

I meant the latter. A lossy conversion to Unicode

asmagill commented 8 years ago

While talking LuaSkin changes and updates, objections to me adding LS_TINTEGER to the types checkArgs validates? Or should it be an option like LS_TOPTIONAL (e.g. use LS_TNUMBER | LS_TINTEGER, instead of just LS_TINTEGER)?

asmagill commented 8 years ago

Also, since I'm in the guts of LuaSkin and making tweaks/changes:

Proposed changes/updates to LuaSkin and conversion for standard data types (notes, some of the default behaviors described below is not what is currently done, but what I am proposing to change things to):

Options may be combined by logically or'ing them together

I'm a little uncertain if the default of preserving the "number" over the bits is a better default or not... it seems more in the spirit of "as accurate a copy as possible", but in truth, from a coding perspective, bits are usually more important than the numerical value when using unsigned long long - I see this type a lot in bit flags in the OS X SDKs, but that's more my domain than numerical analysis and statistics... thoughts?

  • - (id)toNSObjectAtIndex:(int)idx
  • - (id)toNSObjectAtIndex:(int)idx withOptions:(unsigned int)options
  • LS_NSPreserveLuaStringExactly - By default, toNSObjectAtIndex: will run a string which contains invalid UTF8 byte sequences through hs.cleanUTF8forConsole to insure it is treatable as an NString. Use this option to have Lua strings with invalid UTF8 byte sequences converted to NSData instead (strings that are fully UTF8 compliant will still return NSString)
  • LS_NSLuaStringAsDataOnly- Use this option to cause all Lua strings to be converted to NSData objects, skipping NSString completely
  • LS_NSAllowsSelfReference - By default, tables which contain a self-reference (i.e. a table key or value of the table which contains it) are not allowed (they break things like NSUserDefaults, NSJson, etc.) and will return nil for the entire conversion (no matter how nested the self-reference is). This option allows self-reference within the resulting NSArray or NSDictionary instead.

These proposed changes are a little different than what is currently done, specifically:

Any other options anyone can think of that I'm missing/should be added?

cmsj commented 8 years ago

Huge +1 for AsDataOnly. Can't see anything missing, thanks for doing this :)

asmagill commented 8 years ago

Maybe this is the better place... @cmsj, @lowne (and others, I just know you two have an interest in this), I need some input...

First the question above about the default (i.e. no options specified) for converting an NSNumber for an unsigned long long to lua... should the "numberness" (magnitude?) of the number be the default or should the bits be of more import... it affects whether we convert a large (i.e. the value is > 0x7fffffffffffffff) unsigned long long to a lua_Number (double) or a lua_Integer (signed long long)... I forget why I originally chose the "numberness", but as a programmer, I'm now thinking the bits are more important; since this will differ from the master branch, I wanted to solicit input.

Second, I added this to the conversation of the WIP pull request, but in a nutshell, should the default (i.e. no options specified) for an unrecognized type be an error and return the equivalent of nil, or should it be silently ignored? see #705 for a more complete discussion of this item.

Remember, an error in LuaSkin won't actually stop processing... the question here is whether a bad value in a table causes the entire conversion to return nil or be silently skipped over, thus no longer being a "true as possible" representation of the data (indices would change in arrays for example... though if I change some of the logic, it should be possible to switch it from an array to a keyed-table/dictionary (for lua->NSObject)... slightly slower for arrays, though, since the safest way to do this is assume we're dealing with a dictionary from the beginning and only convert it to an NSArray if we can validate that it's non-sparse with integer only indices starting at 1).

cmsj commented 8 years ago

I haven't given it a huge amount of thought, but I think I'd expect it to err on the side of returning a larger type than missing an element or failing. We're never going to get a 100% mapping between ObjC types and Lua types, but I don't think there's a particularly huge disaster if a valid number is represented in a lua_Number vs a lua_Integer?

cmsj commented 7 years ago

So, to bring this issue back to its original topic - Lua errors. I've come across something very interesting in the LuaCocoa source - they patch Lua ever-so-slightly so that ObjC exceptions thrown inside a pcall context, will be caught and handled (specifically, an ObjC backtrace is printed).

This might do some interesting things for our crash reports, where someone is passing some broken data in a callback, and causing Hammerspoon to crash because that's what we do with ObjC exceptions.

I've ported the patch over and pushed it up as #1094 - thoughts welcome (particularly @asmagill and @lowne :)

cmsj commented 7 years ago

I'm going to close this because I'm gardening the bugs and this has been open for ages with no further discussion, feel free to re-open it if there are specific things you want to do :)