GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
442 stars 63 forks source link

Position tracking by global variable isn't robust #2128

Open sauclovian-g opened 1 month ago

sauclovian-g commented 1 month ago

The "read-only" part of the large state monad TopLevel in Value.hs contains a roPosition member that's supposed to hold the "current position" while executing saw-script code and maybe also at other times.

This is inadequate (even for runtime errors) because in general there are multiple objects with different positions; for example, currently in llvm_points_to there are number of possible errors, some of which are problems with either of the two arguments and should really be reported with their positions, and some of which are problems with the call and should be reported against its position. (Plus if you use llvm_points_to twice on the same pointer, that's an error and it would be nice to report the locations of both uses.)

It's also problematic because it's difficult to know what position is in the global at any given time. It's ostensibly read-only, and it's in the read-only part set up with ReaderT, but all that means is that the position you have at any given point can't be changed by things you call; it can be assigned by anything that calls you. Meanwhile the scope of this monad is nearly all of SAW, so it really is a global variable and it's as difficult to reason about as any other global variable.

Meanwhile, it's also broken in that it's assigned in assorted places in Value.hs and Interpreter.hs, but nowhere near enough of them to actually reflect the current execution position. It's possible that if the interpreter logic were neatly contained within one module, it would be feasible to make sure it gets assigned specifically before any call that leaves the interpreter. However, right now that's not realistic.

On the flip side, because it reaches almost all of SAW it's also used everywhere, and just trying to remove it breaks in all directions. Furthermore, there are lots and lots of error prints that implicitly use it by throwing exceptions that get it added (which is even less robust than fetching it out of the global directly, since the value where the exception's caught and rethrown might not be the same as the one where it was thrown and that might or might not be correct, and even more problematic because that way you really don't know what position you're using) and these should be sorted out as well.

My current inclination is that we should go ahead and rip out the global position and pass positions around explicitly, even though this is bound to be somewhat messy. It will have to be done in multiple stages (one being updating the builtins table so it's possible to pass builtins their invocation position, and ideally also positions for their arguments), and it's even possible that in the long run we'll end up putting it back once things are structured enough to be able to use it reliably. However, for the time being I don't think there's a great deal of choice.

I am thinking though that it might be better to put this on hold until we systematize the error reporting functions, because that will at least move us to explicit positions in places where they're needed and then we know for real where positions need to propagate to.

(Note: there's also a call stack in there with the position, and this should probably get handled the same way; the call stack is part of the position reporting for runtime errors. The way it's updated is even more rickety and I'm kind of surprised it works at all...)