tlaplus / tlaplus

TLC is a model checker for specifications written in TLA+. The TLA+Toolbox is an IDE for TLA+.
https://lamport.azurewebsites.net/tla/tla.html
MIT License
2.26k stars 189 forks source link

Proposed refactoring goal: remove all static global variables #891

Open ahelwer opened 3 months ago

ahelwer commented 3 months ago

The tools use static global variables in quite a few places. I've personally encountered them both in the very front end of TLC when reading & setting command line parameter values for the model check run, and way down in the core of SANY which requires some static initialization.

I hope I don't have to write too many words on why static global variables are bad to form a consensus on why we should want to remove them, but just as a concrete example it recently took me three days of painstaking debugging to figure out I had to call these two functions in order to initialize the SANY parser so I could call it directly and have parsing succeed. That was a lot of energy that could have been better directed toward other things, and I anticipate having to expend similar quantities of time & energy to do testing or development on other parts of the codebase. It is simply much, much easier to write tests for things when you can just call a function and don't have to perform some weird incantation first, and don't have to worry about resetting some global state in between tests. For example, it is not possible to run TLC twice within the same JUnit test. At one point in 2020 I hacked together some Java classloader code to unload then load the tlatools jar on request to reset the global state, but it barely worked.

We can broadly split this problem into two categories with some overlap:

  1. Static global variables that are used as invisible parameters affecting the behavior of functions far down the stack. These should be extracted & encapsulated into context objects that are propagated down to whatever function consumes them, and the lifecycle of the context object should be managed by the top-level calling function.
  2. Static initialization of datastructures requiring you to call several other "incantation" functions before a target function will work as expected. The code in these incantation functions should be moved into static blocks that are automatically executed once when the tlatools jar is loaded, or moved into a constructor which must be called before the target function can be accessed.

I understand there is not a great appetite for refactoring in and of itself here, but this particular type of refactoring I believe should be supported and prioritized because of its multiplicative effect on the ease of all other codebase development. What does everybody think?

lemmy commented 3 months ago

Everyone wants the codebase to be of better quality, especially because, as you've observed, too many parts are tightly coupled. Unfortunately, this is also the reason why seemingly simple refactorings tend to break seemingly independent parts in subtle ways. To make things worse, this is typically discovered too late because the tools have various parts with zero or very low test coverage. In other words, how can we guarantee that this refactoring or, for example, https://github.com/tlaplus/tlaplus/pull/756, do not introduce regressions? I believe one step is to deprecate and remove lesser-used or unused functionality. In the meantime, how about adding comments to the existing codebase or starting a developer FAQ so that the next person doesn't spend three days figuring out how to properly initialize SANY?

PS: Rerunning the tools in the same VM can be accomplished by isolating classloaders, as in https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/test/util/IsolatedTestCaseRunner.java, or by nesting an OSGi runtime in tla2tools.jar (https://github.com/tlaplus/tlaplus/tree/master/tlatools/org.lamport.tlatools.api). On top of that, however, we also have to guarantee that the tools do not falsely share on-disk resources (e.g., https://github.com/tlaplus/tlaplus/issues/378).

lemmy commented 3 months ago

Addendum: Leslie Lamport and Simon Zambrowski attempted a similar refactoring a decade ago and gave up because it turned into a time sink.

ahelwer commented 3 months ago

Certainly any such refactoring would be incremental and probably take several years to complete across many PRs.

Calvin-L commented 3 months ago

I am very much in favor.

See also: our contributions wishlist already mentions removing global variables. But, I'm glad you opened this since it will be easier to comment on and reference from other issues. :)