lunarmodules / busted

Elegant Lua unit testing.
https://lunarmodules.github.io/busted/
MIT License
1.39k stars 184 forks source link

Busted's test isolation leaks previously loaded modules into test scope #643

Open alerque opened 3 years ago

alerque commented 3 years ago

See also: https://github.com/lunarmodules/Penlight/issues/363

I recently had the bright idea of using Busted as our test framework for the Penlight project. This seemed to go well at first, until I managed to do something that should have passed our local test specs but happened to break something unrelated to the current test. At this point Busted itself broke. Confused, I did something that should have broken the local test and it passed. Eventually I realized Busted was itself loading Penlight as a dependency and creating a race condition.

While the automatic --auto-isolate feature is enabled and the describe() function is clearing out the loaded module environment so that nothing is in scope, it is not always possible to get a clean module load inside the test. Instead when you load anything that happened to be loaded by Busted previously, you get the cached version of the module not a fresh load!

Of course I want Busted to depend on whatever stable release of Penlight in installed as its dependency, but I also want to be able to load Penlight modules from the local working directory inside test specs to test Penlight itself.

In order to visualize this, I put together a demo repository. There are two Lua modules it creates, a dummy pl.utils and a dummy pl.sip. A .busted file correctly configures the Lua path so that the current project should be tested. Cloning the repo and running busted should just work. The two tests it runs look like so:

https://github.com/alerque/lua-busted-isolation-test/blob/master/spec/isolate_spec.lua#L1-L22

Instead the second test using the dummy pl.sip works fine while trying to use the pl.utils fails because it actually passes the Penlight module previously loaded by Busted for its own internal use. Since Busted doesn't happen to use the real pl.sip module it is not loaded previously and hence loads from the correct place on the first go.

alerque commented 3 years ago

It looks like Busted's context() function is isolating the packages.loaded table from every other context except its own. It isn't clearing it to start with. I'm not sure if this is intentional or not. Some Penlight functionality is exposed inside the Busted runner (as are things from Luassert, etc.) but it doesn't seem like require()ing them should be hijacked like it is now.

I tried writing a --helper script that replaces require() with something that clears the relevant entry in the packages.loaded table before loading it, but that breaks Busted in other ways.

local _r = require
function require (v)
  if v:find("^pl%.") then
    package.loaded[v] = nil
  end
  return _r(v)
end
Tieske commented 3 years ago

insulation will revert the loaded modules to the ones that were loaded at the start of the test I think. So libs loaded during the test will be cleared for the next one. But libs loaded before the tests started will not be removed before a test starts.

The workaround would be to iterate over package.loaded and set every entry matching pl.* to nil. That assumes Busted has already loaded all its modules (otherwise if you call into some busted function during a test, like finally, and it does a require, it will load the PL module to be tested instead of the one Busted was installed with).

This exact behaviour saved us from the LuaJIT ffi segfaults, because we could load the ffi before the tests started and it would not be cleaned by busted anymore (GC'ing that module will lead to segfaults).

Tieske commented 3 years ago

apparently the workaround doesn't work then...

alerque commented 3 years ago

The workaround would be to iterate over package.loaded and set every entry matching pl.* to nil.

I've tried several ways of doing just this but haven't come up with a way to do it pervasively without side effects that break other Busted stuff.

Tieske commented 3 years ago

how about;

alerque commented 3 years ago

It might come down to doing something like that. I'm a bit puzzled why just clearing the loaded table for a module right before attempting to require it doesn't work to refresh the load, but perhaps doing so causes GC and references to the existing load gets lost.

alerque commented 3 years ago

A dodgy hack on the Penlight side that works to a point is to take advantage of the loader caching the pseudo name, not the actual resolved file path. If the Penlight tests require("lua.pl.utils") instead of require("pl.utils") they will get the right module. The problem is that many Penlight modules depend on and load each-other and eventually you get into a jam where the tests are using new code for directly included things but a miss-matched version of other things.

alerque commented 3 years ago

Another possibility is messing with the meta table for package.loaded and fiddling with __index(). I started to play with that but got distracted. Noting another idea for later: maybe use a require() wrapper just for Busted internally that loads normally, then moves the loaded module into a different alias before using it so the in-use references are not to the Penlight names in the loaded table but to aliased Busted versions.

alerque commented 3 years ago

I feel like I've fallen down a rabbit hole and lost track of which way is up. I'm trying to test whether loading a library in a test framework is compatible with using using the test framework to test the library. Even constructing an MWE to say "X arrangement works" or "doesn't work" is mind bending.

Tieske commented 3 years ago

I should have mentioned that each of the package.loaded tables should also have its own lua-path settings. Where the Busted path would not include the path to the local sources to be tested.

This will however impact busted own test suite

alerque commented 3 years ago

As an extra extenuating circumstance here—it seems like testing Busted itself is reliant on the same version of Busted being tested. Making some changes to the way Busted functions that should break the specs don't.

DorianGray commented 3 years ago

It sounds like we will need to rework busted to manage it's own dependencies separate from test dependencies...

Would a require override in busted that copies and replaces .loaded with a new table while saving the original for use inside busted solve this or am I misunderstanding?

alerque commented 3 years ago

@DorianGray Busted already keeps copies (state snapshots) of the .loaded environment for use in isolating tests from each other. What it doesn't do is isolate tests from itself or itself from tests. This is only an issue for a limited number of libraries that happen to both use Busted and be used by Busted.

From my tinkering with this thus far I think Tueske's comment here pretty much map out what needs to happen. There might be an easier way to get the same result by physically replacing require() calls in Busted source code with an internal version that wraps it and uses alternative .loaded tables.

DorianGray commented 3 years ago

It sounds like what I said is exactly what you said which is exactly what tieske said, haha. Sounds like a pretty simple fix then, we'd just need to use "busted_require" everywhere in busted... :/ I don't like the idea of using the debug library like that.

alerque commented 3 years ago

I've spent a couple sessions hacking on this and haven't gotten it working right yet, but am probably closer than when I started by virtue of having learned from false starts. What is your projected release cycle looking like if I were to get a working PR together in the next few days?

alerque commented 3 years ago

I (tentatively) think we only need to use a busted_require() to load external libraries, not busted internals. I think it's okay to have busted's own code in the normal loaded tables. It would be cooler if it wasn't and it was basically completely invisible, but implementing it that way is going to be a huge pain. There are multiple ways in and any wrapper around require() would need to be defined from scratch in several places in order to catch all the possible ways of getting started. At first blush I think all of at leaste busted.lua, busted/init.lua, busted/core.lua, and maybe bun/busted would need their own definitions to bootstrap the process.

At least until I figure out a reason it doesn't make sense I'm going to work on just isolating the 3rd party libraries, not busted from itself. The case of testing busted with itself will still be dicey but at least it can be used to test other libraries that it happens to depend on.

DorianGray commented 3 years ago

I just pinged @ajacksified about getting a release together with me. Once a couple of outstanding issues are out(this one included) we'll do a release of say, luassert, and busted together.

I think this fix is worthy of a release =) Thank you so much for working on it.