dtr-org / unit-e

A digital currency for a new era of decentralized trust
https://unit-e.io
MIT License
45 stars 15 forks source link

Allow unit tests being run in parallel again #870

Closed scravy closed 5 years ago

scravy commented 5 years ago

Currently unit tests can not be run in parallel. This prevents us from speeding up ci builds (see https://github.com/bitcoin/bitcoin/pull/12926, https://github.com/dtr-org/unit-e/pull/865) and won't work with the updated build definitions from bitcoin 0.17 (#860). This is a regression which was introduced in https://github.com/dtr-org/unit-e/pull/793. This pull request fixes https://github.com/dtr-org/unit-e/issues/861.

The problem is that the StateDB component does something actively when being initialized (not something a component should do), which is to access disk and so on, which can fail. Since there is a UnitEInjector for every BasicTestingSetup running the unit tests in parallel creates a database in the same directory per unit tests suite which clashes and breaks and throws and aborts and dies. This can be prevented by using an in-memory database, but the problem shifts: How to get that configuration in?

I did not want to expose this as a user-definable setting and also not as a chain parameter, so these two configuration options are not available. Which is why I created UnitEInjectorConfiguration. This is something which should not be necessary as usually you would not expose the same module as in production in unit tests but use a subset or just the mocked version, but since the rest of bitcoin is not in well-defined components and accesses them using GetComponent a global injector has to be available in some tests.

The change ultimately does not affect the prod version of things, but the tests, which inject a different UnitEInjectorConfiguration. I made this a struct rather then just a flag unit_tests such that it is extensible in case we have other cases like this. In order to make the injector be able to access fields from it's own instance in the definition of a component I altered the definition of UNMANAGED_COMPONENT a bit.

I would like to point out that these initialization issues would also have happened with an Application class (#723), as that is exactly what UnitEInjector is, just you would inject things manually.

Gnappuraz commented 5 years ago

Concept ACK

frolosofsky commented 5 years ago

The problem is that the StateDB component does something actively when being initialized (not something a component should do)

Do you suggest two-step initialization? I'd love to hear your points why components shouldn't fully initialize themselves during initialization. Elaborating on this topic will help us follow best practices during further component development.

scravy commented 5 years ago

Do you suggest two-step initialization? I'd love to hear your points why components shouldn't fully initialize themselves during initialization. Elaborating on this topic will help us follow best practices during further component development.

I have to admit I am a bit unsure about it myself in this particular case, which is also why I did not do any further changes. But to elaborate on this a bit / brainstorm: In case of the proposer for instance, which is a kind of component which I would call an "active" component (it does thing's on its own, namely proposing), I like to start it explicitly after the whole program is initialized. The reason being: There is, let's say, an RPC component which depends on the proposer but not the other way around. The proposer is initialized. Only then the RPC component is initialized. So it's conceivable that the proposer starts while it can not be controlled though RPC yet – because the system it not completely up yet. I would like the whole system to be in a runnable state but not running - which is how I have seen and written a lot of software successfully. Like (1) Step one initialize (configuration and everything) (2) Step two: Actually start the thing. In code something like: App app; app.start().

In case of the database I am unsure because checking that we can write to a database and everything is part of bringing the system into a runnable state, or put it another way: Start() which initializes the database does not make too much sense.

So while my initial thought was "it should not do this" I think I have come to terms with it and think that it's right the way it does it. Just if if was doing something actively, start outbound connections, start proposing, these kinds of things, I'd want it to be behind a Start() method. The injector already has this capability of stopping components which do have a Stop() method and same logic could be applied for Start(), but so far we do not have a terrible lot of use cases for this.

scravy commented 5 years ago

Just throwing this out there: maybe, instead of adding a factory function for each component, even though we only need it for InjectorConfiguration, we could overload Unmanaged::Init so that is accepts both a pointer and a factory function?

That would be very nice I guess. Although I am personally not too big a fan of overloading functions I think this would be a sweet addition. I would nevertheless follow up on this in a separate pull request.