Lifecycle States
Expected Behavior
The lookup table system needs more lifecycle states to allow better management of tables, caches and adapters:
It should be possible to disable lookup tables, caches and adapters so they don't consume resources when they are not in use.
Incomplete configuration of a lookup table, cache or adapter might also be represented by a separate state, or might reuse the disabled state.
When data adapters or caches fail to start, this needs to be reflected by a lifecycle state (see #4522).
(more?)
This needs some more thinking and discussion to make sure we cover all needed lifecycle states.
We should also think about implementing a more generic lifecycle system that can be reused in other systems as well, to avoid creating new solutions for the same problem over and over again.
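To make the discussion more concrete, here is a minimal sketch of what a reusable lifecycle could look like. Everything in it is an assumption for illustration: the state names, the allowed transitions and the `Lifecycle` class do not exist in Graylog, and the final set of states still needs to be agreed on.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical lifecycle states derived from the wishes listed above. */
enum LifecycleState {
    CREATED,     // configured, not started yet
    INCOMPLETE,  // configuration is missing required values
    DISABLED,    // switched off on purpose, should not consume resources
    STARTING,
    RUNNING,
    FAILED,      // a start attempt ended with an error
    STOPPED
}

/** Small state holder that could be shared by tables, caches and adapters. */
final class Lifecycle {
    private static final Map<LifecycleState, Set<LifecycleState>> ALLOWED =
            new EnumMap<>(LifecycleState.class);

    private static void allow(LifecycleState from, LifecycleState first, LifecycleState... rest) {
        ALLOWED.put(from, EnumSet.of(first, rest));
    }

    static {
        // Assumed transition table; adjust once the states are settled.
        allow(LifecycleState.CREATED, LifecycleState.INCOMPLETE, LifecycleState.DISABLED, LifecycleState.STARTING);
        allow(LifecycleState.INCOMPLETE, LifecycleState.CREATED, LifecycleState.DISABLED);
        allow(LifecycleState.DISABLED, LifecycleState.CREATED);
        allow(LifecycleState.STARTING, LifecycleState.RUNNING, LifecycleState.FAILED);
        allow(LifecycleState.RUNNING, LifecycleState.STOPPED, LifecycleState.FAILED);
        allow(LifecycleState.FAILED, LifecycleState.STARTING, LifecycleState.DISABLED, LifecycleState.STOPPED);
        allow(LifecycleState.STOPPED, LifecycleState.STARTING, LifecycleState.DISABLED);
    }

    private LifecycleState state = LifecycleState.CREATED;

    synchronized LifecycleState state() {
        return state;
    }

    /** Moves to the next state if the transition is allowed; returns false otherwise. */
    synchronized boolean transitionTo(LifecycleState next) {
        Set<LifecycleState> targets = ALLOWED.get(state);
        if (targets == null || !targets.contains(next)) {
            return false;
        }
        state = next;
        return true;
    }
}
```

Keeping the transition table in one place would let other subsystems reuse the same holder with their own tables, which is the kind of reuse mentioned above.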
Current Behavior
For the threatintel plugin in 2.4 we needed some way to disable lookup data adapters to make sure the adapters don't consume resources or make remote requests by default. To avoid any more server core changes, we modified the affected data adapters to throw an exception when the adapter should be disabled. This is why we see exceptions like this when Graylog is starting with disabled threatintel data adapters:
2018-01-24T22:34:57.229Z ERROR [LookupDataAdapter] Couldn't start data adapter <tor-exit-node/5a342bdf2c1e3e4f8a4fd826/@7da63b87>
org.graylog.plugins.threatintel.tools.AdapterDisabledException: TOR service is disabled, not starting TOR exit addresses adapter. To enable it please go to System / Configurations.
at org.graylog.plugins.threatintel.adapters.tor.TorExitNodeDataAdapter.doStart(TorExitNodeDataAdapter.java:73) ~[?:?]
at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:122) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
We knew that we would need something better than this in the future, but for 2.4 we decided to do it this way.
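As a contrast to the exception-based workaround, the following sketch shows one way an adapter could report "disabled" as a regular outcome instead of an error. It is only an illustration under assumptions: `SketchAdapter`, `StartResult` and the enabled check are hypothetical names and are not part of `LookupDataAdapter` or the threatintel plugin.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical outcome of a start attempt. */
enum StartResult { STARTED, DISABLED, FAILED }

/** Hypothetical adapter base class; not the real LookupDataAdapter. */
abstract class SketchAdapter {
    private final BooleanSupplier enabled;

    SketchAdapter(BooleanSupplier enabled) {
        this.enabled = enabled;
    }

    /** Adapter-specific startup work, for example downloading a list. */
    protected abstract void doStart() throws Exception;

    final StartResult start() {
        if (!enabled.getAsBoolean()) {
            // Disabled is a normal outcome: doStart() is never called,
            // so no remote request is made and no stack trace is logged.
            return StartResult.DISABLED;
        }
        try {
            doStart();
            return StartResult.STARTED;
        } catch (Exception e) {
            // Only real startup problems are reported as failures.
            return StartResult.FAILED;
        }
    }
}
```

With something along these lines, the caller could log a disabled adapter at a lower level and reserve ERROR for adapters that actually failed.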
Lifecycle Dependencies
Expected Behavior
When a data adapter or cache blocks during startup, it shouldn't prevent all other lookup tables from starting. The failing data adapter or cache should only affect those lookup tables that use them.
Current Behavior
In https://github.com/Graylog2/graylog2-server/issues/4748 a single data adapter was blocking because it was downloading a very large CSV file from an HTTP server. This prevented all lookup tables from starting.
The problem is that LookupTableService#startUp() only starts the lookup tables once all caches and data adapters have either started successfully or failed to start. When a data adapter or cache is blocking, no lookup table gets started until the adapter setup unblocks.
https://github.com/Graylog2/graylog2-server/blob/a61d597c837d8a58581ce75e4d5b1f1cf70b74a3/graylog2-server/src/main/java/org/graylog2/lookup/LookupTableService.java#L113-L128
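One possible direction, sketched below, is to let each lookup table wait only for its own adapter and cache instead of for all of them. This is not how LookupTableService works today; `TableStartSketch`, `Table` and the future maps are hypothetical, and the sketch assumes every adapter and cache start is exposed as a `CompletableFuture` and that every referenced name is present in the maps.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

final class TableStartSketch {
    /** Hypothetical description of a lookup table and the names of its dependencies. */
    static final class Table {
        final String name;
        final String adapter;
        final String cache;

        Table(String name, String adapter, String cache) {
            this.name = name;
            this.adapter = adapter;
            this.cache = cache;
        }
    }

    /**
     * Returns one future per table. A table's future completes with true once
     * its own adapter and cache have started, and with false if either failed.
     * A blocked adapter only delays the tables that reference it.
     */
    static Map<String, CompletableFuture<Boolean>> startTables(
            List<Table> tables,
            Map<String, CompletableFuture<Void>> adapterStarts,
            Map<String, CompletableFuture<Void>> cacheStarts) {
        Map<String, CompletableFuture<Boolean>> result = new LinkedHashMap<>();
        for (Table table : tables) {
            CompletableFuture<Void> adapter = adapterStarts.get(table.adapter);
            CompletableFuture<Void> cache = cacheStarts.get(table.cache);
            // Combine only this table's two dependencies, not all of them.
            CompletableFuture<Boolean> started = adapter
                    .thenCombine(cache, (a, c) -> Boolean.TRUE)
                    .exceptionally(error -> Boolean.FALSE);
            result.put(table.name, started);
        }
        return result;
    }
}
```

In this sketch a table that uses the slow CSV adapter from #4748 would stay pending, while all other tables become available as soon as their own dependencies are ready.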