b-slim closed 6 years ago
That's why I proposed https://github.com/druid-io/druid/pull/2286 for the communication / HTTP endpoints, and https://github.com/druid-io/druid/pull/1576 for the management and coordinator side. This approach was originally for dimension extraction lookups, but it might be possible to adapt / modify it for this case.
@drcrallen unless I am missing something, I don't see how #2286 or #1576 cover the first item, which is the first basic brick. Then of course we will come to the question of how to do the distributed config process; I thought you had a couple of meetings with @guobingkun to hopefully agree on one way to do it.
In order to bring query-time lookups (QTL) to a production-ready state, Druid needs a central configuration and management layer. This management layer is supposed to provide:

1. Static (via property file) and dynamic (via the Coordinator at runtime) registration/unregistration of `LookupExtractor` implementations.
2. Periodic checkpointing of `LookupExtractor` instances, so that they can be restored after a failure or a manual restart of the Druid process.

This management layer can be split into the following pieces:

1. `LookupRefManager`, which exposes listing/adding/deleting of `LookupExtractor` references (PR 2291).
2. `LookupHttpEndPointResource`, which exposes listing/adding/deleting via an HTTP endpoint (this resource depends on piece 1, `LookupRefManager`).
3. `LookupConfigurationLoader`, which is responsible for loading configuration from the runtime property file, periodically checkpointing the current `LookupExtractor` objects, and reloading both files after a restart.
4. `LookupCoordinator`, which runs on the Coordinators and performs distributed configuration of `LookupExtractor` instances.

All of these pieces will be part of Druid core in order to guarantee a homogeneous and clear way to manage `LookupExtractor` instances; everyone is then free to implement the `LookupExtractor` interface.
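
To make the first piece more concrete, here is a minimal sketch of what a reference manager with list/add/delete operations could look like. It is not the code from PR 2291: the names `LookupExtractorSketch` and `LookupRefManagerSketch`, the single-method interface, and the choice of a `ConcurrentHashMap` are all assumptions made for illustration, and the real `LookupExtractor` interface may look different.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the LookupExtractor interface discussed above;
// the real Druid interface may have a different shape.
interface LookupExtractorSketch {
    String apply(String key);
}

// Hypothetical sketch of the reference manager: a thread-safe
// name -> extractor registry with add, remove, get and list operations.
class LookupRefManagerSketch {
    private final ConcurrentMap<String, LookupExtractorSketch> refs =
        new ConcurrentHashMap<>();

    // Registers the extractor; returns false if the name is already taken.
    boolean add(String name, LookupExtractorSketch extractor) {
        return refs.putIfAbsent(name, extractor) == null;
    }

    // Unregisters the extractor; returns false if the name is unknown.
    boolean remove(String name) {
        return refs.remove(name) != null;
    }

    // Returns the extractor registered under the name, or null if absent.
    LookupExtractorSketch get(String name) {
        return refs.get(name);
    }

    // Returns a sorted, read-only snapshot of the current references.
    Map<String, LookupExtractorSketch> list() {
        return Collections.unmodifiableMap(new TreeMap<>(refs));
    }
}
```

In this sketch, an HTTP resource (piece 2) would only need to delegate to `add`, `remove` and `list`, and a configuration loader (piece 3) could call `add` for each entry it reads at startup. Returning a snapshot from `list` keeps callers from mutating the registry behind the manager's back.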