pokt-network / pocket

Official implementation of the Pocket Network Protocol v1
https://pokt.network
MIT License
63 stars 33 forks source link

[Core] Improve dependency injection for submodules #810

Closed bryanchriswhite closed 1 year ago

bryanchriswhite commented 1 year ago

Objective

Improve the dependency injection experience for submodules (e.g. P2P's PeerstoreProvider, telemetry's TimeSeriesAgent, etc.). I think this may generally apply to submodules which need to embed the base_modules.InterruptableModule.

Origin Document

Observation made while working on #732. In order to complete the peerstore provider refactor (#804).

When retrieving modules from the bus (via the module registry), they are asserted to the correct interface type by their respective Bus#Get_X_Module() method. In contrast, retrieving submodules from the registry must be done directly (at the time of writing) which requires additional type assertions and boilerplate in each place any submodule is retrieved from the bus.

Goals

classDiagram 

class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class ModuleFactoryWithOptions {
    <<interface>>
    Create(bus Bus, options ...ModuleOption) (Module, error)
}

class Module {
    <<interface>>
}
class InjectableModule {
    <<interface>>
    GetModuleName() string
}
class InterruptableModule {
    <<interface>>
    Start() error
    Stop() error
}

Module --|> InjectableModule
Module --|> IntegratableModule
Module --|> InterruptableModule
Module --|> ModuleFactoryWithOptions

class Submodule {
    <<interface>>
}

Submodule --|> InjectableModule
Submodule --|> IntegratableModule

class exampleSubmodule
class exampleSubmoduleFactory {
    <<interface>>
    Create(...) (exampleSubmodule, error)
}

exampleSubmodule --|> Submodule
exampleSubmodule --|> exampleSubmoduleFactory

exampleModule --|> Module

Concrete Example

Below, rpcPeerstoreProvider MUST implement Module so that it can traverse the ModuleRegistry dependency injection system that we currently have, as it's used outside of the P2P module in the CLI. This results in it embedding the noop implementations of InterruptableModule and being additionally over-constrained by the InitializableModule#Create() interface method:

classDiagram 

class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class PeerstoreProvider {
    <<interface>>
    +GetStakedPeerstoreAtHeight(height int) (Peerstore, error)
    +GetUnstakedPeerstore() (Peerstore, error)
}

class rpcPeerstoreProvider

rpcPeerstoreProvider --|> PeerstoreProvider
rpcPeerstoreProvider --|> Module

class Module {
    <<interface>>
}
class InitializableModule {
    <<interface>>
    Create(bus Bus, options ...ModuleOption) (Module, error)
    GetModuleName() string
}
class InterruptableModule {
    <<interface>>
    Start() error
    Stop() error
}

Module --|> InitializableModule
Module --|> IntegratableModule
Module --|> InterruptableModule

Requiring rpcPeerstoreProvider (the injectee) to implement InitializableModule fosters boilerplate around the respective constructor function and everywhere it's injected. Here is an excerpt from p2p/providers/peerstore_provider/rpc/provider.go:

func NewRPCPeerstoreProvider(options ...modules.ModuleOption) *rpcPeerstoreProvider {
    rpcPSP := &rpcPeerstoreProvider{
        rpcURL: fmt.Sprintf("http://%s:%s", rpcHost, defaults.DefaultRPCPort), // TODO: Make port configurable
    }

    for _, o := range options {
        o(rpcPSP)
    }

    rpcPSP.initRPCClient()

    return rpcPSP
}

func Create(bus modules.Bus, options ...modules.ModuleOption) (modules.Module, error) {
    return new(rpcPeerstoreProvider).Create(bus, options...)
}

func (*rpcPeerstoreProvider) Create(bus modules.Bus, options ...modules.ModuleOption) (modules.Module, error) {
    return NewRPCPeerstoreProvider(options...), nil
}

The Create() function isn't currently used anywhere in the codebase, same goes for the rpcPeerstoreProvider#Create() method, which only serves to satisfy the InitializableModule interface requirement. This increases complexity and reduces readability and maintainability on both the injectee and injector side in my opinion.

Here is an excerpt from p2pModule which illustrates the complexity this design introduces on the injector side (which will present everywhere a submodule is retrieved from the ModuleRegistry):

// setupPeerstoreProvider attempts to retrieve the peerstore provider from the
// bus, if one is registered, otherwise returns a new `persistencePeerstoreProvider`.
func (m *p2pModule) setupPeerstoreProvider() error {
    m.logger.Debug().Msg("setupPeerstoreProvider")
    pstoreProviderModule, err := m.GetBus().GetModulesRegistry().GetModule(peerstore_provider.ModuleName)
    if err != nil {
        m.logger.Debug().Msg("creating new persistence peerstore...")
        pstoreProviderModule = persABP.NewPersistencePeerstoreProvider(m.GetBus())
    } else if pstoreProviderModule != nil {
        m.logger.Debug().Msg("loaded persistence peerstore...")
    }

    pstoreProvider, ok := pstoreProviderModule.(providers.PeerstoreProvider)
    if !ok {
        return fmt.Errorf("unknown peerstore provider type: %T", pstoreProviderModule)
    }
    m.pstoreProvider = pstoreProvider

    return nil
}

I would prefer to be able to do something like this:

classDiagram 

class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class PeerstoreProvider {
    <<interface>>
    +GetStakedPeerstoreAtHeight(height int) (Peerstore, error)
    +GetUnstakedPeerstore() (Peerstore, error)
}

class rpcPeerstoreProvider

rpcPeerstoreProvider --|> PeerstoreProvider
rpcPeerstoreProvider --|> Submodule
rpcPeerstoreProvider --|> rpcPeerstoreProviderFactory

class Submodule {
    <<interface>>
}
class InjectableModule {
    <<interface>>
    GetModuleName() string
}

class rpcPeerstoreProviderOption
class rpcPeerstoreProviderFactory {
    <<interface>>
    Create(bus Bus, options ...rpcPeerstoreProviderOption) (rpcPeerstoreProvider, error)
}

rpcPeerstoreProviderFactory --* rpcPeerstoreProviderOption

Submodule --|> InjectableModule
Submodule --|> IntegratableModule
var (
    _ modules.Submodule                    = &rpcPeerstoreProvider{}
    _ peerstore_provider.PeerstoreProvider = &rpcPeerstoreProvider{}
    _ rpcPeerstoreProviderFactory          = &rpcPeerstoreProvider{}
)

type rpcPeerstoreProviderOption func(*rpcPeerstoreProvider)
type rpcPeerstoreProviderFactory = modules.FactoryWithOptions[peerstore_provider.PeerstoreProvider, rpcPeerstoreProviderOption]

func Create(options ...modules.ModuleOption) *rpcPeerstoreProvider {
    return new(rpcPeerstoreProvider).Create(nil, options...)
}

func (*rpcPeerstoreProvider) Create(bus modules.Bus, options ...rpcPeerstoreProviderOption) (peerstore_provider.PeerstoreProvider, error) {
    rpcPSP := &rpcPeerstoreProvider{
        rpcURL: fmt.Sprintf("http://%s:%s", rpcHost, defaults.DefaultRPCPort), // TODO: Make port configurable
    }

    for _, o := range options {
        o(rpcPSP)
    }

    rpcPSP.initRPCClient()

    return rpcPSP
}
// setupPeerstoreProvider attempts to retrieve the peerstore provider from the
// bus, if one is registered, otherwise returns a new `persistencePeerstoreProvider`.
func (m *p2pModule) setupPeerstoreProvider() (err error) {
    m.logger.Debug().Msg("setupPeerstoreProvider")
    m.pstoreProvider, err = m.GetBus().GetPeerstoreProviderSubmodule()
    if err != nil {
        m.logger.Debug().Msg("creating new persistence peerstore...")
        m.pstoreProvider = persABP.NewPersistencePeerstoreProvider(m.GetBus())
    } else {
        m.logger.Debug().Msg("loaded persistence peerstore...")
    }
    return nil
}

Deliverable

Non-goals / Non-deliverables

General issue deliverables

Testing Methodology


Creator: @bryanchriswhite

Olshansk commented 1 year ago

The Create() function isn't currently used anywhere in the codebase

It seems like as the code evolved organically, we create factory functions starting with New as well as Create. IMO, we should grep for all func NewXXX and remove them altogether.

Q1: Wdyt about the point above?

Here is an excerpt from p2pModule which illustrates the complexity this design introduces on the injector side (which will present everywhere a submodule is retrieved from the ModuleRegistry):

Q2: What if we simply treat all submodules the same as modules and have no differentiation between the two? Would this simplify things

// setupPeerstoreProvider attempts to retrieve the peerstore provider from the // bus, if one is registered, otherwise returns a new persistencePeerstoreProvider.

Q3: What if we just panic if you try to access an uninitialized submodule instead? Alternatively, since they all have Create factory functions, we can centralized the logic for creating the singletons based on which Factory interface it implements. Let me know if I should provide a pseudo code snippet for how this would work


In general, I'm very open to refactoring anything as long as we have time, capacity and sanity to do it. Follow up questions:

Q4: What you be able to pick this up? If so, which iteration do you think it fits in and do you think it'll be a Medium or Large effort (once we iron out the details)?

bryanchriswhite commented 1 year ago

The Create() function isn't currently used anywhere in the codebase

It seems like as the code evolved organically, we create factory functions starting with New as well as Create. IMO, we should grep for all func NewXXX and remove them altogether.

Q1: Wdyt about the point above?

I have to think on it a bit more to figure out if I have an opinion on Create() vs NewXXX() but it sounds like we're in agreement that there is duplication currently that we can reduce.

Here is an excerpt from p2pModule which illustrates the complexity this design introduces on the injector side (which will present everywhere a submodule is retrieved from the ModuleRegistry):

Q2: What if we simply treat all submodules the same as modules and have no differentiation between the two? Would this simplify things

I think we can for the most part with the exception of having submodules conform to the ModuleFactoryWithOptions interface. This is a problematic combination in my opinion because of the impact that it has on the usability of the constructor (as illustrated in the example). I imagine that it could be sufficient to continue to use ModuleRegistry for submodules, I'm just proposing that we make the necessary modifications to support an improved developer experience when retrieving them.

// setupPeerstoreProvider attempts to retrieve the peerstore provider from the // bus, if one is registered, otherwise returns a new persistencePeerstoreProvider.

Q3: What if we just panic if you try to access an uninitialized submodule instead? Alternatively, since they all have Create factory functions, we can centralized the logic for creating the singletons based on which Factory interface it implements. Let me know if I should provide a pseudo code snippet for how this would work

Sounds interesting but I think I would need to see a snippet to understand what you mean. To add context to the code comment, the scenario where the peerstore is retrieved from the bus is when the CLI starts up its P2P module. Before it does so, it creates and adds an rpcPeerstoreProvider to the module registry. The the CLI P2P module subsequently comes up, it will use the rpcPeerstoreProvider instead of creating a persistencePeerstoreProvider.

In general, I'm very open to refactoring anything as long as we have time, capacity and sanity to do it. Follow up questions:

Q4: What you be able to pick this up? If so, which iteration do you think it fits in and do you think it'll be a Medium or Large effort (once we iron out the details)?

Happy to pick it up! I would estimate it as a Medium which I would distribute between making a change to the ModuleRegistry and then making the resulting refactoring changes. I get the impression that the first part is likely a minimal change but the second part holds most of the uncertainty in this issue.

All that being said, I don't see this as particularly urgent and I'm not sure that there's a place in the sprint(s) where it naturally fits but I would personally prefer to resolve this before adding any new code which injects and/or retrieves submodules from the module registry. Wdyt about making it a stretch goal for next sprint or something like that?

bryanchriswhite commented 1 year ago

Adding some diagrams that I don't think make sense to include in the PR description but may make sense to include as documentation at some point:

Example Module

classDiagram
class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class ModuleFactoryWithOptions {
    <<interface>>
    Create(bus Bus, options ...ModuleOption) (Module, error)
}

ModuleFactoryWithOptions --> Module
ModuleFactoryWithOptions --* "0..n" ModuleOption
ModuleOption --> Module

class Module {
    <<interface>>
}
class InjectableModule {
    <<interface>>
    GetModuleName() string
}
class InterruptableModule {
    <<interface>>
    Start() error
    Stop() error
}

Module --|> InjectableModule
Module --|> IntegratableModule
Module --|> InterruptableModule
Module --|> ModuleFactoryWithOptions

exampleModule --|> Module

Example Submodule without config or options

classDiagram
class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class InjectableModule {
    <<interface>>
    GetModuleName() string
}

class Submodule {
    <<interface>>
}

Submodule --|> InjectableModule
Submodule --|> IntegratableModule

class exampleSubmodule
class exampleSubmoduleFactory {
    <<interface>>
    Create(bus Bus) (exampleSubmodule, error)
}

exampleSubmodule --|> Submodule
exampleSubmodule --|> exampleSubmoduleFactory
exampleSubmoduleFactory --> exampleSubmodule

Example Submodule with options

classDiagram
class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class InjectableModule {
    <<interface>>
    GetModuleName() string
}

class Submodule {
    <<interface>>
}

Submodule --|> InjectableModule
Submodule --|> IntegratableModule

class exampleSubmodule
class exampleSubmoduleFactory {
    <<interface>>
    Create(bus Bus, options ...exampleSubmoduleOption) (exampleSubmodule, error)
}

exampleSubmodule --|> Submodule
exampleSubmodule --|> exampleSubmoduleFactory
exampleSubmoduleFactory --> exampleSubmodule
exampleSubmoduleFactory --* "0..n" exampleSubmoduleOption
exampleSubmoduleOption --> exampleSubmodule

Example Submodule with config

classDiagram
class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class InjectableModule {
    <<interface>>
    GetModuleName() string
}

class Submodule {
    <<interface>>
}

Submodule --|> InjectableModule
Submodule --|> IntegratableModule

class exampleSubmodule
class exampleSubmoduleFactory {
    <<interface>>
    Create(bus Bus, config exampleSubmoduleConfig) (exampleSubmodule, error)
}

exampleSubmodule --|> Submodule
exampleSubmodule --|> exampleSubmoduleFactory
exampleSubmoduleFactory --> exampleSubmodule
exampleSubmoduleFactory --* exampleSubmoduleConfig
exampleSubmodule --> exampleSubmoduleConfig

Example Submodule with config & options

classDiagram
class IntegratableModule {
    <<interface>>
    +GetBus() Bus
    +SetBus(bus Bus)
}

class InjectableModule {
    <<interface>>
    GetModuleName() string
}

class Submodule {
    <<interface>>
}

Submodule --|> InjectableModule
Submodule --|> IntegratableModule

class exampleSubmodule
class exampleSubmoduleFactory {
    <<interface>>
    Create(bus Bus, config exampleCfg, option ...exampleOpt) (exampleSubmodule, error)
}

exampleSubmodule --|> Submodule
exampleSubmodule --|> exampleSubmoduleFactory
exampleSubmoduleFactory --> exampleSubmodule
exampleSubmoduleFactory --* "0..n" exampleOpt
exampleOpt --> exampleSubmodule
exampleSubmoduleFactory --* exampleCfg
exampleSubmodule --> exampleCfg
Olshansk commented 1 year ago

@bryanchriswhite I reviewed the diagrams you provided above very carefully and am 100% aligned with the approach.

As we discussed offline, I think we should

  1. Remove all func NewXXX in the codebase
  2. Update all existing modules in the codebase in the next iteration
  3. Potentially uncover other issues (or not)
  4. Update https://github.com/pokt-network/pocket/wiki/Modules-Readme w/ what you have outlined above

Do you have any other questions or feedback or anything? From my perspective, the above is a pretty clear and well-formed solution.