Design modular architecture for plugins, non-core functionality

opentripplanner / OpenTripPlanner

An open source multi-modal trip planner

http://www.opentripplanner.org

Other

2.2k stars 1.03k forks source link

Design modular architecture for plugins, non-core functionality #2625

Closed drewda closed 3 years ago

drewda commented 6 years ago

e.g. business logic, customer-specific etc. that could live outside architecture

sdjacobs commented 5 years ago

We created a plugin infrastructure in a branch of OTP which I'm hoping will be applicable here.

The main design goal was to able to support plugins that are either extremely deployment-specific (ie business rules for a specific transit agency) or have dependencies to other systems (like logging to AWS Cloudwatch.) Plugins should be in separate repositories from OTP so that they can evolve on a separate path without involving the rest of the project. Hopefully, with plugins, OTP developers that have really specific needs that aren't appropriate for the mainline project can easily stay up-to-date with the code, and more easily merge in new functionality to OTP. In the abstract, there are a few different types of plugins I can imagine wanting:

new APIs: any kind of one-off API that wouldn't necessarily be appropriate for the OTP mainline repo
custom implementations of OTP services, for example a custom Fare Service
event-based "hooks" into OTP to enable logging of trip plan requests or GTFS-RT match percentages

The OTP branch with the plugin infrastructure is here: https://github.com/camsys/OpenTripPlanner/tree/plugins-off-dev-1.x And an example plugin repository is here: https://github.com/camsys/mta-otp-plugins/tree/vanilla-otp

The implementation is as follows:

There's a new class called PluginManager that is created in OTPMain, and is a singleton for the life of the server.
It reads in a configuration file (passed to OTP via --pluginConfig path/to/plugin-config.json) and creates the classes which are listed in that file
All plugins need to be implementations of Pluggable. They define a list of Class<?> types which they can listen for (called "subscriptions")
PluginManager keeps a map of subscription types to Pluggables, and provides a publish() method for any part of OTP to publish an object. When an object is published, all plugins which subscribe to that object get called with their receive() method.
The plugin repository (in my case, mta-otp-plugins), has OpenTripPlanner as a dependency and is compiled to a shaded jar with OTPMain as the Main class. (to get all the plugins in the classpath)

mta-otp-plugins includes a couple example plugins:

an example plugin which registers a new API call
an example plugin that publishes trip plan requests to a Amazon SQS queue for another system to consume
example plugin which publishes RT trip matching percentages to Amazon Cloudwatch
example custom fare service

Example configuration file:

{
  "plugins": [
    {
      "class": "com.camsys.otp.plugins.cloudwatch.CloudWatchService",
      "config": {
        "accessKey": "insert-access-key",
        "secretKey": "insert-secret-key",
        "autoScaleGroup": "insert-autoscale-group",
        "environment": "insert-environment"
      }
    },
    {
      "class": "com.camsys.otp.plugins.reporting.SQSPlanReportingService",
      "config" : {
        "nQueueCapacity": "1000",
        "queueUrl": "insert-url",
        "accessKey": "insert-access-key",
        "secretKey": "insert-secret-key"
      }
    },
    {
      "class": "com.camsys.otp.plugins.api.MtaApiPlugin"
    },
    {
      "class": "com.camsys.otp.plugins.fare.CustomFareService"
    }
  ]
}

There's a little bit of code needed in OTP itself to hook into those plugins, but the idea is, the "hook" code may be broadly applicable to other users of OTP (or at the least, has a limited code footprint) while the specific plugin implementations may not be.

One note: OTP has quite a lot of "services," which I believe are a legacy of when Spring was used in the project. The effect of this is that OTPServer has references to some services, the Graph object itself has references to services, and the Graph has a map of service classes to services. I think this is somewhat related to plugins, in that services provide an opportunity for custom implementations, but also somewhat orthogonal. My proposed solution does NOT attempt to rationalize the services in OTP, but that would be another worthy goal.

Thoughts on this? The implementation I linked to is pretty specific, but I think that the overall design goals may be broadly shared. If we get some buy-in on the design goals, I'll submit a PR and we can move discussion there.

drewda commented 5 years ago

@sdjacobs thanks for sharing -- this sounds like a great option to consider.

A proposal:

add your notes as a new issue to the roadmap for v1.5 (replacing the placeholder card at https://github.com/orgs/opentripplanner/projects/1#card-11441582). Goal will be to add "hooks" flexible enough for CamSys's range of MTA plugins. Others can comment and help shape the PR, but we keep the scope tight enough for you to have something to merge in v1.5
this issue stays on the roadmap for v2.0 -- to everyone an opportunity to address larger questions like yours about the "services" and to also revisit and extend the plugin architecture, if limitations are found by users of v1.5

Sound good to you all @sdjacobs @abyrd @t2gran (and any other @opentripplanner/plc folks who have thought about plugin architectures)?

abyrd commented 5 years ago

Thanks for sharing this idea and your implementation of the system @sdjacobs.

I know there was some interest in the idea of plugins and extensions from @t2gran and @gmellemstrand as well. I can certainly see the use cases. This will be a pretty big decision and will require formal discussion by the PLC. Just to give a little background: indeed OTP used to be a Spring app, and I went to a lot of effort to eliminate its use of Spring.

Of course I have no problem at all with modular architecture, interfaces with multiple implementations, and separation of concerns, and I'm fine with the idea of assembling an application out of singleton components, one of which is an "application context" that gives each component references to all the other components and configuration.

What was highly problematic with Spring was the assembly of these components via injection magic driven by a very open-ended (probably Turing-complete) XML configuration file. To my mind this was essentially Java programming in XML, and abuse of a data exchange format as a non-typesafe non-imperative programming language with effectively unpredictable execution order and no debugger.

Most of the time users don't need that flexibility. Almost all users of OTP were trying to take the same few basic actions (build a graph with various inputs, start up a web server with a few default parameters) but the mailing lists were filled with discussions of how to glue together obscure blobs of XML, often confusing people who had no idea what Spring was or why they would need to to describe the instantiation and initialization of Java objects in XML just to start testing out this trip planner. Advising people on how to debug or configure what were essentially an unlimited number of different applications cobbled together out of the same components was a major time sink.

There was also a stylistic tendency to abstract everything even when there was only one implementation. I agree that this good from a conceptual engineering point of view, that you want a new "service" to have a well defined interface and behavior, but after dealing with the rigidity and readability side effects over the years I now lean toward factoring out these interfaces only once there is a clear need for different implementations.

So back to the plugin proposal: my main concerns are the ways in which this overlaps with the ousted Spring framework. Listing the names of Java classes to instantiate in JSON files raises red flags, as does the Map<Class<?>, X> pattern to subscribe to messages. It is possible to just instantiate custom Java objects, initialize and integrate them with OTP imperatively in Java code, in which case type and syntax checking are applied, and debuggers can trace execution.

It would be straightforward to define public listener interfaces within OTP, and any special deployment of OTP could be implemented as Java code that declares OTP as a dependency, and registers some listeners when starting up OTP.

sdjacobs commented 5 years ago

Yes, that's fair enough: listing class names to instantiate could be somewhat opaque, and maybe it's over-engineered. To have a plugin system, all we really need is: (a) a consistent place in the code to put "plugin" public interfaces by convention, and (b) a consistent place to hang implementations of those listeners. (a) could just be a particular package.

My preference for (b) is "PluginManager" as defined above, but with a member for each interface rather than a Map. I think the existing code where some services are hung off Graph and some off OTPServer is messy. If we want, we could move those services to PluginManager too (and perhaps rename it to ServiceManager or similar.)

abyrd commented 5 years ago

Can you confirm that your idea of plugins is currently limited to publish/subscribe model with custom listeners for predefined events / types? If so this model makes sense to me. The plugin manager with a List<Plugin<?>> for each kind of plugin (e.g. List<Plugin<RoutingRequest>> routingRequestSubscribers) and methods for specific types like publish(RoutingRequest req) { for (Plugin<RoutingRequest> subscriber : routingRequestSubscribers) subscriber.send(req); makes sense to me. I suppose this could get unwieldy if many different kinds of publishable objects are allowed, in which case we could go more generic, but in the short term I'd rather spell out all the supported types. We might want to define some rules about published objects being treated as immutable by plugins.

It also occurred to me that plugins will need configuration information, so they should probably implement the JsonConfigurable interface. The main OTP startup method could just accept a list of plugin objects, on which it would call plugin.configure(jsonConfigRootNode).

sdjacobs commented 5 years ago

Yes, this model makes a lot of sense for the "listener" use case.

There are two use cases this doesn't work for: (a) adding API calls to OTP, (b) custom implementations of OTP services, e.g. new fare services. I think this could just be handled by adding other interfaces which aren't listeners to the PluginManager.

When I get the chance, I'll update the branches I linked to above to reflect this discussion.

t2gran commented 5 years ago

I am sorry for not responding to this earlier.

I am in favor of modular design and encapsulation. I agree that Spring XML configuration is not good, but I am ok with Spring Java Config, which I think is the way to go these days. But, I don´t think we need Spring just for dependency injection - it is not that hard to do.

So my take on this is - when someone need to substitute the existing behavior, refactor, so there is a clear role/interface that the existing code implement. I agree with @abyrd - don´t do it for everything. Place the new code in org.opentripplanner.extensions.[<org>.]<extension-name> or in your own Git Fork. Then use a configuration parameter to make the extension substitute the default implementation.

Spring separate out construction from the and "doing the job" in a good way. Lets look at a example on who we could do this in a similar way:

org.opentripplanner.featureX
    config
        ConfigX.java
    x-package
    y-package
    RoleInterface.java
    XService.java

So when needed (not for everything) create local packages named config and put ConfigX in it. The ConfigX is responsible for wiring together featureX - then it becomes simple to pass in a flag to the ConfigX to swap part of the code. Note! Putting the config in a separate package avoid circular dependencies, and ConfigX may depend on the extensions package. The RoleInterface define a part that can be exchanged.

One thing that is a bit tricky is handling scope. Wiring the application on startup is straight forward since the main method can create and use the top level Configuration object. But the configuration need to be available at request time, so there need to be "global" instance to be able to retrieve "singeltons" and create request scoped instances. Having a global context or using a static instance may solve this. The Graph class does this job today, with respect to services.

How to configuration is a topic on its own, but Andrews suggestion works fine. I dont think we need something special for plugins. Just switches or list of tags for what tot load. Do not put class names in config files - they are implementation details, use aliases or tags instead. That allows for renaming classes/packages without breaking someones config.

So, let look at the case where several features need read access to request/response, it could be:

logging service
security allowing operators ONLY accessing their own and partners data
and so.

So make the RequestResponsePublisher in the appropriate package, also make two interface here defining the role: RequestListner and ResponseListener. Since there can be 0 to many listeners, there is no need to make a default implementation. Put you listener implementations in the extensions package, and wire things with a config.

Today "everything" live in the Graph - e.g. Services. I suggest making an application context, which hold the graph. Then move everything that should not be part of the Graph to the app. context and let the context be loaded based on configuration, not the configuration of the serialized graph.

So I guess I am NOT in favor of a global PluginManager, I want a just a global Context, which can give me the wanted config; which can be used to create/retrieve pluggable services.

OTP 1.x and OTP 2.x - I will probably not work to much on OTP 1.x, so I am happy to support the above suggested solution for 1.x, but I am not in favor of doing it in OTP 2.x.

Thomas

abyrd commented 5 years ago

@t2gran thanks for your input, this is very helpful. I do like the idea of borrowing a few useful ideas from Spring without actually using Spring. I actually like the idea of clearly defined components holding references to one another, often via an application context object. This general idea has been retained in OTP though it really needs to be cleaned up. But as you say, this can be done by convention, in simple readable code, without using an external "framework" or "technology".

Your packaging and structure suggestions seem good to me. About the global context: OTP had this idea, but it has become quite fragmented and seriously needs to be cleaned up. As Spring was eliminated, things were refactored such that org.opentripplanner.standalone.OTPServer replaced the Spring application context. Most of its contents were pushed down into org.opentripplanner.standalone.Router as it was refactored to allow multiple quasi-isolated OTP instances in a single OTPServer. I believe even more things were then pushed down even further, into each Router's Graph to ensure they were in scope during routing (which you have observed). For some reason (perhaps no rational reason), each Graph does not know which Router or OTPServer it is inside, so doesn't really see the application context.

In 2.x OTP it's likely we'll remove the multiple router support (as discussed in #2696). So we'd end up with some kind of OTPContext that is a bit like OTPServer, Router, and all the pluggable services in Graph combined into one. All top level OTP components could hold a reference to this OTPContext, allowing them to keep one another in scope. Things like plugin modules and graph updating modules should have strict rules about how they behave, so would not be given references to the top level application context, only allowing actions to be taken through public interfaces.

When you say you don't want a plugin manager, are you thinking each fork of OTP (or project wrapping OTP to extend it) would actually include changes to the Java code to add more config flags, allowing that particular fork to instantiate and attach its custom-defined extensions to the application context? This seems fine to me - unless there's a compelling reason to do otherwise, I much prefer for organizations to have their own custom forks of OTP, and we have nicely designed extension mechanisms and package structure that mean these people can merge the latest OTP release into their fork with almost no risk of conflicts. Or even better, if their new code is isolated in the extensions package and not instantiated by default, we can be very lenient about merging it into mainline OTP.

In your final line you say:

OTP 1.x and OTP 2.x - I will probably not work to much on OTP 1.x, so I am happy to support the above suggested solution for 1.x, but I am not in favor of doing it in OTP 2.x.

Is the "above suggested solution" the plugin manager, or the whole plugin concept described in these comments? Are you saying you don't want any plugin mechanism in 2.x, or you don't want this specific PluginManager mechanism, but would want the modular application context?

t2gran commented 5 years ago

I want a plugin mechanism in 2.x. I think we can do it with a "Spring like" dependency injection and and a mechanism to read in configuration from config files. I look forward to clean up the OTP code.

sdjacobs commented 5 years ago

OK, I had originally hoped to be able to solve the plugin problem without getting into the larger app context/Spring replacement/services question, but it seems like we can't do the former without the latter (which is fine, as long as we're making progress and driving towards a good technical solution).

To summarize/rephrase, it sounds like we're converging on something like:

have an app context class which holds all global and graph-specific services
"plugins" are simply interfaces (or collections of interfaces) which also live in the app context
plugins and/or other implementations of services can be instantiated by OTPMain (or a similar class) via configuration
define a package for custom extensions and configuration code to live in, and be relatively lenient about merging in code here.

In this model, there's nothing "special" about plugins except that occupy slots in the context class which can be empty (e.g., the default would be to have NO ResponseListener).

Is that an accurate summary?

For 1.x, implementing the minimal viable version of this proposal could be as simple as:

Give Graph a reference to OTPServer
Any "plugins" would be interfaces added to OTPServer, with their implementations in a defined package
Add configuration/setup code to OTPMain
Optionally, push services up from the Graph to OTPServer (or perhaps this is a bad idea if 1.x users may have multiple routers.)

sdjacobs commented 5 years ago

One more thing - Thomas's comments imply a forked model, where people who have custom implementations fork OTP and add code, possibly merging it back to an extensions package. I had suggested a model where OTP is included as a dependency in a different Java package. Specifically, I thought that model would help isolate implementation-specific dependencies -- for example, if I wrote a plugin to push metrics to AWS Cloudwatch, it seems like AWS-specific dependencies would (and should) never get merged back into the dependencies for OTP.

Thoughts on this? Of course it's always possible for people to use OTP as a dependency however they want, but should we encourage that for custom modules, or encourage forking (at the risk of forks which can never get merged back to master due to dependencies)?

t2gran commented 5 years ago

One more thing - Thomas's comments imply a forked model, where people who have custom implementations fork OTP and add code, possibly merging it back to an extensions package. ... (@sdjacobs )

I think people can decide if they want to fork or have OTP as a dependency - I do not see a big difference. By overriding a few assembly/configuration classes you should be able to inject your own additional/alternative code. We do want to encourage people to submit code useful to others, hence the #2745 Sandbox for experimental features.

To be clear, we do not need 3 different concepts for generic "plugins"(Modules, Services, Plugins), we need one. Also, most of these do not need to be part of a global context, if we separate application construction - injecting all pluggable parts - the need for a global context is not there. We inject at construction time. If we need it, then we add it - but lets start without.

I am also in favor of making as concrete extensions as possible. In steed of having a generic Service to modify Itineraries, we can make an itinerary filter chain. We allow for registration of filters/decorators (implementing an interface). We can define (if needed) a contract to add new endpoints, and so on.

There is several things here witch I think we should create separate issues for:

Cleanup the existing code and possibly remove Services and modules that is used in only a few places. For example the FareService is used in one place, but the graph still have a reference to it. Instead the FareService should be injected into its proper place. The FareFactory can still be used to create the service(s). To do this we also need to refactor out the assembly of the application (Injecting config, creating components and putting them together).
Sandbox for experimental features #2745.

t2gran commented 4 years ago

I mean the Sandbox feature solve this issue in OTP2. Is there anyone who want to do this or port the Sandbox to OTP1, if not we can close this issue.

When reading through all of the comments above, there is one thing @sdjacobs requested that I have not replied too:

I had suggested a model where OTP is included as a dependency in a different Java package. Specifically, I thought that model would help isolate implementation-specific dependencies -- for example, if I wrote a plugin to push metrics to AWS Cloudwatch, it seems like AWS-specific dependencies would (and should) never get merged back into the dependencies for OTP.

Linking (packaging 3rd party Jars into otp.jar) can be achieved within the Sandbox model by including "optional" dependencies in Maven and using a Maven profile to include them - the code is then visible to everyone, but unless you use it it is not part of the jar. A second way to do it - if you prefer using OTP as a lib or fork, is to make the necessary extension-points in the code, and make a dummy sandbox module with doc, witch serve as a proxy for the real implementation you have in your own code. This then become documentation, and you can include unit-tests in OTP to make sure someone else do not break the extension-point.

Good examples that this work in practice is the data-store implementation in OTP2 where we added support for Google Cloud Storage, and on its way support for AWS as well.

Can we close this issue now?

t2gran commented 3 years ago

The https://github.com/opentripplanner/OpenTripPlanner/blob/dev-2.x/ARCHITECTURE.md is a good stating point for reading about the OTP design/architecture. There is still room for improvement here, but we have a place for the documentation "close" to the code, so hopefully it will be easier to update and improve over time.

OTP2 have gone a long way to support a more modular architecture and to be able to support custom code:

The internal transit core model no longer depend on 3rd party libraries and can be changed without any risk of changing any of the APIs.
- Each API have their own model.
- The Sandbox allow support for experimental features and custom plugins
- The configuration loading have its own model and reuse components for parsing and validation. Config is now injected into other modules reducing coupling and allowing us to organize the config independent of the implementation.

Closing this issue does not mean the job is done, but rather that we have taken the first step and that updating and improving the architecture should be part of the day-to-day development.