Design - GitHub Issues

knownasilya commented 9 years ago

Picking up from https://github.com/Strider-CD/strider/issues/667

Idea

Thin API that communicates with drones and supports plugins. No UI in core, since it can be implemented as a plugin.

Goal

Lay down a basic design and the non-negotiables that should be in core.

TODOs

[ ] Look over Strider 2.0 Brainstorming issue (https://github.com/Strider-CD/strider/issues/667) and pull out what needs to make it into this implementation.
[ ] Cover the following areas
- [ ] Drone abilities/communication
- [ ] Authentication/User Model
- [ ] Plugin API
- [ ] core
- [ ] drone
- [ ] What cannot be in core (important so we don't lose sight).
- [ ] Long term goals
- [ ] Testing
- [ ] Documentation
[x] Create issue to discuss technology stack (#2)

knownasilya commented 9 years ago

Here's what I envision so far:

Core has the idea of groups, users, projects, environments, providers, drones, plugins, and workflows. Drones have the idea of plugins.

A user can be part of a group.
A group/user can have multiple projects.
A project has to have at least one environment.
An environment has at least one drone.
A drone has to have a workflow.
A workflow consists of plugins and the steps for a build.
Workflows can be reused for drones.
Steps in a workflow are sequential actions coming from plugins.
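
The rules above can be sketched as a small data model with a validation pass. This is only an illustration under assumptions: the type names, field names, and error messages are hypothetical and do not come from any Strider code.

```typescript
// Hypothetical types; names and fields are illustrative only.
interface Step { plugin: string; action: string; }
interface Workflow { name: string; steps: Step[]; }
interface Drone { name: string; workflow?: Workflow; }
interface Environment { name: string; drones: Drone[]; }
interface Project { name: string; owner: string; environments: Environment[]; }

// Returns the list of rule violations; an empty list means the project is valid.
function validateProject(project: Project): string[] {
  const errors: string[] = [];
  if (project.environments.length === 0) {
    errors.push(`project '${project.name}' needs at least one environment`);
  }
  project.environments.forEach((env) => {
    if (env.drones.length === 0) {
      errors.push(`environment '${env.name}' needs at least one drone`);
    }
    env.drones.forEach((drone) => {
      if (!drone.workflow) {
        errors.push(`drone '${drone.name}' needs a workflow`);
      }
    });
  });
  return errors;
}

// One workflow reused by two drones, which the rules allow.
const nodeBuild: Workflow = {
  name: "node-build",
  steps: [
    { plugin: "git", action: "clone repo" },
    { plugin: "node", action: "install packages" },
  ],
};

const exampleProject: Project = {
  name: "website",
  owner: "some-group",
  environments: [
    {
      name: "production",
      drones: [
        { name: "drone-1", workflow: nodeBuild },
        { name: "drone-2", workflow: nodeBuild },
      ],
    },
  ],
};

console.log(validateProject(exampleProject));
```

Keeping the workflow as a separate object referenced by each drone is what makes reuse across drones cheap in this sketch.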

Here's the workflow I imagine. You create a project (under a group or user) and assign it a name, VCS URL, and the basic environment name (as well as permissions, etc.). You configure the environment based on how it's triggered, e.g. git commit, PR, tag, webhook, etc. You create a drone (drones are global and can be shared across projects), which gives you a token with the necessary information to connect the drone to core. You install the drone on the machine with the given token. Now you create a workflow on core (also global), which uses plugins. A workflow is set up by selecting a provider and a sequential order of steps, e.g. git: clone repo, node: install packages, sauce: test, docker: push. Now you assign the workflow to the drone. The drone does any initial setup (if needed) for that workflow and is marked as ready for data. You now add this drone to your environment. When the environment is triggered, it sends the necessary information to each drone that it has.

Drones should be clustered so that they don't die and force you to connect to those machines every time. Workflow setup will also install the necessary plugins on the drones. I can see drones as having one endpoint that takes an action and data; the actions could be things like setup and trigger.
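
A minimal sketch of the single-endpoint idea, assuming the request is just an action name plus a bag of string data. The `plugins` and `ref` keys, the state shape, and the log messages are all hypothetical; no HTTP layer is shown.

```typescript
// Hypothetical request and state shapes for a drone with one endpoint.
type DroneAction = "setup" | "trigger";

interface DroneRequest {
  action: DroneAction;
  data: { [key: string]: string };
}

interface DroneState {
  ready: boolean;
  plugins: string[];
  log: string[];
}

// Pure handler: takes the current state and a request, returns the next state.
function handleDroneRequest(state: DroneState, request: DroneRequest): DroneState {
  if (request.action === "setup") {
    // Assumption: core sends the plugins to install as a comma-separated list.
    const plugins = (request.data["plugins"] || "")
      .split(",")
      .filter((p) => p.length > 0);
    return {
      ready: true,
      plugins: plugins,
      log: state.log.concat("setup: installed " + plugins.length + " plugin(s)"),
    };
  }
  // Remaining action is "trigger": only accepted once the drone has been set up.
  if (!state.ready) {
    return {
      ready: false,
      plugins: state.plugins,
      log: state.log.concat("trigger rejected: drone is not set up"),
    };
  }
  const ref = request.data["ref"] || "unknown ref";
  return {
    ready: true,
    plugins: state.plugins,
    log: state.log.concat("trigger: build started for " + ref),
  };
}

const freshDrone: DroneState = { ready: false, plugins: [], log: [] };
const afterSetup = handleDroneRequest(freshDrone, {
  action: "setup",
  data: { plugins: "git,node" },
});
console.log(handleDroneRequest(afterSetup, { action: "trigger", data: { ref: "abc123" } }).log);
```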

I think plugins will need to be on both, since workflows need to be created and we need to know what steps are part of a plugin. Workflows are necessary because we don't want the user touching the drones; ideally you should only ever touch them once, when you create them (or update).

Having multiple drones in an environment will allow parallel builds. Also versioning plugins and drones to a specific core API version will save a lot of frustration I think. Also workflows should be exportable/importable and shareable as JSON via a gist or something.
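
To make the export/import and versioning ideas concrete, here is a rough sketch. The JSON layout, the `apiVersion` field, and the `github` provider value are assumptions made for illustration; the steps are the ones listed earlier in this comment.

```typescript
// Hypothetical export format for a workflow; not an actual Strider schema.
interface ExportedStep { plugin: string; action: string; }

interface ExportedWorkflow {
  apiVersion: number; // core API version the workflow was written against
  name: string;
  provider: string;
  steps: ExportedStep[];
}

// Serialize a workflow so it can be shared, e.g. pasted into a gist.
function exportWorkflow(workflow: ExportedWorkflow): string {
  return JSON.stringify(workflow, null, 2);
}

// Parse a shared workflow and refuse it if it targets another core API version.
function importWorkflow(json: string, coreApiVersion: number): ExportedWorkflow {
  const parsed = JSON.parse(json) as ExportedWorkflow;
  if (parsed.apiVersion !== coreApiVersion) {
    throw new Error(
      `workflow targets API v${parsed.apiVersion}, core speaks v${coreApiVersion}`
    );
  }
  if (!Array.isArray(parsed.steps) || parsed.steps.length === 0) {
    throw new Error("workflow has no steps");
  }
  return parsed;
}

const sampleWorkflow: ExportedWorkflow = {
  apiVersion: 1,
  name: "node-deploy",
  provider: "github",
  steps: [
    { plugin: "git", action: "clone repo" },
    { plugin: "node", action: "install packages" },
    { plugin: "sauce", action: "test" },
    { plugin: "docker", action: "push" },
  ],
};

console.log(exportWorkflow(sampleWorkflow));
```

Rejecting a version mismatch at import time is one simple way to surface the incompatibility early; a real implementation might prefer a compatibility range instead of strict equality.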

kfatehi commented 9 years ago

Making it the user's responsibility to manage the drones (SSH in to install things like node dependencies, heroku toolbelt, ssh keys for being able to pull, strider plugins, etc.) drastically reduces core's responsibilities.

If a [core] plugin writer wishes to automate drone management, he or she may do so and simply ask, in a UI, for an IP address, SSH port, username, password, or private key. But I think it's asking too much to do this in core.

knownasilya commented 9 years ago

I think you are right about half of it, but plugin management on drones should totally be in core. By that I mean plugins get installed automatically via some hook triggered by core. I think this is fundamental, because no one wants to manage drones individually; it's error-prone and a waste of time. Sure, let the user install the drone, install node/npm, or python, etc. themselves, but the plugins (npm based) should be managed from core.

kfatehi commented 9 years ago

If we're strictly speaking about a plugin from npm or git getting remotely installed, configured, and maintained by core, then I think that is fine. I was talking about machine provisioning, not so much drone runtime. I agree with you in that regard, although I am curious what such plugins would be like.

knownasilya commented 9 years ago

Yeah, I would love to know what they look like as well ;)

davemackintosh commented 9 years ago

This might be wildly out of spec/whack, but what about using something like rabbitmq, eventd, kafka, kestrel, etc. as the communication technology between servers, with 'core' as the parent node that manages drones? It would be nice because one server can manage many, which makes adding/removing servers easy, and it is distributed without some arbitrary Node code or any downtime.

It's a good model that we can see working all over the internet as a solution to exactly the problem you're talking about, @keyvanfatehi. It's also language independent, and a thin REST layer on top would make it very manageable by sysadmins and developers.

niallo commented 9 years ago

I would strongly recommend not building a heavyweight dependency like Kafka or RabbitMQ into the app by default.

These systems are nice but not trivial to deploy and manage.

My suggestion would be to have something simple built into core (I am a big fan of JSON over HTTP, as it's already available in all browsers).

Admittedly this doesn't support node auto discovery but that can be layered by individuals who need it.

Other transports could be supported for discovery and/or persistent queueing through extensions.

I really don't think bundling a JVM or RabbitMQ binary with Strider would be appreciated by users :-)

Also there are many ways to do discovery - for example new nodes can ping the central server over HTTP (or whatever) when they are ready for work.

(Sent as an email reply to Dave Mackintosh's comment above, dated Tuesday, July 14, 2015.)

knownasilya commented 9 years ago

When you set up a node with the token (which has the core URL), it will ping the core and authenticate, making it ready for additional configuration. This could be done with HTTP. If the node doesn't have access to core, then you could ping the node from core or use a proxy.

If we want status checking, we could use long-polling; it's fast enough for pinging. For stdout data, we can just stream the HTTP request.
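
The token-based registration described above might look roughly like this. Everything here is an assumption for illustration: the token layout, the `/drones/register` path, and the field names are hypothetical, the token is not signed or encrypted, and the actual HTTP call is left out.

```typescript
// Hypothetical token contents: where core lives plus the drone's credentials.
interface DroneToken {
  coreUrl: string;
  droneId: string;
  secret: string;
}

// Assumption: the token is URL-encoded JSON. A real token should be signed.
function encodeToken(token: DroneToken): string {
  return encodeURIComponent(JSON.stringify(token));
}

function decodeToken(raw: string): DroneToken {
  return JSON.parse(decodeURIComponent(raw)) as DroneToken;
}

interface RegistrationRequest {
  url: string;
  method: "POST";
  body: { droneId: string; secret: string };
}

// Drone side: work out where to ping and what to send, using only the token.
function buildRegistration(rawToken: string): RegistrationRequest {
  const token = decodeToken(rawToken);
  return {
    url: token.coreUrl.replace(/\/+$/, "") + "/drones/register",
    method: "POST",
    body: { droneId: token.droneId, secret: token.secret },
  };
}

// Core side: accept the ping only if it matches a token that core issued.
function authenticate(
  issued: DroneToken[],
  body: { droneId: string; secret: string }
): boolean {
  return issued.some(
    (t) => t.droneId === body.droneId && t.secret === body.secret
  );
}

const issuedToken: DroneToken = {
  coreUrl: "https://core.example.com/",
  droneId: "drone-1",
  secret: "s3cret",
};
const registration = buildRegistration(encodeToken(issuedToken));
console.log(registration.url, authenticate([issuedToken], registration.body));
```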

microadam commented 9 years ago

Websockets would work well for that sort of thing. PrimusJS and the WS module work well together for a persistent client / server architecture (have implemented something similar)


kfatehi commented 9 years ago

IronMQ provides an HTTP API for messaging:

POST messages
GET messages (with long poll support)
DELETE messages/id (ack)

Of course this was not great for my application, which needed to be realtime, but for CI something like this would suffice. We just have to write all that code (or, you know, use ironmq, or cloudamqp and the like).

I agree with not bundling this stuff, but I also acknowledge the existence of stuff like cloudamqp and docker that make it easy to use rabbit.

That said, we can easily do a simple MQ concept with HTTP ourselves, simply using the database.
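
A rough sketch of that "simple MQ" idea, with the three operations mapped to methods. An in-memory array stands in for the database table, the reservation timeout is an assumption about how un-acked messages would be redelivered, and none of this mirrors IronMQ's actual API.

```typescript
// Hypothetical message row; an array stands in for a database table.
interface QueuedMessage {
  id: number;
  body: string;
  reservedUntil: number; // timestamp (ms) until which a consumer holds it
}

class SimpleQueue {
  private messages: QueuedMessage[] = [];
  private nextId = 1;

  // POST messages: append a message and return its id.
  post(body: string): number {
    const id = this.nextId++;
    this.messages.push({ id: id, body: body, reservedUntil: 0 });
    return id;
  }

  // GET messages: reserve the oldest message nobody currently holds.
  // Returns null when nothing is available (a long poll would wait instead).
  get(now: number, timeoutMs: number): QueuedMessage | null {
    const available = this.messages.filter((m) => m.reservedUntil <= now);
    if (available.length === 0) {
      return null;
    }
    const message = available[0];
    message.reservedUntil = now + timeoutMs;
    return message;
  }

  // DELETE messages/id: acknowledge a message so it is never delivered again.
  ack(id: number): boolean {
    const before = this.messages.length;
    this.messages = this.messages.filter((m) => m.id !== id);
    return this.messages.length < before;
  }
}

const queue = new SimpleQueue();
queue.post("build project-a");
const job = queue.get(0, 1000);
if (job !== null) {
  console.log("working on", job.body);
  queue.ack(job.id);
}
```

Passing the current time in explicitly keeps the sketch deterministic; a message that is fetched but never acked becomes available again once its reservation expires.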


knownasilya commented 9 years ago

@microadam this is server-to-server.

kfatehi commented 9 years ago

I'd rather avoid websockets, to be honest. My early vote is simple messaging over a REST API with a pluggable backend, so if folks wanna use rabbit they can. Tough to achieve though, because you cannot ack a packet if you close the channel. Rabbit works way better when it's not wrapped and hidden. Maybe this kind of plugin is exemplary of one that would exist on both the core and drone side.

kfatehi commented 9 years ago

s/exemplary/an example/ can't edit comments on iPhone for some reason

phiros commented 9 years ago

@knownasilya I like your idea so far. However, I think we shouldn't focus on the clustered architecture just yet. For now I think we should mainly focus on the interaction between plugins (I very much like your workflow idea) and the interaction between the plugins and core. If we get the plugin API right, we can later implement whatever crazy (or sane) job distribution or output transfer scheme we can come up with. The reason I insist on this so much is that I already tried to make strider (the current version) truly distributed. Whilst trying to do so I hit several hard-to-overcome roadblocks. Nearly always the problem was overly strong coupling between plugins (I found the runner plugins and the strider UI especially cumbersome to deal with).

Therefore, I propose that we should split each component (by component I don't only mean plugins, I also mean functional units in plugins and in core) of strider into microservices. In case some of you are not familiar with microservices: think of them as very lightweight REST API services which only implement some basic functionality (e.g. one possible microservice could, given a GitHub pull request SHA string, check whether this pull request is already being processed by strider). I'd recommend we use seneca for this. This would have several advantages:

Clear separation of concerns.
Easy to test (input: foo; response should be: bar)
Replacing functional units which are hidden behind a microservice is usually trivial
Interaction of plugins with each other is already networked (if we do it right we will never see crashes due to misbehaving plugins again)
Allows for fast prototyping (allows us to think about how the microservices/plugins should interact with each other instead of what they do in detail). To put it in Java / C terms: it allows us to think about the interface first and the implementation later
Seneca provides a database abstraction layer (would make it possible to use an in-memory database for smaller setups and something else for larger setups)
If we build everything as microservices we get a REST API more or less for free
Language and technology independence: using microservices would allow people to write plugins in other languages (maybe it will turn out that language x has better support for y etc.).
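
To illustrate the pull request example, here is a toy in-process registry that routes messages to handlers by pattern. It is a hand-rolled stand-in written for this sketch, not Seneca's API; the `role`, `cmd`, and `sha` keys and the class name are hypothetical.

```typescript
// Hypothetical message and handler shapes for a tiny pattern-matched registry.
type Message = { [key: string]: string | boolean };
type Handler = (msg: Message) => Message;

interface Route {
  pattern: Message;
  handler: Handler;
}

class ServiceRegistry {
  private routes: Route[] = [];

  // Register a handler for every message containing all pattern key/values.
  add(pattern: Message, handler: Handler): void {
    this.routes.push({ pattern: pattern, handler: handler });
  }

  // Dispatch a message to the first registered handler whose pattern matches.
  act(msg: Message): Message {
    const matches = this.routes.filter((route) =>
      Object.keys(route.pattern).every((key) => msg[key] === route.pattern[key])
    );
    if (matches.length === 0) {
      throw new Error("no service matches " + JSON.stringify(msg));
    }
    return matches[0].handler(msg);
  }
}

// Example service: is this pull request SHA already being processed?
const inProgress: string[] = ["abc123"];
const registry = new ServiceRegistry();
registry.add({ role: "pull-request", cmd: "is-processing" }, (msg) => ({
  processing: inProgress.indexOf(String(msg["sha"])) !== -1,
}));

console.log(registry.act({ role: "pull-request", cmd: "is-processing", sha: "abc123" }));
```

Because callers only know the message pattern, the handler behind it can be swapped for a dummy implementation in tests or moved behind a network boundary later without touching the callers.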

phiros commented 9 years ago

Also: maybe it wasn't completely clear from my ramblings in my previous post but I think we should proceed as follows:

Put as much as possible of the functionality in core and the plugins behind microservices
Use a persistence abstraction layer preferably hidden behind a microservice
Mock out the parts which are harder to implement (job distribution etc.) and care about the gritty details later. You probably already guessed how I think the mocking should be done (if not: dummy microservices)

BTW: are there any plans when we will start with the implementation for this thing? I'd be more than interested to help.

knownasilya commented 9 years ago

Can start as soon as you want. I just pushed up some basic structural code/tests with hapi (I'm really liking it).

knownasilya commented 9 years ago

Technology stack discussion should happen in #2

microadam commented 9 years ago

Primus and WS work server to server


jpic commented 9 years ago

I have a design question: shouldn't we use a version prefix in the HTTP API URLs, and shouldn't we implement HATEOAS?

knownasilya commented 9 years ago

Yes on the prefix, I've already added that. No on HATEOAS, since that's not a standard practice for Node REST APIs (more of a Java/SOAP practice); I'd rather implement JSON API (http://jsonapi.org).
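
As a sketch of what a versioned, JSON API style response could look like: the top-level `data` member with `type`, `id`, and `attributes`, and the top-level `links.self`, follow the JSON API document structure. The `/api/v1` prefix, the `projects` type, and the attribute names are assumptions, since the thread does not say which prefix was actually added.

```typescript
// Assumption: the version prefix looks like this; the real one is not stated.
const API_PREFIX = "/api/v1";

// Generic JSON API resource object and single-resource document.
interface ResourceObject<A> {
  type: string;
  id: string;
  attributes: A;
}

interface SingleResourceDocument<A> {
  data: ResourceObject<A>;
  links: { self: string };
}

// Hypothetical project attributes.
interface ProjectAttributes {
  name: string;
  vcsUrl: string;
}

// Build the response body core might return for one project.
function projectDocument(
  id: string,
  attributes: ProjectAttributes
): SingleResourceDocument<ProjectAttributes> {
  return {
    data: { type: "projects", id: id, attributes: attributes },
    links: { self: `${API_PREFIX}/projects/${id}` },
  };
}

const exampleDocument = projectDocument("42", {
  name: "website",
  vcsUrl: "https://example.com/some-group/website.git",
});
console.log(JSON.stringify(exampleDocument, null, 2));
```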

davemackintosh commented 9 years ago

+1 for JSON API.

Strider-CD / core

Design #1
