gocd / gocd

GoCD - Continuous Delivery server main repository

https://www.gocd.org

Apache License 2.0

Feature: pipeline configuration from source control #1133

Closed tomzo closed 8 years ago

tomzo commented 9 years ago

This feature made it! The documentation in GoCD is here.

I have collected the most notable information from the comments below so that no one has to read the whole thread again to understand what has changed in the system and what is expected from it.

Overview of the feature

A user can define pipelines and environments in many source code repositories, called configuration repositories.
A pipeline belonging to a specific group can be specified in a configuration repository.
Configuration in repos may have references to the main XML or to other repos. E.g. a pipeline in repo A depends on another pipeline in the main XML.
A user can provide a plugin to interpret the contents of a single checkout of a config repository in any custom way. E.g. pipelines defined in YAML.
If a configuration repo is the same (by global fingerprint) as one of a pipeline's SCM materials, then they are treated as one.
scm-config consistency - a pipeline running on source code at commit C1 will use configuration at commit C1 as long as they are in the same repository.
Environments from many config repo sources get merged together with environments from the main XML, so an environment definition can be all in one repository or spread across many configuration sources.
Pipelines from many config repo sources get summed together with pipelines from the main XML.

Example configuration repositories

I have prepared example config repositories. In order of complexity:

https://github.com/tomzo/gocd-main-config contains main cruise XML configuration. The one stored in /etc/go/cruise-config.xml
https://github.com/tomzo/gocd-indep-config-part - XML configuration part with no external references.
https://github.com/tomzo/gocd-refmain-config-part - XML configuration part that refers to pipelines from main.
https://github.com/tomzo/gocd-refpart-config-part - XML configuration part with references to other configuration part repository
https://github.com/tomzo/gocd-json-config-example - JSON configuration part

In the main config repository https://github.com/tomzo/gocd-main-config there is a config-repos branch with config-repo sections that import elements from the other repositories.

Domain and concepts

Configuration repository

A configuration repository is a source control repository of any kind that holds part of the GoCD configuration. So far we have referred to this as a config-repo or a partial. However, 'partial' should really be reserved for the parsed configuration object, while the repository is the remote code, yet to be fetched and parsed.

ConfigOrigin

Tells where some part of the configuration comes from. It had to be added because some services now need this extra information to operate. There are 3 types of configuration origins:

the old XML file
one of the configuration repositories
web UI (added in the last branch, 1133-ui)

Base and Merged configuration

There are 2 scopes of configuration:

base - all configuration in cruise-config.xml, also stored and committed in internal configuration git repository
merged - cruise-config.xml + all remote, parsed elements.

These are important at system level because we consider validity twice, first at base scope, then at merged.

Behavior and assumptions

cruise-config.xml is always valid on its own, when no parts are appended yet. This is how it has worked so far, meaning this feature does not break the current stability of the XML config.
When the server (re)starts it loads only the main configuration from XML, so for a while remote pipelines are not in the Go server. Then we wait for material updates of the defined config-repos and configuration merging kicks in: each partial gets merged into the current config.
There is never a situation when an invalid merged config is considered the current config.
Elements defined in a configuration repository should be rendered in the UI, but editing them via the UI should be disabled.

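The rule that an invalid merged config never becomes the current config can be sketched as a small holder that only swaps in candidates that pass validation. This is a minimal illustration, not GoCD code: the class `CurrentConfigHolder`, the method `offer` and the validator predicate are all hypothetical names.

```java
import java.util.function.Predicate;

/** Hypothetical holder (not a GoCD class) that only ever exposes a valid config. */
class CurrentConfigHolder<C> {
    private final Predicate<C> isValid;
    private C current;

    /** Starts from the base config, which is assumed to be valid on its own. */
    CurrentConfigHolder(C validBaseConfig, Predicate<C> isValid) {
        this.isValid = isValid;
        this.current = validBaseConfig;
    }

    /** Tries to promote a freshly merged config; keeps the old one if it is invalid. */
    boolean offer(C candidateMergedConfig) {
        if (!isValid.test(candidateMergedConfig)) {
            return false; // the previous current config stays in place
        }
        current = candidateMergedConfig;
        return true;
    }

    C current() {
        return current;
    }
}
```

With this shape, a broken partial only results in a rejected `offer`; readers of `current()` keep seeing the last configuration that validated.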
Significant cases

When pipeline is defined in configuration repository, there are always 2 cases which actually define how Go server should behave.

When configuration repository that defines the pipeline is the same as one of materials

In automated builds we expect that when a pipeline is triggered with material at revision C1, the configuration of the pipeline will be from the same commit, C1. There is a small (unavoidable) inconsistency here: when a few quick commits (C1, C2, C3) change the pipeline configuration, Go may pick them up before already running builds finish (e.g. configuration has been updated to C2 while stages on C1 are still running). This may fail a build that would have passed if the commits had been slower. However, IMO this is acceptable, because quick commits are usually made by somebody who wanted to fix the previous configuration. There is no way to avoid it because only one pipeline configuration can exist at a time.

In manually triggered builds Go always fetches materials first, which may change the configuration of pipeline that we just triggered.

when changes fetched changed the pipeline configuration then it just runs on new configuration
when changes fetched removed current pipeline then build is canceled.
when changes fetched made current merged configuration invalid, then it will run on old configuration and display a warning.
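The three outcomes for a manually triggered build can be written down as a small decision function. This is only a sketch of the rules listed above; `ManualTriggerDecision`, `Outcome` and the two boolean inputs are hypothetical and do not correspond to actual GoCD classes.

```java
/** Hypothetical decision table for a manual trigger after materials were fetched. */
class ManualTriggerDecision {
    enum Outcome { RUN_ON_NEW_CONFIG, CANCEL_BUILD, RUN_ON_OLD_CONFIG_WITH_WARNING }

    /**
     * @param mergedConfigValid    whether the merged config is valid after the fetch
     * @param pipelineStillDefined whether the new merged config still defines the pipeline
     */
    static Outcome decide(boolean mergedConfigValid, boolean pipelineStillDefined) {
        if (!mergedConfigValid) {
            // The invalid merge is never promoted, so the old configuration is used.
            return Outcome.RUN_ON_OLD_CONFIG_WITH_WARNING;
        }
        if (!pipelineStillDefined) {
            return Outcome.CANCEL_BUILD;
        }
        return Outcome.RUN_ON_NEW_CONFIG;
    }
}
```

Checking validity first is an assumption of this sketch: if the merged config is invalid, the question of whether the pipeline was removed cannot be answered reliably.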

In timer triggered builds Go also fetches materials first, which may change the configuration of pipeline that is being triggered.

when changes fetched changed the pipeline configuration then it just runs on new configuration
when changes fetched removed current pipeline then build is canceled.
when changes fetched made current merged configuration invalid, then what? (I can't see any sensible option; each has major drawbacks.)

When configuration repository that defines the pipeline is not one of materials

This case is much less complex. Go is always polling for changes in configuration repositories and tries to merge them to current configuration. The rules are the same as if the incoming changes were done from UI.

Failures

Hung material

What happens when one material polling gets hung:

when config repo and pipeline material are the same - the latest partial is used. Pipelines that use that material do not get auto-scheduled anyway. No harm. A manual trigger can still be issued.
when config repo and pipeline material are different - the latest partial is then old. If the pipeline were scheduled, it would use configuration from an old commit in the config repo with new commits from the material repos.

Failed parsing

When plugin fails or configuration has invalid format or migration fails in configuration repo checkout then material update completes but config partial is old.

when config repo and pipeline material are the same - if the pipeline were scheduled, it would use the old configuration with a new commit, violating scm-config consistency. It is not allowed to schedule until the partial is fixed. (Actually this is implemented by canceling the build.)
when config repo and pipeline material are different - same as in the hung case. (The latest partial is then old. If the pipeline were scheduled, it would use configuration from an old commit in the config repo with a new commit from the scm repo.)

Handling merges and conflicts

How to handle merging configuration parts and main configuration?

Merges are done at object level. (Meaning first all XML and all repositories are parsed to create BasicCruiseConfig and PartialConfig, then an aggregate object is created - BasicCruiseConfig with merge strategy.)
Merges follow the rules written below.

Environments

Pipelines in environment

Most liberal approach possible:

if any new pipeline name appears then consider it a member of the environment.
if a pipeline name repeats among many configuration parts then just ignore the repetition.

Agents in environment

Most liberal approach possible:

if any new agent uuid appears then consider it a member of the environment.
if an agent uuid repeats among many configuration parts then just ignore the repetition.

Environment variables in environment

if any new 'variable1=value1' appears then consider it a member of the environment.
if 'variable1=value1' repeats then just ignore it.
if the first part has 'variable1=firstvalue' and the second part has 'variable1=othervalue' then it is a conflict and the merged config is invalid.

There could be optional overrides but we can consider it future work.
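The environment variable rules above amount to a union that tolerates exact repeats and rejects differing values. The following is a minimal sketch under that reading; `EnvironmentVariableMerge` and `MergeConflictException` are hypothetical names, and each configuration part is modelled as a plain map rather than a GoCD config object.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical merge of environment variables coming from several configuration parts. */
class EnvironmentVariableMerge {

    /** Signals that two parts define the same variable with different values. */
    static class MergeConflictException extends RuntimeException {
        MergeConflictException(String message) {
            super(message);
        }
    }

    /** Assumes variable values are never null. */
    static Map<String, String> merge(List<Map<String, String>> parts) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> part : parts) {
            for (Map.Entry<String, String> variable : part.entrySet()) {
                // putIfAbsent returns the earlier value when the name was already seen.
                String existing = merged.putIfAbsent(variable.getKey(), variable.getValue());
                if (existing != null && !existing.equals(variable.getValue())) {
                    throw new MergeConflictException("Variable '" + variable.getKey()
                            + "' is defined as '" + existing + "' and '" + variable.getValue() + "'");
                }
                // An identical repeat falls through and is simply ignored.
            }
        }
        return merged;
    }
}
```

Pipelines and agents in an environment follow the simpler variant of the same idea: a set union with no conflict case.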

Pipelines

Final pipeline groups get created as a sum of pipelines in groups in partial configurations
if there are 2 pipelines with the same (case insensitive) name then it is a conflict, configuration is invalid.

Authorization can be only in main xml so it cannot conflict when merging.
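The case-insensitive name rule for pipelines can be checked while summing pipelines from all sources. This is a sketch only: `PipelineNameConflicts`, `findConflict` and the use of plain strings for sources and pipeline names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical check for duplicate pipeline names across configuration sources. */
class PipelineNameConflicts {

    /**
     * @param pipelinesBySource pipeline names keyed by the source that defines them
     * @return a description of the first conflict, or empty when all names are unique
     */
    static Optional<String> findConflict(Map<String, List<String>> pipelinesBySource) {
        Map<String, String> sourceByLowerCaseName = new HashMap<>();
        for (Map.Entry<String, List<String>> source : pipelinesBySource.entrySet()) {
            for (String pipelineName : source.getValue()) {
                String key = pipelineName.toLowerCase(Locale.ROOT); // names compare case-insensitively
                String earlierSource = sourceByLowerCaseName.putIfAbsent(key, source.getKey());
                if (earlierSource != null) {
                    return Optional.of("Pipeline '" + pipelineName + "' is defined in both '"
                            + earlierSource + "' and '" + source.getKey() + "'");
                }
            }
        }
        return Optional.empty();
    }
}
```

A non-empty result means the merged configuration is invalid and must not replace the current one.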

System

Some notes about changes in how Go services work and what is happening when configuration repositories are present.

Services

Here is a summary of new services layout:

Below GoConfigService

renamed (with refactoring) GoConfigDataSource to GoFileConfigDataSource
moved implementation of CachedGoConfig to CachedFileGoConfig
added GoRepoConfigDataSource - holds recent configuration repository parse result (PartialConfig or exception). It is called from top with clean checkout prepared already.
added GoPartialConfig which holds latest set of successfully parsed partial configurations.
added MergedGoConfig where CachedGoConfig used to be - there were many references to old class CachedGoConfig. Now they all reference MergedGoConfig instead. MergedGoConfig understands multiple configuration sources (parts and main).
added CachedGoConfig interface. Implemented by MergedGoConfig and CachedFileGoConfig. Public methods look like those in the old CachedGoConfig class. It is used only to test against. The best explanation is in the commit message https://github.com/tomzo/gocd/commit/80706b39783d286184028c6ab3bc673b38f67bf3
GoConfigFileDao is renamed to GoConfigDao. Almost no changes here.
added GoConfigWatchList - keeps track of list config-repos that should be polled and parsed. Fires events when list has changed.
added GoConfigPluginService - provides a config plugin implementation by name. This service is still TODO and currently always returns default gocd-xml plugin.

The best analogy to get the whole point here is that MergedGoConfig has replaced the old CachedGoConfig. It used to be that CachedGoConfig had 2 instances of configuration in memory (for edit and current config). Now MergedGoConfig holds these two. The main difference is that MergedGoConfig may return the merged configuration as the current config or as the config for edit. If there are no extra configuration parts then it returns the main configuration.

Above GoConfigService

This is implemented mostly how we discussed here

New material update queue

Added a new queue, config-material-update-required. Materials which are configuration repositories are always requested on that queue.
All other materials are on the old queue, material-update-required.
MaterialUpdateService understands both these queues and schedules updates accordingly.

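The routing between the two queues can be sketched as a lookup on the material fingerprint. The queue names come from the description above; the class `MaterialQueueRouting`, the method `queueFor` and the idea of passing a set of config repo fingerprints are assumptions of this sketch, not the actual MaterialUpdateService API.

```java
import java.util.Set;

/** Hypothetical routing of material update requests to one of the two queues. */
class MaterialQueueRouting {
    static final String CONFIG_QUEUE = "config-material-update-required";
    static final String STANDARD_QUEUE = "material-update-required";

    /**
     * @param materialFingerprint    fingerprint of the material to update
     * @param configRepoFingerprints fingerprints of materials registered as configuration repositories
     */
    static String queueFor(String materialFingerprint, Set<String> configRepoFingerprints) {
        return configRepoFingerprints.contains(materialFingerprint) ? CONFIG_QUEUE : STANDARD_QUEUE;
    }
}
```

Keying on the fingerprint reflects the rule that a config repo and a pipeline material with the same fingerprint are treated as one material.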
Unloading queues

Previously 10 MaterialUpdateListeners were unloading material-update-required, then talking to MaterialDatabaseUpdater and posting MaterialUpdateCompleted messages to material-update-completed. Now, using the same classes, there are 2 additional MaterialUpdateListeners unloading from config-material-update-required and posting to config-material-update-completed.
MaterialUpdateService does not listen on the config-material-update-completed topic.

ConfigMaterialUpdater - new component

Added new component - ConfigMaterialUpdater which listens on config-material-update-completed topic. So when MDU is done then ConfigMaterialUpdater gets its chance to work with material being updated:

It uses MaterialRepository to check if there were any changes.
It uses the existing poller code to check out the material to a directory.
It calls GoRepoConfigDataSource (where parsing happens) and when done
it posts to material-update-completed, which is picked up by MaterialUpdateService using the standard procedure, as if this were an old-school material. This removes the material from inProgress status.

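The sequence performed by ConfigMaterialUpdater can be sketched with each collaborator reduced to a small interface. Everything named here (`ConfigMaterialUpdaterSketch`, `ChangeCheck`, `CheckoutStep`, `ParseStep`) is hypothetical and stands in for MaterialRepository, the poller code and GoRepoConfigDataSource; skipping checkout and parsing when nothing changed is also an assumption.

```java
import java.util.function.Consumer;

/** Hypothetical outline of the steps taken after a config material update completes. */
class ConfigMaterialUpdaterSketch {
    interface ChangeCheck { boolean hasNewModifications(String materialFingerprint); }
    interface CheckoutStep { String checkOut(String materialFingerprint); }
    interface ParseStep { void parse(String materialFingerprint, String checkoutDirectory); }

    private final ChangeCheck changeCheck;
    private final CheckoutStep checkoutStep;
    private final ParseStep parseStep;
    private final Consumer<String> materialUpdateCompletedTopic;

    ConfigMaterialUpdaterSketch(ChangeCheck changeCheck, CheckoutStep checkoutStep,
                                ParseStep parseStep, Consumer<String> materialUpdateCompletedTopic) {
        this.changeCheck = changeCheck;
        this.checkoutStep = checkoutStep;
        this.parseStep = parseStep;
        this.materialUpdateCompletedTopic = materialUpdateCompletedTopic;
    }

    /** Invoked for each message on the config-material-update-completed topic. */
    void onConfigMaterialUpdateCompleted(String materialFingerprint) {
        if (changeCheck.hasNewModifications(materialFingerprint)) {
            String directory = checkoutStep.checkOut(materialFingerprint);
            parseStep.parse(materialFingerprint, directory);
        }
        // Hand over to the standard flow so the material leaves inProgress status.
        materialUpdateCompletedTopic.accept(materialFingerprint);
    }
}
```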
Final service notes

Reuse pollers directories

The checkouts (in pipelines/flyweight) are NOT done/updated by standard material pollers when doing update on db (MDU).

But now there is new type of poller that creates full checkout on each update. These directories are now read and parsed by configrepo plugins.

Handling edits

Merged cruise config is returned for edits. When some service is editing the config it does not know if the config is merged or not. It does not have to know.

Adding

When a method to add a pipeline or environment is called, it reaches the merged cruise config at some point. The merged config is then aware that we meant to add to the main part and changes the main config instance (inside the merged cruise config instance).

Removing

Removing is like adding. We can localize where to remove from. If user tries to remove remote element then it fails. Usually it would fail in the cruise config code.

Modifications

Modifications get complex because there are many ways in which they are introduced. This is where there is real benefit from returning merged config instance. Changes are made on the config instance in full merged context so that when anything invalid is attempted then it will throw. E.g. when trying to change name of pipeline group defined remotely.

Saving changes

Each config edit ends with attempt to save some config for edit instance (or deep clone of it, or clone of a clone, etc.). To deal with that - magical writer is aware of possibility that merged config might be passed to be serialized. If so then it takes out only locally defined configuration elements. Actual extraction of local elements is implemented in config-api and it is very easy because we keep and maintain the main configuration instance inside merged config anyway.
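The idea of keeping the main configuration instance inside the merged config, so that only local elements are written out, can be sketched as follows. `MergedConfigSketch` and its methods are hypothetical, and pipelines are reduced to names; the real classes carry full configuration objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical merged config that keeps local (main XML) and remote elements apart. */
class MergedConfigSketch {
    private final List<String> mainPipelines;
    private final List<String> remotePipelines;

    MergedConfigSketch(List<String> mainPipelines, List<String> remotePipelines) {
        this.mainPipelines = new ArrayList<>(mainPipelines);
        this.remotePipelines = new ArrayList<>(remotePipelines);
    }

    /** The full merged view that services read and edit against. */
    List<String> allPipelines() {
        List<String> all = new ArrayList<>(mainPipelines);
        all.addAll(remotePipelines);
        return all;
    }

    /** Additions made through the merged view always land in the main part. */
    void addPipeline(String name) {
        mainPipelines.add(name);
    }

    /** What a writer would serialize: only the locally defined elements. */
    List<String> locallyDefinedPipelines() {
        return Collections.unmodifiableList(mainPipelines);
    }
}
```

Because the main part is stored separately from the start, extraction for saving is a plain accessor rather than a filtering pass over the merged result.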

Pull requests

These are either merged or planned pull requests to make all above work:

1276 - (merged) - domain changes
1330 - (merged) - new xml schema
1331 - (merged) - services below GoConfigService
1332 - (merged) first (internal) plugin which allows to define configuration in repository in XML
1333 - (merged) new MDU workers and second material queue
1536 - (merged) ensures scm-config consistency
1810 - UI changes to disable editing remote configuration. + #1827 to improve user experience
1825 - configrepo extension point

Original post from May 2015

Motivation

Being a big fan of keeping all project-related code in its source code repository I would really like to be able to declare pipeline configuration in the source code of each individual project instead of the global cruise-config.xml. Many people will agree that each project's code should know how to build and test itself. Following this concept it should also know how to CI-itself.

Problem

Currently when all go configuration is in global configuration file on server we basically end up with 2 sources of projects configuration - one being git repository, the other a file on go server. There are lots of scenarios when new changes in git repo will cause the build to break because they expected different pipeline configuration. Or rather pipeline configuration expected the older git repo contents.

Concept

In order to avoid such conflicts probably the <pipeline> section should never be in the global cruise-config.xml, instead go-server should configure pipelines after polling from source repositories.

Final notes

Is anyone interested in such feature or am I crazy? Please provide some feedback on how would you like to see this?
How do you (or your organization) handle the problem described above?
I am not a gocd developer and I am unfamiliar with its source code or development process. But I learn fast and I am very determined to get this done.
I would like to kindly ask the core developers of gocd to point me in the right direction on getting this implemented. What components will need to be updated? How invasive would it be? Can configuration loading and applying be easily replaced with the general schema I described above?

matt-richardson commented 9 years ago

One thing to consider is that you'd have to know the xml schema - you wouldn't be able to use the UI to edit it... Unless there was some way of sync'ing the config for a pipeline back to source control.

zabil commented 9 years ago

There's another conversation on #838 which talks about this. Here are a few things to consider.

Source code is a type of material to a pipeline

Handling pipeline groups and role based auth.

Making pipelines available while configuring environments (http://www.go.cd/documentation/user/current/configuration/managing_environments.html)

Related to code.

For learning how the config xml is loaded/parsed/validated https://github.com/gocd/gocd/blob/master/config/config-server/src/com/thoughtworks/go/config/MagicalGoConfigXmlLoader.java

For operations on the pipeline https://github.com/gocd/gocd/blob/master/server/src/com/thoughtworks/go/server/service/PipelineService.java

Please feel free to join our gitter for more help/discussions.

tomzo commented 9 years ago

@zabil thanks for the tips. I think considerations from https://github.com/gocd/gocd/issues/838 are very much related. I am actually starting to think that the way configuration is handled should be rewritten.

There are a few points in https://github.com/gocd/gocd/issues/838 that form a great specification. Just to clarify:

100% backward compatibility
no more big global xml configuration
validations must exist, but they should not be a bottleneck
xml should not be the only option - suggested by @mrmanc
so possible many sources of pipeline configuration

There is this long comment https://github.com/gocd/gocd/issues/838#issuecomment-73085929 which I think is great insight on this.

My idea of work towards new config implementation

Create abstract configuration provider.
- many implementations are possible (including the old one)
- Components above should reference only this class. Its role would be to load and validate the configuration object from whatever backend is used. It would provide the only source of truth for the rest of go.
Put old implementation as first implementation of the abstract provider. This will guarantee backwards compatibility.
Create a new shiny configuration provider - that should be separate issue.

The implementation could be auto-detected based on contents in go-server configuration directory. E.g. when there is cruise-config.xml then load the old one.

Pipeline configuration as material?

Disclaimer: this is just an idea.

Haven't you noticed that updating pipeline configuration triggers a build, just like a new commit in a material would? I think this is a hidden symptom of a domain model that is not entirely accurate. Let's say that every pipeline has at least a config material which defines all contents of the pipeline. Then

when central go-configuration source is used it is just a config material of all pipelines of organization. Which I think it is the truth when using global config.
when many config sources are used they would show up as config materials only to the pipelines that these sources define.
in very specific case config material and git material could be the same repository - like I wanted in the first place.

What do you think?

Finally I would like to ask for the status of work towards any of the above. The performance discussion seems to have ended with the conclusion that the configuration has to be split into parts, but I don't see many of them yet. I can see the work on moving the go agents config, but IMHO this is just a minor solution to a larger problem. Please approve (or disapprove) this direction, because I really do not want to take on such heavy work alone and then not get it merged.

PS: @matt-richardson I think it would be easy to sync back from UI to source. It could work like editing code on gitlab or github where you just have to add commit message in the end.

zabil commented 9 years ago

@tomzo we are still looking at putting in more fixes to solve immediate issues described on #838.

On direction, waiting for our BDFL @arvindsv to get back from vacation to comment.

mrmanc commented 9 years ago

If we do go down the route of having pipeline configuration as a material it would be great if the first thing the pipeline did was verify the integrity of the pipeline: named dependencies etc. Then you still fail fast and highlight any problems with the person who made the change.

arvindsv commented 9 years ago

@tomzo: I think it's a big discussion, part of which has happened in #838, as you mention. Most of what you say about the abstract configuration provider makes sense. In my view, within the code today, CruiseConfig and its whole tree, is a representation of the configuration. It doesn't have any real link to the XML directly (once it is created). So, it could be a start towards that. The config-api was started so that everything related to the config (its "interface", if you will) was moved there. The config-server module has all the XML reading and writing code.

Another approach to thinking about this could be to think about a clean-slate and ideal implementation and see what it will take to get there, rather than trying to extract a good abstraction out of what exists today. That might be harder, in which case we can continue with the approach you brought up.

The fact that you mention that you're determined to get this done is the reason I think this can actually happen, and the part I admire the most, actually. :) So, if you can keep 100% backward compatibility, write well tested code and make different backend providers nearly pluggable, then I don't see why it cannot get merged.

My worry is that this is a bigger task than you or I anticipate and if you get bogged down, you might get discouraged. :) So, if you take small steps towards it, and involve the rest of us in what you're doing, we might be able to help. Of course, you might be the kind who doesn't get discouraged, and that's great! But, still if you involve others, it'll be easier to see progress and maybe merge smaller bits of code than the whole thing at once.

arvindsv commented 9 years ago

You asked about the status of work: For the performance bit, my feeling is that the config save does not need to move out of XML for it to be fast (though it can/should move out for flexibility reasons as you mention). I think that keeping the state of the configuration in memory and not deserializing the whole thing all the time and validating everything all the time will help. @ketan and I have talked about this, and started work towards it, but got distracted. We should get back to it. Elaborating on that idea: For instance, adding a task does not require a full config revalidation. It cannot affect another pipeline or be invalid, unless it is a fetch artifact task. So, handling the validation at that level, and so on. This should be quick. Needs some work to put in the framework to allow that, as well as some changes to the UI controllers.

arvindsv commented 9 years ago

The idea about config material repo(s) might need a little more clarity for me to comment on it, but from what I understand, I think it can be handled as one of the providers. The provider just happens to be reading from a repository of files, and deciding that some of them are pipelines. As long as the config interface is maintained, it should be fine. Of course, there might need to be some work done to poll that repository for changes, especially if you say that the <pipeline> tag itself shouldn't be in the config, but should be in the config material repository itself.

mrmanc commented 9 years ago

If the pipeline tag is not in the configuration, then where would the configuration repository be specified?

I’m still really keen that configuration integrity is enforced, so if configuration was read from a repo, then could it be validated as the first thing a build does? That would keep things failing fast…

arvindsv commented 9 years ago

... then could it be validated as the first thing a build does?

I agree. I think it should, and it probably would be implemented that way.

If you look at the original post, @tomzo said:

In order to avoid such conflicts probably the <pipeline> section should never be in the global cruise-config.xml, instead go-server should configure pipelines after polling from source repositories.

That's what I was talking about. If <pipeline> is not in the config, then maybe the config has something like:

<config>
    ...
    <pipeline-repo url="..." type="git" />
   ...

I don't know. It's just a guess. That's why I mentioned needing more clarity.

But, at a high level, this is about the "pluggability" of the whole config, essentially. That can be done, if we have an abstract provider interface. If we have that, then making it read from a repository of pipeline-level declaration files, instead of one XML should be easy, I feel.

tomzo commented 9 years ago

Validation

For instance, adding a task does not require a full config revalidation. It cannot affect another pipeline or be invalid, unless it is a fetch artifact task. So, handling the validation at that level, and so on. This should be quick.

This is something that I was thinking about as well. There is a lot of configuration that can be completely validated at lower level than global.

Config as material

The idea about config material repo(s) might need a little more clarity for me to comment on it, but from what I understand, I think it can be handled as one of the providers

This is what I had in mind. It seems simple to do assuming going into abstract provider direction.

if configuration was read from a repo, then could it be validated as the first thing a build does?

I think it is the only way to do it. The pipeline objects have to be built from that repo. I assume that we validate while building.

If the pipeline tag is not in the configuration, then where would the configuration repository be specified?

It would be something similar to what @arvindsv guessed. The way I see configuration in far future would be something like

<config>
    <config-part-provider type="file" url="/some/local/path" />
    <config-part-provider type="git" url="some git repo url" />
    ... 
</config>

The idea about config material repo(s) might need a little more clarity for me to comment on it

I do not have details yet, but a few problems made me think that pipeline configuration should be a material. Consider this. Let's assume:

we have the git config provider implemented
there is some project with all its code and pipelines defined in single git repo

Then go pulls that repository for 2 reasons - because it is a config and a git material of a pipeline. It should be aware that configuration repository and pipeline material is the same entity in this case. Otherwise it potentially could use different commits for config and the rest of source code, which is the inconsistency that I wanted to avoid in the first place. When there are more pipelines the situation gets even more complex. And the fan-out and fan-in core feature would just solve it as long as it 'sees' configuration as just another material.

There is also this observation that I mentioned earlier - go already behaves as if configuration was a material. But it does not model it like one.

Next steps

So, if you take small steps towards it, and involve the rest of us in what you're doing, we might be able to help.

I was hoping to hear that. I see that it is major work of actually unknown effort. I think I will make first move towards that abstract configuration provider and implement xml config provider. Then I think I should merge to prove that config provider is abstracted and still working. Then I would head towards newer implementation. As I stated initially - I am interested in having provider from git repo(s) so that would be the one I would implement. I do not want to specify more details now before digging deeper into what old implementation looks like.

I was waiting on @arvindsv to comment and started working on something else in the mean time. So I will get back to gocd in few days. When I start I hope to get some help on how old configuration works, (and annoy you a little bit during the day). I hope that is OK with you.

But, at a high level, this is about the "pluggability" of the whole config, essentially. That can be done, if we have an abstract provider interface. If we have that, then making it read from a repository of pipeline-level declaration files, instead of one XML should be easy, I feel

Exactly. I think this is the most critical part to start any progress with configuration issues. I also think there should be a very detailed abstract test suite to be the de facto specification of it and to run against any configuration provider implementation.

arvindsv commented 9 years ago

When I start I hope to get some help on how old configuration works, (and annoy you a little bit during the day). I hope that is OK with you.

Sure. Let me know when and I'll help you get started.

arvindsv commented 9 years ago

Then go pulls that repository for 2 reasons - because it is a config and a git material of a pipeline. It should be aware that configuration repository and pipeline material is the same entity in this case. Otherwise it potentially could use different commits for config and the rest of source code, which is the inconsistency that I wanted to avoid in the first place.

Right. If commit C1 changes code and commit C2 of the same repo changes pipeline config (which is just a file in the same repo), then there should be a build with both code and pipeline config at C1, and another build with both code and pipeline config at C2. But, there should never be a build where code is at C1 and pipeline config is at C2.

One of the reasons for mentioning <include file="my-file.xml" repo="scripts" type="go-xml" /> in #838 was to get to that. If modeled as:

<pipeline name=...>
  <materials>
    <git name="code" .../>
    <git name="scripts" .../>
  </materials>

  <include file="my-file.xml" repo="scripts" type="go-xml" />
</pipeline>

... then, a change to either repository will trigger a build. But, when modeled like this:

<pipeline name=...>
  <materials>
    <git name="code_and_scripts" .../>
  </materials>

  <include file="my-file.xml" repo="code_and_scripts" type="go-xml" />
</pipeline>

... then, a change to the repository, for either code or pipeline config, will always be consistent. This just reuses Go's usual material and does not explicitly bring in the concept of a config material. I felt that it would make it easier to implement and understand. It is also flexible enough to model both cases.

When there are more pipelines the situation gets even more complex. And the fan-out and fan-in core feature would just solve it as long as it 'sees' configuration as just another material.

Yes. There is a concept of uniqueness of a material, in Go. Materials which are considered unique across different pipelines are not polled multiple times. That'll need to apply here as well.

There is also this observation that I mentioned earlier - go already behaves as if configuration was a material. But it does not model it like one.

You're right. It is modeled only implicitly as a material which is checked for changes. Any changes to it from the filesystem or from the UI does cause the system to change, and potentially trigger pipelines. But, I think changing a task definition (for instance) doesn't cause a pipeline to trigger. Changing a material could. In that sense, it is not a full fledged material.

tomzo commented 9 years ago

Yesterday I started getting familiar with the current implementation of Go. I've set up a dev environment and I can debug the 'development server', which is a lot of help together with searching through classes in IDEA. I have already spent a few hours trying to figure out how it works and where a good starting point would be. Could you please correct me if I am wrong and answer some of the questions I have below? Is there some technical/architectural documentation on Go that I do not know of?

PipelineConfig class

It seems to me that this is what actually defines how the most recent pipeline should be built. Correct? Is the PipelineConfig class fully defined by what the <pipeline> tag contains? It is just a part of the bigger CruiseConfig. It is what I eventually want to be created from source control.

PipelineGroups class

what is this class responsible for? and what are these map objects for?

private Map<String, List<Pair<PipelineConfig, PipelineConfigs>>> packageToPipelineMap;
private Map<String, List<Pair<PipelineConfig, PipelineConfigs>>> pluggableSCMMaterialToPipelineMap;

PipelineConfigService class

This class seems not to be used anywhere. Is there a reason, or is this just 'tech debt', an old file? I am asking because it seems that it would be easier to start from something like PipelineConfigService than GoConfigService.

`@Autowired, @Service and @Component`

What are these for and how do I use them? It seems they are used in consumer-service cases.

Current git repo for cruise-config.xml

There is some branching and merging functions in the code. Could you explain how is current cruise-config.xml git repository used?

Polling

Please explain how polling currently works. I can see the poller classes and some messaging code, and I was running the debugger, but I cannot get it through my head. What components, events, services, messages are involved?

@arvindsv you have said that

In my view, within the code today, CruiseConfig and its whole tree, is a representation of the configuration. It doesn't have any real link to the XML directly (once it is created). So, it could be a start towards that.

While it is true, it seems to me that CruiseConfig is way too big to build 'configuration provider' of it. Mostly because if you look at its contents it does not make sense to use more sophisticated methods than xml to store stuff like agents and passwords, etc.

I think I am not going to create abstraction because it would be too hard with just one implementation. Instead I am going to head towards implementing feeding configuration from git and then make it work with older xml config.

@arvindsv You mention this approach instead of config as material

<pipeline name=...>
  <materials>
    <git name="code_and_scripts" .../>
  </materials>

  <include file="my-file.xml" repo="code_and_scripts" type="go-xml" />
</pipeline>

But do you see how it could work when many pipelines would be defined from same git repository? I do see that making config as material would be very invasive:

all pollers are in server module while PipelineConfig is in config-server.
there is a lot of code that expects PipelineConfig(s) to be present first before running pipelines (and polling materials?). So that makes a chicken-egg problem when config has to be polled. This makes me think that some other component, running earlier, must be responsible for polling all configuration sources.

There is scms section in CruiseConfig which I never heard of before. If there was a single service responsible for polling all of these scms and presenting their changes further then GoConfigSource could use that and GoConfigFileDao to assemble final CruiseConfig.

Let's call this problem SCM-config consistency :

Right. If commit C1 changes code and commit C2 of the same repo changes pipeline config (which is just a file in the same repo), then there should be a build with both code and pipeline config at C1, and another build with both code and pipeline config at C2. But, there should never be a build where code is at C1 and pipeline config is at C2.

This problem will remain as long as it is not addressed. I mean, no matter what changes we make now, the current model is that there is only one valid, latest PipelineConfig. In reality that is not true: there is a whole history of pipeline config. It should be possible to trigger a build for any commit C2 and use both config and code from that commit consistently. This again makes me think that there should be a service responsible for:

polling (at least) all source materials where some of which may have pipeline definitions therefore PipelineConfig objects. But not just the last one. Something like ConfigRepository but with smaller scope

arvindsv commented 9 years ago

Note: Some part of conversation about this, here.

tomzo commented 9 years ago

I'd like to add to list of those questions:

How is uniqueness of materials achieved now?

I am asking because I might have some clue on making a config material. But I must hook into previous identity system.

arvindsv commented 9 years ago

I'll answer the earlier questions today (in a few hours, sorry). About the uniqueness of materials: It's a hash of all the relevant properties of a material. The definition of "relevant" is dependent on the kind of material.

Take a look at this, this and this. In this case (SVN), the fields used for uniqueness are type ("svn"), url, username and checkExternals flag.
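A minimal sketch of the hashing idea described above, using the four SVN fields just listed. The class name, the `|` separator and the choice of SHA-256 are assumptions made for illustration; this is not Go's actual fingerprint code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a GoCD class.
final class SvnFingerprintSketch {
    // Hashes only the fields that define identity for an SVN material.
    static String fingerprint(String url, String username, boolean checkExternals) {
        String key = String.join("|", "type=svn", "url=" + url,
                "username=" + username, "checkExternals=" + checkExternals);
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Two materials with the same url, username and checkExternals flag get the same fingerprint, so they would be polled once; changing any one of those fields gives a different fingerprint.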

arvindsv commented 9 years ago

PipelineConfig class

It seems to me that this is what actually defines how the most recent pipeline should be built. Correct?

Correct.

Is the PipelineConfig class fully defined by what the <pipeline> tag contains? It is just a part of the bigger CruiseConfig.

Yes. It's part of the config, and it fully defines what the <pipeline> tag has.

It is what I eventually want to be created from source control.

Looks like it to me.

PipelineGroups class

what is this class responsible for? and what are these map objects for?

It is responsible for the <pipelines> tag. It is what holds a group of pipelines. A group of pipelines can have authorization related to them. It can be used to separate pipelines in the system into groups or teams, giving them the ability to administer only their own pipelines.

Those two maps can be ignored. They're a local cache for these two methods. They're creating a map from a material to the pipelines they're in, etc. That's an extremely expensive operation, and those maps help to not run them again and again.
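To illustrate the caching idea only, here is a simplified sketch. It uses plain strings instead of PipelineConfig and PipelineConfigs, and every name in it is hypothetical; the real PipelineGroups class is more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the cache; not the real PipelineGroups.
final class MaterialIndexSketch {
    private final Map<String, List<String>> materialsByPipeline; // pipeline name -> material ids
    private Map<String, List<String>> materialToPipelines;       // lazily built cache
    int rebuilds = 0;                                            // exposed only to show the cache working

    MaterialIndexSketch(Map<String, List<String>> materialsByPipeline) {
        this.materialsByPipeline = materialsByPipeline;
    }

    // The expensive inversion runs once; later calls reuse the map.
    List<String> pipelinesUsing(String materialId) {
        if (materialToPipelines == null) {
            materialToPipelines = buildIndex();
        }
        return materialToPipelines.getOrDefault(materialId, List.of());
    }

    private Map<String, List<String>> buildIndex() {
        rebuilds++;
        Map<String, List<String>> index = new HashMap<>();
        for (Map.Entry<String, List<String>> e : materialsByPipeline.entrySet()) {
            for (String material : e.getValue()) {
                index.computeIfAbsent(material, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return index;
    }
}
```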

PipelineConfigService class

This class seems not to be used anywhere. Is there a reason, or is this just 'tech debt', an old file? I am asking because it seems that it would be easier to start from something like PipelineConfigService than GoConfigService.

It's used here. This is from the admin UI. If you try to delete a pipeline which is a dependency material for another pipeline, this code should get executed.

@Autowired, @Service and @Component

What are these for and how do I use them? It seems they are used in consumer-service cases.

They're from Spring. They're used for dependency injection. Maybe these help: http://simplespringtutorial.com/annotations.html http://stackoverflow.com/questions/6594908/spring-autowire-fundamentals

For most service packages, etc. autowiring has been set up (meaning, Spring has been told to scan those packages for annotations such as @Service, etc. and to automatically instantiate them and inject their dependencies).
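As a rough picture of what that scanning saves you from writing, here is a plain-Java sketch with no Spring dependency. All class names are hypothetical; the comments mark where the Spring annotations would normally sit.

```java
// Plain-Java illustration of what Spring's annotations automate; no Spring is used here.
// ConfigReader, FixedConfigReader and PipelineLister are hypothetical names.
interface ConfigReader {
    String currentConfig();
}

// With Spring, a class like this would typically carry @Component or @Service.
final class FixedConfigReader implements ConfigReader {
    public String currentConfig() {
        return "<cruise/>";
    }
}

// With Spring, this would carry @Service and the constructor would be @Autowired.
final class PipelineLister {
    private final ConfigReader reader;

    PipelineLister(ConfigReader reader) {
        this.reader = reader;
    }

    int configLength() {
        return reader.currentConfig().length();
    }
}

// Manual wiring: the step that the component scan performs for you.
final class ManualWiring {
    static PipelineLister wire() {
        return new PipelineLister(new FixedConfigReader());
    }
}
```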

Current git repo for cruise-config.xml

There is some branching and merging functions in the code. Could you explain how is current cruise-config.xml git repository used?

The current git repository is just a store of historical changes to the config. While the real (current) config is usually in a location like /etc/go/cruise-config.xml, the one in /var/lib/go-server/db/config.git is just a copy, and is the latest valid config known. git log on that will show you all the changes made to the config, over time. The log message of every commit has a specific format and the format is used by Go. Making a commit there directly or indirectly is not recommended. The repo is also used to try and merge concurrent changes. If there are 10 pipelines, and I make a change to pipeline 1 and you make a change to pipeline 5, Go uses git to try and merge those changes, so that a user does not need to redo their changes.

I'd recommend not trying to use this repository at all.

I'll reply to the rest of the post in a separate reply.

arvindsv commented 9 years ago

Polling

Please explain how polling currently works. I can see the poller classes and some messaging code, and I was running the debugger, but I cannot get it through my head. What components, events, services, messages are involved?

What polling are you thinking about? Material polling (git, etc)? This might help, as a start. If you mean polling for config changes, Go just checks the config at /etc/go/cruise-config.xml every few seconds to see if it has changed from the latest known config.

I think I am not going to create abstraction because it would be too hard with just one implementation. Instead I am going to head towards implementing feeding configuration from git and then make it work with older xml config.

Ok. You need to remember that not all material information is inside <pipeline>. Package repository plugins and SCM plugins store their material information at the top level, outside of <pipeline> and they have a reference inside the <pipeline> tag.

Later, related to this, you said:

There is scms section in CruiseConfig which I never heard of before.

It is very new. It's for SCM plugins. It's similar to <repositories> tag. That's what I mentioned just above.

But do you see how it could work when many pipelines would be defined from same git repository? I do see that making config as material would be very invasive:

all pollers are in server module while PipelineConfig is in config-server. there is a lot of code that expects PipelineConfig(s) to be present first before running pipelines (and polling materials?). So that makes a chicken-egg problem when config has to be polled. This makes me think that some other component, running earlier, must be responsible for polling all configuration sources.

Not exactly sure I get what you're saying about the pollers. Since the server module can access config-server, we could set it up so that the pollers in the server notify some service in the config-server when some "config repositories" have changes that need a reload of the config.

You're also right that the current code expects the configs to be present. We talked about this earlier, over chat, I guess. We might have to have a different poller, for config materials only. I can see it as something like this:

Existing pollers continue to do what they do. They don't poll config materials.
We write different pollers for config materials (using the code for git, svn, etc. already present).
These new pollers are used to get a globally valid config, that can be used. Or, we can bring in the concepts of scopes, etc. Once they have a config, the original pollers can use the config from this module (config module) to get the new set of code-materials they need to poll.
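The three steps above could be sketched roughly as follows. This is only an illustration of the ordering: a config repository is modelled as a map from pipeline name to the code material urls it declares, and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the two-phase idea; none of these types exist in Go.
final class TwoPhasePollingSketch {
    // Phase 1: "config pollers" turn each config repo into the pipelines it defines.
    // Each repo is modelled as: pipeline name -> code material urls.
    static Map<String, List<String>> pollConfigRepos(List<Map<String, List<String>>> configRepos) {
        Map<String, List<String>> pipelines = new LinkedHashMap<>();
        for (Map<String, List<String>> repo : configRepos) {
            pipelines.putAll(repo);
        }
        return pipelines;
    }

    // Phase 2: the existing pollers derive the unique set of code materials from that config.
    static Set<String> materialsToPoll(Map<String, List<String>> pipelines) {
        Set<String> materials = new LinkedHashSet<>();
        for (List<String> pipelineMaterials : pipelines.values()) {
            materials.addAll(pipelineMaterials);
        }
        return materials;
    }
}
```

The point of the split is that the second phase never has to know where a pipeline definition came from; it only sees the config produced by the first phase.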

If there was a single service responsible for polling all of these scms and presenting their changes further then GoConfigSource could use that and GoConfigFileDao to assemble final CruiseConfig.

Yes, that's another option, pulling out the polling as a module below both the server and config-server, so that they can both access them. The polling module will need to be quite generic. Today, the pollers take the commits they find, and put them into the DB. For config-material, the pollers don't need to do that. They need to provide the commits to the config-server module, which can then take action (refresh its notion of config).

Final post on this coming up.

arvindsv commented 9 years ago

Let's call this problem SCM-config consistency: ... [snip] ... This problem will occur as long as not addressed. I mean no matter the changes we would do now the current model is that there is only one valid, latest PipelineConfig. Which in reality is not true, there is whole history of pipeline config. It should be possible to trigger a build for any commit C2 and use both config and code from that commit consistently.

Currently there's only one valid latest config. It's useful in one way. If you have a build, and it fails, you can change the config and re-run the stage or some jobs and it will use the new config. Not the old failed config. When a pipeline run is tied to its config through a repo, then you lose this ability. A rerun of a stage or job will re-use the config for that time.

I have mixed feelings about this. Though I like the ability to re-run a job, I think re-using the old config is the correct thing to do. However, the more you put into the config from the repository, the more there is a chance to go wrong.

Going back to C1 and C2: As I said, C1 and C2 were commits in the same repository (say R1). So, if config is at C2, then code is also (should also be) at C2, since the repository is the same. Doesn't that address that problem? If we're flexible, we should be able to have it such that code commit C1 comes from repo R1 and config commit C2 comes from repo R2, which gives the user the ability to mix and match (and possibly run inconsistent config with inconsistent code). Right? There's flexibility in that approach, but it allows inconsistencies. Either way, tying code to config in the same repository should solve the problem we were talking about (scm-consistency).
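The rule being discussed can be written down in a few lines. This is a sketch under the assumption that repositories are compared by a simple string identity; the class and method names are hypothetical.

```java
// Hypothetical sketch of the scm-consistency rule; not Go code.
final class ConsistencySketch {
    // Returns the config revision a build of codeRevision should use.
    static String configRevisionFor(String codeRepo, String codeRevision,
                                    String configRepo, String latestConfigRevision) {
        // Same repository: config is pinned to the code revision,
        // so code at C1 never runs with config at C2.
        if (codeRepo.equals(configRepo)) {
            return codeRevision;
        }
        // Different repositories: nothing ties them together, so the latest config is used.
        return latestConfigRevision;
    }
}
```

With code and config both in R1, a build of C1 gets config C1. With config in a separate R2, the same build gets whatever R2's latest commit is, which is the flexible but potentially inconsistent case.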

arvindsv commented 9 years ago

Maybe our thinking needs to be broader? Starting with something like:

Aspiration: I want to be able to have all my pipeline configuration information in an (one only, for now) external repository. The config that Go knows about should be like this:

<cruise> (or <go>)
  <server>
    ...
  </server>

  <pipeline-repo plugin="git.config.repo" url="git://something" />

  <agents>
     ...
  </agents>
</cruise>

We could then give that config information (url, etc) to the plugin and wash our hands of it. It is the plugin's responsibility now, whenever asked by Go, to give back a list of pipelines. We'd need to figure out environments and other concepts, if we're returning something equivalent to a list of <pipeline> (or PipelineConfig) objects. But, it's a way of thinking.

The plugin can then poll that repository and have some kind of a convention. Say, every file with the extension .pipeline is a candidate to be considered a pipeline. It then polls all of them, resolves dependencies between them and gives it back to Go.
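A minimal sketch of the file-discovery half of such a convention, assuming the plugin already has a local checkout of the repository. The class name is hypothetical, and parsing and dependency resolution are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper a plugin might use; not part of Go.
final class PipelineFileFinderSketch {
    // Lists every regular file under the checkout whose name ends with ".pipeline",
    // as paths relative to the checkout root, in sorted order.
    static List<String> candidates(Path checkoutRoot) {
        try (Stream<Path> paths = Files.walk(checkoutRoot)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".pipeline"))
                    .map(p -> checkoutRoot.relativize(p).toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```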

To take it further, we can even allow both <pipeline> and <pipeline-repo> to exist in the configuration. It could make it a little harder to make sure that the whole config is valid, but it allows flexibility to move from the current config, without forcing anyone to. Something like this:

<cruise> <!-- or <go> -->
  <server>
    ...
  </server>

  <scms>...</scms> <!-- Used by the <pipeline> tag below. -->
  <repositories>...</repositories> <!-- Used by the <pipeline> tag below. -->
  <templates>...</templates> <!-- Used by the <pipeline> tag below. -->

  <pipeline-repo plugin="git.config.repo" url="git://something_team1" />

  <pipeline name="...">
    ...
  </pipeline>

  <pipeline-repo plugin="git.config.repo" url="git://something_team2" />

  <agents>
     ...
  </agents>
</cruise>

I'd recommend not having a frankenstein config like this, but it keeps the old config valid, while allowing the big config pieces to be moved out and managed elsewhere. This is just an opinion/idea. I'd like others like @jyotisingh, @mdaliejaz, @zabil, @ketan, etc. to weigh in.

tomzo commented 9 years ago

Thank you very much for all these answers.

What polling are you thinking about? Material polling (git, etc)? This might help

Thanks I skipped that by mistake. It is enough.

They're from Spring

I'll learn spring then.

I'd recommend not trying to use this repository at all.

I thought so, but had to be sure because it could be a hint.

Today, the pollers take the commits they find, and put them into the DB. For config-material, the pollers don't need to do that. They need to provide the commits to the config-server module, which can then take action (refresh its notion of config).

I do not agree with that. That would imply that there is only one valid, most-fresh config.

Currently there's only one valid latest config. It's useful in one way. If you have a build, and it fails, you can change the config and re-run the stage or some jobs and it will use the new config. Not the old failed config. When a pipeline run is tied to its config through a repo, then you lose this ability. A rerun of a stage or job will re-use the config for that time.

I was assuming that if a user is going for config in a repo then he/she is willing to give up some of the operations. The above would be an example of that.

At least in git it could be still done by using the commit amends.

I have a few points more. I will post soon.

tomzo commented 9 years ago

I have now considered many of the approaches, which are referenced above and in #838. I was digging in the code to get an idea of what can actually be changed with relatively small modifications, by adding code rather than changing the old one.

There are few points which I am quite certain about:

CruiseConfig and all xml loading is designed to be static and globally valid. Let's keep it that way. We will just add a new section <scm-configs> with list of extra sources to poll and load. This is similar to what @arvindsv just mentioned above.
definitely configuration object should be modeled as 2 parts: static+dynamic. Static is delivered via xml, it is globally valid, it is the old xml config. Dynamic requires polling to get it, then objects of dynamic configuration are created and passed further.
if there is a SCM configuration material then it is not a material of single pipeline. It cannot be a member of PipelineConfig class. configuration material is a member of class with larger scope - something like PipelineConfigGroup. It would be first step towards those validation scopes we talked about.

arvindsv commented 9 years ago

Today, the pollers take the commits they find, and put them into the DB. For config-material, the pollers don't need to do that. They need to provide the commits to the config-server module, which can then take action (refresh its notion of config).

I do not agree with that. That would imply that there is only one valid, most-fresh config.

Not necessarily only. There is (and I think will be) the concept of a valid, most-fresh config (maybe not at the global level, but at least at the pipeline level). You need this to schedule a new pipeline, when a code commit happens. Of course, that could be a config commit itself, in which case the most-fresh config is the one for that commit. Will have to check validity.

However, you need older commits of the config, only for reruns of an old pipeline, right? So, I don't see why they need to be in the Go DB. Especially if all of this is happening in a plugin. I'd just get the config for that point in time using the repository itself, on demand.

That's what I think. Let me know what I'm not considering.

[Update: Of course, if we want to store it in the DB for some reason for a rerun, it's doable]

tomzo commented 9 years ago

However, you need older commits of the config, only for reruns of an old pipeline, right? So, I don't see why they need to be in the Go DB. Especially if all of this is happening in a plugin. I'd just get the config for that point in time using the repository itself, on demand.

That will do it. I was going for ability to rerun.

There is (and I think will be) the concept of a valid, most-fresh config (maybe not at the global level, but at least at the pipeline level).

Yes. But I was referring to 'refresh its notion of config'

They need to provide the commits to the config-server module, which can then take action (refresh its notion of config)

I just wanted to note that we should not implement a situation when polled configuration part would update some configuration instance, especially CruiseConfig.

I would use the static config and dynamic config separation I mentioned above. I think server module should be aware of that separation. config-server would only provide urls to configuration repos. This is what I came up with after digging in code.

There is also this problem: If there is more than one pipeline in config repo then how do we imagine rerunning just one pipeline at some older revision?

But I think it is already answered above to some extent.

tomzo commented 9 years ago

@arvindsv in the broader approach https://github.com/gocd/gocd/issues/1133#issuecomment-109014208 you mention <pipeline-repo> and that it could return a list of pipelines. And that then we have to figure out environments and other elements. Why not <config-repo> that can return much more than just pipelines? It would be allowed to return pipelines and environments. Anything that would make sense storing in repo.

I am not in favor of adding a huge feature at once. I just think the work towards <config-repo> and <pipeline-repo> would be very much alike.

I would also love to hear opinions of others.

arvindsv commented 9 years ago

Why not <config-repo> that can return much more than just pipelines?

Sure. That's fine. As long as it is not too complicated. I'd leave it at only pipelines and environments for now. Not everything else. There needs to be something that merges information in the config, with information from the (multiple?) <config-repo> sections. That becomes more complicated as we add more things. For instance, if we allow environments there, what happens if there's an environment with the same name outside. Is it an error? Should they be merged? Etc.

arvindsv commented 9 years ago

I just wanted to note that we should not implement a situation when polled configuration part would update some configuration instance, especially CruiseConfig.

I would use the static config and dynamic config separation I mentioned above. I think server module should be aware of that separation. config-server would only provide urls to configuration repos.

This might become hard to do, given that the rest of the system (for instance the scheduler, material subsystem, the dashboard, the whole admin UI bit) expect to call something like GoConfigService.give_me_all_pipelines and expects to get the current known set of pipelines, so that they can be edited, shown on the dashboard, materials polled for them, etc.

[I'm away for a bit. Will be back and think about this some more]

tomzo commented 9 years ago

expect to call something like GoConfigService.give_me_all_pipelines and expects to get the current known set of pipelines

I noticed that already. I am currently evaluating how much would it take to have GoConfigService that would understand concept of historical configuration. So that there wouldn't be methods like

public boolean isPipelineEmpty()

But rather something like

public boolean isPipelineEmpty(unambiguous definition of configuration at some point in time)

The good news is that both these methods could co-exist. So maybe this can be implemented in some low components and gradually introduced up.

But these are killers at the moment:

public CruiseConfig getCurrentConfig();

And CruiseConfig has

public List<PipelineConfig> allPipelines()

arvindsv commented 9 years ago

I wonder if the some point in time part of "unambiguous definition of configuration at some point in time" can be "now". :) It is unambiguous at that time. It can change over time, and that's ok.

As you say, getCurrentConfig is a killer. I think that's because it's fundamental to how config is, in the system (presently). It is assumed to exist. I hesitate to try and change it, because I feel it is too ingrained to change. With the idea about "now" above, I'm trying to see if there's a way to reconcile the two, bringing in the concept of a default of "now", unless specified otherwise. Just a thought.
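One way to picture the "default of now" idea is an overload that resolves to the latest revision unless told otherwise. This is a sketch only: the class, the `NOW` marker and the revision-to-stage-count map are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch; none of these names exist in Go.
final class VersionedConfigSketch {
    static final String NOW = "now"; // assumed marker meaning "latest known revision"

    private final Map<String, Integer> stageCountByRevision; // revision -> stages in the pipeline
    private final String latestRevision;

    VersionedConfigSketch(Map<String, Integer> stageCountByRevision, String latestRevision) {
        this.stageCountByRevision = stageCountByRevision;
        this.latestRevision = latestRevision;
    }

    // New style: the caller says which point in time it means.
    boolean isPipelineEmpty(String revision) {
        String resolved = NOW.equals(revision) ? latestRevision : revision;
        Integer stages = stageCountByRevision.get(resolved);
        if (stages == null) {
            throw new IllegalArgumentException("unknown revision: " + revision);
        }
        return stages == 0;
    }

    // Old style keeps working by defaulting to "now".
    boolean isPipelineEmpty() {
        return isPipelineEmpty(NOW);
    }
}
```

Existing callers keep using the no-argument form, while newer code can pass an explicit revision, so both can co-exist during a gradual migration.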

tomzo commented 9 years ago

If some global configuration consists of only 3 parts, all of them being SCMs, then "unambiguous definition of configuration at some point in time" could be something like first-svn-r432|some-git-hash3344fff5|second-svn-213. And since the current cruise-config.xml is also a git repo, it can be part of such an identifier.

I'm trying to see if there's a way to reconcile the two, bringing in the concept of a default of "now"

I guess "now" would be latest commit in each of those first-svn-latest|some-git-latest|second-svn-latest. Is that what you meant?
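Such an identifier is just the per-source revisions joined in a fixed order. A small sketch, with a hypothetical class name and the `-` and `|` separators taken from the example above:

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper; not Go code.
final class ConfigSnapshotIdSketch {
    // Joins "source-revision" pairs with '|', in the map's iteration order.
    // Callers should pass an ordered map (e.g. LinkedHashMap) so the id is stable.
    static String identifier(Map<String, String> revisionBySource) {
        return revisionBySource.entrySet().stream()
                .map(e -> e.getKey() + "-" + e.getValue())
                .collect(Collectors.joining("|"));
    }
}
```

Passing a LinkedHashMap with first-svn=r432, some-git=hash3344fff5 and second-svn=213 yields first-svn-r432|some-git-hash3344fff5|second-svn-213. For "now", the same call is made with each source's latest revision.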

arvindsv commented 9 years ago

I guess "now" would be latest commit in each of those first-svn-latest|some-git-latest|second-svn-latest. Is that what you meant?

Yes, that's right. The equivalent of HEAD. We can then put all of those together and then see what to do about possible invalidity of the config.

When I think about a config such as:

<cruise>
   <pipeline name="P1">...</pipeline>
   <pipeline name="P2">...</pipeline>
   <config-repo name="abc" url="git://abc"></config-repo>
       <!-- Assume repo abc has two pipelines defined in it: "P3" and "P4" -->
</cruise>

then, the "now" (and the "current/latest config"), according to me is a combination of the current config of "P1" and "P2" from this config and the configs of "P3" and "P4" as described by the HEAD commit of repo "abc".

This is what will be used to show the dashboard (assume no pipelines are running), and for polling of materials (code materials, not config materials).

Reading what you said about "-latest" in the previous comment, I think we're thinking the same thing.

arvindsv commented 9 years ago

If I've understood you correctly, the config for your first-svn-latest example looks something like this:

<cruise>
   <config-repo name="abc1" url="svn://abc1" type="svn"></config-repo> <!-- Has first-svn.pipeline or some equivalent. -->
   <config-repo name="abc2" url="git://abc2" type="git"></config-repo> <!-- Has some-git.pipeline or some equivalent. -->
   <config-repo name="abc3" url="svn://abc3" type="svn"></config-repo> <!-- Has second-svn.pipeline or some equivalent. -->
</cruise>

tomzo commented 9 years ago

New design

This is a feature specification, design, and draft of implementation based on all discussions so far. I will update it as we talk.

Features

user can define pipelines and environments in many source code repositories - configuration repositories
configuration in repos may have references to main xml or to other repos. E.g. pipeline in repo A depends on other pipeline in main xml.
user can provide a plugin to interpret contents of single checkout of config repository in any custom way. E.g. pipelines defined in yaml
change in configuration repository triggers pipelines defined in that config repo. As if configuration repo was just another scm material of a pipeline.
if configuration repo is the same (by fingerprint) as one of pipelines scm material then they are treated as one.
scm-config consistency - pipeline running on source code at commit C1 will always use configuration at commit C1 as long as they are in same repository. ~~Moreover rerunning a pipeline on older material revision C2 will use pipeline configuration from that old revision.~~
environments from many config repo sources get merged together with environments from main xml.
pipelines from many config repo sources get merged together with pipelines from main xml. Groups are still valid. Group name can be specified in configuration repo.
cruise-config.xml is always valid by its own when no parts are yet appended. Just like it was so far - meaning this feature is not breaking current xml-config stability.

Limitations

any authorization, security, agents, plugins config, etc. must be stored in main xml
pipelines, environments or any other elements in main xml cannot reference configuration elements from config repositories. E.g. user cannot add a pipeline defined in config repository to an environment defined via UI.
trying to edit in UI a part of configuration defined in repo will return error. However in most cases UI will not allow this edit at all.
config merging policy is hard-coded as specified below. It could be customized in xml.

All these limitations can be removed in future. I will make extra effort so that it can be done later. It just would be too much work now.

Performance

In the context of #838: it does not solve the xml performance issues. But there will be less IO on the configuration file because parts of it will be in repositories. Validation is still done at global scope. It must be, and it is good for a fail-fast approach. However, validation is only logical, done at object level, so no more revalidation of the xsd schema with 1000 pipelines.

Domain model and services

Domain

Configuration repositories are treated almost like scm materials. They are not listed in materials of PipelineConfig but they get scheduled and polled together with schedulable materials.

Remote config concept

There is a new concept in configuration domain - a remote configuration. It can refer to small configuration element or larger section. In general it is ConfigObject + 'config source identifier'. Where 'config source identifier' is its repository address, revision and how to parse it. For example:

a remote PipelineConfig is PipelineConfig + GitMaterialConfig + revision + name of configuration provider
a remote EnvironmentConfig is EnvironmentConfig + SvnMaterialConfig + revision + name of configuration provider
MergeEnvironmentConfig consists of many remote EnvironmentConfigs so its source identifier is a unique list of their source identifiers.

Introducing remote concept has these benefits:

allows to track where each configuration part comes from.
it can be used in debugging - to determine what pipeline configuration is actually being built.
in UI when pipeline, stage, environment etc. is defined from config repo then we can show "Pipeline defined by git url, branch, revision". In case of environment we can show "Environment defined from git1 at revision1, git2 at revision2"

Edit 1

There will be no Remote* classes. Instead

there is ConfigOrigin interface. Implemented by FileConfigOrigin and RepoConfigOrigin.
existing config classes have new member - ConfigOrigin so that we can track origin of each instance.
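A sketch of how that could look. The three type names come from the list above; the methods `canEdit()` and `displayName()`, and the `PipelineConfigSketch` stand-in, are assumptions for illustration and may differ from what ends up in the code.

```java
// ConfigOrigin, FileConfigOrigin and RepoConfigOrigin are named in the text above;
// the methods shown here are assumed for illustration.
interface ConfigOrigin {
    boolean canEdit();      // assumed: only the main XML is editable from the UI
    String displayName();   // assumed: text the UI could show next to a pipeline
}

final class FileConfigOrigin implements ConfigOrigin {
    public boolean canEdit() { return true; }
    public String displayName() { return "cruise-config.xml"; }
}

final class RepoConfigOrigin implements ConfigOrigin {
    private final String url;
    private final String revision;

    RepoConfigOrigin(String url, String revision) {
        this.url = url;
        this.revision = revision;
    }

    public boolean canEdit() { return false; }
    public String displayName() { return url + " at " + revision; }
}

// Hypothetical stand-in for an existing config class gaining an origin member.
final class PipelineConfigSketch {
    final String name;
    final ConfigOrigin origin;

    PipelineConfigSketch(String name, ConfigOrigin origin) {
        this.name = name;
        this.origin = origin;
    }
}
```

The UI could then refuse edits when `origin.canEdit()` is false and show `origin.displayName()` to explain where the definition lives.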

Merged configs

Configuration classes that merge parts into one, hiding the parts from the rest of the system. The best example is MergeEnvironmentConfig, which sums many EnvironmentConfigs and is itself an EnvironmentConfig. There will also be MergeCruiseConfig that hides CruiseConfig and remote configurations.
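A simplified sketch of the merging idea for environments. The class names carry a Sketch suffix because they are stand-ins; the fields and the "union of pipelines" policy for duplicates are assumptions, not the final merge rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for one EnvironmentConfig part; fields are assumptions.
final class EnvironmentPartSketch {
    final String name;
    final String source;            // e.g. "cruise-config.xml" or a repo url
    final List<String> pipelines;

    EnvironmentPartSketch(String name, String source, List<String> pipelines) {
        this.name = name;
        this.source = source;
        this.pipelines = pipelines;
    }
}

// Presents several same-named parts as a single environment.
final class MergeEnvironmentSketch {
    private final List<EnvironmentPartSketch> parts;

    MergeEnvironmentSketch(List<EnvironmentPartSketch> parts) {
        if (parts.isEmpty()) {
            throw new IllegalArgumentException("at least one part is required");
        }
        String name = parts.get(0).name;
        for (EnvironmentPartSketch p : parts) {
            if (!p.name.equals(name)) {
                throw new IllegalArgumentException("parts must share one name");
            }
        }
        this.parts = new ArrayList<>(parts);
    }

    String name() {
        return parts.get(0).name;
    }

    // Union of the pipelines of all parts, in first-seen order.
    Set<String> pipelines() {
        Set<String> all = new LinkedHashSet<>();
        for (EnvironmentPartSketch p : parts) {
            all.addAll(p.pipelines);
        }
        return all;
    }

    // Where each part came from, so the merged view can still be traced back.
    List<String> sources() {
        List<String> result = new ArrayList<>();
        for (EnvironmentPartSketch p : parts) {
            result.add(p.source);
        }
        return result;
    }
}
```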

Services

existing pipeline-related services will use these config materials and new services to get PipelineConfig at config material revision instead of asking GoConfigService.
pipeline and environment definitions
there is an extension point for user to define pipelines and environments in source in any format desirable. As long as he writes a server plug-in that will parse a source tree into configuration objects.
under GoConfigService a set of new services will be added that are together responsible for:
- fetching configuration repos
- parsing it
- merging parts together
- validation of configuration as a whole
- presenting configuration in old style as a 'latest valid CruiseConfig' but also in a new style where there is concept of configuration at specified repos revisions.

Implementation details

Xml schema

The <config-repo> will be ConfigRepoConfig class.

<cruise>
  <config-repos>
    <!-- plugin name is optional -->
    <config-repo>
      <git url="https://github.com/tomzo/gocd-indep-config-part.git" />
    </config-repo>
    <config-repo plugin="gocd-xml">
      <git url="https://github.com/tomzo/gocd-refmain-config-part.git" />
    </config-repo>
    <config-repo plugin="gocd-xml">
      <git url="https://github.com/tomzo/gocd-refpart-config-part.git" />
      <configuration>
        <property>
          <key>pattern</key>
          <value>*.gocd.xml</value>
        </property>
      </configuration>
    </config-repo>
  </config-repos>
  <!-- the rest of cruise config, nothing new -->
</cruise>

config-api

CruiseConfig will be larger by possible definition of many remote configuration repos - ConfigReposConfig.
added PartialConfig class, which is what we agree can be kept in a single config repository. Currently limited to a list of pipelines with an optional group name, plus environment definitions. I excluded security, so the PipelineConfigs class does not fit.
added MergeCruiseConfig class. Which inherits the old CruiseConfig but actually consists of parts. Some of its children classes will also be replaced by Merge... classes like:
MergeEnvironmentConfig is EnvironmentConfig . Consists of many RemoteEnvironmentConfig.
~~RemoteEnvironmentConfig is EnvironmentConfig. It has definition of source where it is defined. So that it can be tracked, displayed in UI, forbid edits of that pipeline.~~
~~RemotePipelineConfig is PipelineConfig that is defined outside of main configuration. It has definition of source where it is defined. So that it can be tracked, displayed in UI, forbid edits of that pipeline. ~~
maybe stages, jobs, tasks etc will need such Remote* version as well.

config-server

added PartialConfigProvider interface. This is extension point. Single implementation is responsible for parsing a checked-out directory of configuration repo into a configuration PartialConfig instance.
XmlPartialConfigProvider class. First implementation of above. Can load pipelines and/or environments from .xml files matching a pattern in directory structure.
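
A rough sketch of how this extension point could look. The names `PartialConfigSketch`, `PartialConfigProviderSketch`, `PatternFileProviderSketch` and the `load` method are hypothetical and only illustrate the idea of "one checkout directory in, one partial config out"; they are not the actual GoCD interfaces. File selection assumes a glob pattern such as the `*.gocd.xml` value from the `pattern` property in the schema above.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for PartialConfig; it only records which files were selected.
final class PartialConfigSketch {
    final List<Path> sourceFiles;

    PartialConfigSketch(List<Path> sourceFiles) {
        this.sourceFiles = sourceFiles;
    }
}

// Hypothetical shape of the extension point (assumed, not the real signature).
interface PartialConfigProviderSketch {
    PartialConfigSketch load(Path checkoutDir) throws IOException;
}

// Illustrative provider that picks files by a glob pattern on the file name.
final class PatternFileProviderSketch implements PartialConfigProviderSketch {
    private final PathMatcher matcher;

    PatternFileProviderSketch(String pattern) {
        this.matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
    }

    // True when the file name (not the whole path) matches the pattern.
    boolean matches(Path file) {
        return matcher.matches(file.getFileName());
    }

    @Override
    public PartialConfigSketch load(Path checkoutDir) throws IOException {
        try (Stream<Path> files = Files.walk(checkoutDir)) {
            List<Path> matched = files
                    .filter(Files::isRegularFile)
                    .filter(this::matches)
                    .sorted()
                    .collect(Collectors.toList());
            // A real provider would parse each matched file; this sketch only lists them.
            return new PartialConfigSketch(matched);
        }
    }
}
```

Matching on the file name alone keeps the pattern independent of where in the directory structure a file lives.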

server

GoConfigService is expected by many services to provide a global valid configuration. I will satisfy this expectation by returning a MergeCruiseConfig, which inherits CruiseConfig but actually consists of parts. In the long run, though, I think methods like getCurrentConfig() should not be used. Instead there should be methods that return smaller parts of the configuration, only as much as callers really need. There are many places where a service gets the entire CruiseConfig but then uses just one section of it, which is why I will deprecate the big ones. Some calls I will try to remove already.
GoConfigService will have a new member ConfigMergeService. It will provide the latest MergeCruiseConfig. It is like an adapter from the new configuration service model to the old one, so that GoConfigService can work like before. It will also handle configuration updates from the UI by routing them to the proper part of the config, main being the only one that works now. It will also trigger extra events for ConfigChangedListeners whenever a new merged config is created.
PartialConfigsService uses implementations of PartialConfigProvider to parse and load PartialConfig instances at specified revision of the config repo.
ConfigMaterialService can provide a checkout of any configuration material at any point in time.

Changes in server services

MaterialUpdateService talks to GoConfigService to get schedulable materials, which are those with auto-polling. I will append configuration materials to that list. Causing existing polling and material update system to produce MaterialUpdated messages with config sources as well.
in BuildCauseProducerService there is WaitForPipelineMaterialUpdate which uses PipelineConfig to get materials which it needs to wait for before pipeline can run. I will add waiting for all configuration materials first.
there are a few services like ScheduleService that ask for PipelineConfig, StageConfig etc. They will now have to become aware of which PipelineConfig to ask for. So that in re-runs they use a PipelineConfig from configuration that was used at that time.

Handling merges and conflicts

Current plan how to handle merging configuration parts.

Environments

Pipelines in environment

Most liberal approach possible:

if any new pipeline name appears then consider it member of environment.
If pipeline name repeats among many configuration parts then just ignore repetition.

Agents in environment

Most liberal approach possible:

if any new agent uuid appears then consider it member of environment.
If agent uuid repeats among many configuration parts then just ignore repetition.

Environment variables in environment

if any new 'variable1=value1' appears then consider it member of environment.
if 'variable1=value1' repeats then just ignore
if some part has 'variable1=othervalue' then it is a conflict

There could be optional overrides but we can consider it future work.
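
The environment merge rules above could be sketched like this. `EnvPartSketch` and `EnvMergeSketch` are hypothetical, simplified stand-ins (plain sets and maps instead of the real EnvironmentConfig classes), assuming all parts describe the same environment.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified view of one environment definition from one config source.
final class EnvPartSketch {
    final Set<String> pipelines = new LinkedHashSet<>();
    final Set<String> agentUuids = new LinkedHashSet<>();
    final Map<String, String> variables = new LinkedHashMap<>();
}

final class EnvMergeSketch {
    // Merges all parts of one environment; throws when a variable has two different values.
    static EnvPartSketch merge(List<EnvPartSketch> parts) {
        EnvPartSketch merged = new EnvPartSketch();
        for (EnvPartSketch part : parts) {
            // Sets silently ignore repeated pipeline names and agent uuids.
            merged.pipelines.addAll(part.pipelines);
            merged.agentUuids.addAll(part.agentUuids);
            for (Map.Entry<String, String> variable : part.variables.entrySet()) {
                String existing = merged.variables.putIfAbsent(variable.getKey(), variable.getValue());
                if (existing != null && !existing.equals(variable.getValue())) {
                    throw new IllegalStateException(
                            "Conflicting values for environment variable '" + variable.getKey() + "'");
                }
            }
        }
        return merged;
    }
}
```

An identical 'variable1=value1' in two parts passes through untouched; only a differing value for the same name is rejected.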

Pipelines

Final pipeline groups get created as a sum of pipelines in groups in partial configurations
if there are 2 pipelines with the same (case insensitive) name then it is a conflict, configuration is invalid.

Authorization can be only in main xml so it cannot conflict when merging.
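
A minimal sketch of the pipeline rule, assuming each configuration part is reduced to a map from group name to pipeline names. `PipelineGroupMergeSketch` is a hypothetical helper, not one of the planned Merge* classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class PipelineGroupMergeSketch {
    // Each part maps group name -> pipeline names. Groups are summed across parts;
    // a pipeline name that repeats (ignoring case) makes the whole configuration invalid.
    static Map<String, List<String>> merge(List<Map<String, List<String>>> parts) {
        Map<String, List<String>> merged = new LinkedHashMap<>();
        Set<String> seenNames = new HashSet<>();
        for (Map<String, List<String>> part : parts) {
            for (Map.Entry<String, List<String>> group : part.entrySet()) {
                for (String pipeline : group.getValue()) {
                    if (!seenNames.add(pipeline.toLowerCase(Locale.ROOT))) {
                        throw new IllegalStateException("Duplicate pipeline name: " + pipeline);
                    }
                    merged.computeIfAbsent(group.getKey(), k -> new ArrayList<>()).add(pipeline);
                }
            }
        }
        return merged;
    }
}
```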

Work

add new xml elements <config-repos> and <config-repo> - the list of config repositories to poll and load. Extend current cruise-config schema.
implement XmlPartialConfigProvider. Perhaps I could use magical loader, if not then parts of it.
implement Merge* and Remote* classes. A lot of small pieces, many tasks but easy. TDD with significant merge cases.
add ConfigMergeService and services below.
update pipeline-related services to use configs at particular repo revision.
add plugin infrastructure around PartialConfigProvider and integration test.

Many of these can be 100% unit tested without much mocking which is good.

Considerations

These are points which I consider still open. So please comment on these.

Why treat config repos as scm material?

@arvindsv You have suggested to leave old pollers doing what they did so far and have a new component to handle config polling.

I can see it as something like this: Existing pollers continue to do what they do. They don't poll config materials. We write different pollers for config materials (using the code for git, svn, etc. already present).

Here are reasons why I chose otherwise:

often the configuration repository material is the same as one of the pipeline's scm materials. If I had 2 polling systems then both might poll the same repository for the same changes. That would trigger 2 events - config update and material update. With the current approach I get just one event.
no need to build another system that does basically the same job.
the problem of waiting for polling of configuration material before the rest of materials still exists in both approaches.

Could you tell me why you think a separate subsystem should exist? I cannot see any benefit of it.

Missing valid config problem

It was mentioned by everybody already. All above seems great until some part of MergeCruiseConfig makes it invalid. There is a number of reasons that can cause it:

missing or broken PartialConfigProvider plugin
cannot clone config repository
last commit simply has invalid content
everything together makes no sense because of conflicts - e.g. the same pipeline name in 2 sources.

The question is What should be current global config when one of the parts or simply merged config is invalid?

I have some ideas but each has its drawbacks:

Always store new valid config - just save the new configuration as xml and commit it to a dedicated branch 'valid-merged-config' in the original Go's cruise-config git repo. This is how Go currently protects against bad configuration being introduced. But it requires extending the xml further so that I could save Remote* and Merge* classes as xml.
just drop partial configs and use main cruise-config.xml which always has some valid version. Most pipelines might be gone temporarily then. But configuration repos are still polled so when fixed config is pushed the 'merge config' will reload.
drop parts selectively, only parts that cause configuration to become invalid.
search history of all config sources to find a good one. But there might be no such combination so it is not a solution really.

Please vote on these or propose some other solutions

Questions

Design

can I re-use MaterialConfig as part of ConfigRepoConfig ? it seems sensible
any opposition towards CruiseConfig being an interface? Its previous implementation would be BasicCruiseConfig. I need room for MergeCruiseConfig to hide the fact that configuration consists of parts. Other option is that I override 40 methods in MergeCruiseConfig. But I prefer interface because when somebody changes the interface and (wrongly) updates only BasicCruiseConfig then he gets compilation error. Same question for EnvironmentConfig.
is the feature specification with these limitations acceptable?
should I talk to FeatureToggleService if user wants config repos? This feature would be released as beta at first right?
is there anything obvious that I missed above?

General on Go

how to instantiate proper plugin class based on name I provided in config? In the ApplicationInitializer I can see that plugin infrastructure seems ready before config starts loading. So how do I talk to it from a service in server?
who is interpreting all @Config like annotations? e.g. @ConfigTag("environment"), @ConfigAttribute(value = NAME_FIELD, optional = false), @ConfigSubtag are these used by magical xml loader and writer?
how do I run unit and integration tests for config-api, config-server and server on the command line?

mrmanc commented 9 years ago

What should be current global config when one of the parts or simply merged config is invalid?

I have some ideas but each has its drawbacks:

Always store new valid config - just save new configuration as xml and commit to dedicated branch 'valid-merged-config' in the original Go's cruise-config git repo. This is like Go currently protects against bad configuration being introduced. But it requires to extend xml further so that I could save Remote* and Merge* classes as xml.
just drop partial configs and use main cruise-config.xml which always has some valid version. Most pipelines might be gone temporarily then. But configuration repos are still polled so when fixed config is pushed the 'merge config' will reload.
drop parts selectively, only parts that cause configuration to become invalid.
search history of all config sources to find good one. But there might be no such combination so it is not a solution really.

Please vote on these or propose some other solutions

I feel that any attempt to introduce non-valid configuration should fail; the build importing this configuration from a remote repository should fail and no part of that config should be accepted into the single source of truth. That way there is no chance of any unintended consequences.

tomzo commented 9 years ago

I feel that any attempt to introduce non-valid configuration should fail; the build importing this configuration from a remote repository should fail and no part of that config should be accepted into the single source of truth. That way there is no chance of any unintended consequences.

I just want to make sure that I understand you correctly. So let's say there is a config that has 3 parts. It is a valid configuration. Now when one part causes the configuration to be broken, what do you expect from the server?

to display a big fat error and continue operations on valid configuration from previous 3 parts?
to display a big fat error and stop scheduling any pipelines until valid config is introduced?

mrmanc commented 9 years ago

So let's say there is a config that has 3 parts. It is a valid configuration. Now when one part causes the configuration to be broken, what do you expect from the server?

to display a big fat error and continue operations on valid configuration from previous 3 parts?

to display a big fat error and stop scheduling any pipelines until valid config is introduced?

Are the three parts three different pipelines? Unless I’ve misunderstood, config repos would be attached to the pipelines which they define. If that is the case then I think displaying a big fat error and stopping scheduling the pipeline that that config defines would be the best option.

tomzo commented 9 years ago

Are the three parts three different pipelines?

Yes and they may be inter-dependent in most unpleasant way. Single config repo may define many pipelines. But any single pipeline is of course defined in just one repository.

If that is the case then I think displaying a big fat error and stopping scheduling the pipeline that that config defines would be the best option.

This case implies a new concept of not-entirely-valid config.

Moreover, if there are pipelines like A -> B -> C, each defined by a different config repo (a, b, c respectively), then when config repo b introduces a broken config we have to stop scheduling C as well. Or if repo b removes pipeline B from the config, then repo c becomes invalid.

It is doable but I want you to notice these facts.
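
To make the A -> B -> C case concrete, here is a small sketch of the stricter reading, in which every pipeline downstream of a broken repo stops scheduling. `BrokenRepoImpactSketch` and its two input maps are hypothetical; nothing like this is part of the plan above, it only illustrates the propagation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class BrokenRepoImpactSketch {
    // origins: pipeline name -> config repo that defines it.
    // upstreams: pipeline name -> pipelines it depends on.
    // Returns the pipelines defined in the broken repo plus everything downstream of them.
    static Set<String> blockedPipelines(String brokenRepo,
                                        Map<String, String> origins,
                                        Map<String, List<String>> upstreams) {
        Set<String> blocked = new TreeSet<>();
        for (Map.Entry<String, String> origin : origins.entrySet()) {
            if (brokenRepo.equals(origin.getValue())) {
                blocked.add(origin.getKey());
            }
        }
        // Keep propagating until no new downstream pipeline gets blocked.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<String>> dependency : upstreams.entrySet()) {
                boolean dependsOnBlocked = !Collections.disjoint(dependency.getValue(), blocked);
                if (dependsOnBlocked && blocked.add(dependency.getKey())) {
                    changed = true;
                }
            }
        }
        return blocked;
    }
}
```

With A, B, C defined in repos a, b, c, a broken repo b blocks B and C but leaves A running.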

mrmanc commented 9 years ago

Yes and they may be inter-dependent in most unpleasant way. Single config repo may define many pipelines. But any single pipeline is of course defined in just one repository.

Are you sure that’s required? The only reason I can imagine that someone would want to do that is if they are trying to simply move a massive config somewhere else. It seems to me that it would be a much easier problem to solve if a config repo could only define the pipeline which it is attached to.

This case implies a new concept of not-entirely-valid config.

Moreover if there are pipelines like A -> B -> C, each defined by different config repo, a,b,c respectively. Then when config repo b introduces a broken config then we have to stop scheduling C as well. Or if repo b removes pipeline B from config then repo C becomes invalid.

Wouldn’t config repo b introducing a broken config simply mean that pipeline B would stop scheduling until config repo b contains valid configuration? Pipeline C could carry on scheduling any builds caused by other materials, because the non valid config would not have been imported.

It is doable but I want you to notice these facts.

Thanks, you’re putting a lot of thought into this :) I hope my comments aren’t coming across as critical!

tomzo commented 9 years ago

Are you sure that’s required? The only reason I can imagine that someone would want to do that is if they are trying to simply move a massive config somewhere else.

Well, as a user I am not going to create such dependencies. They are obviously more error prone. But I will consider just adding a limitation to not support such complexity. It might make some things easier.

It seems to me that it would be a much easier problem to solve if a config repo could only define the pipeline which it is attached to.

Yes it would be easier. Anyway my recommendation to using this feature is to keep all inter-dependent pipelines in single config repository so that they can be fully validated before pushing.

Pipeline C could carry on scheduling any builds caused by other materials, because the non valid config would not have been imported.

It seems it could. I'll have to think about it. But there is still a case of some dependency disappearing.

mrmanc commented 9 years ago

It seems to me that it would be a much easier problem to solve if a config repo could only define the pipeline which it is attached to.

Yes it would be easier. Anyway my recommendation to using this feature is to keep all inter-dependent pipelines in single config repository so that they can be fully validated before pushing.

I suppose that would help with occasions you want to rename a pipeline, so that you can rename it wherever it is referred to. I think if we kept all our inter dependent pipelines in one config repository then we would have around 600-700 all in one place, which is no more manageable than what we have now.

If you kept the definition of named dependencies separately from the stages, jobs and tasks that make up a pipeline, then you could have the dependencies stored and edited via Go (i.e. not possible to store remotely) and the contents stored remotely alongside the project they are building… just a thought.

It’s not an easy problem to solve! I’m sure you’ll find a decent solution. There must be a way of failing fast and maintaining integrity.

zabil commented 9 years ago

Great work @tomzo @arvindsv

A question.

How are we planning to handle config migrations?

Go currently uses schema versions to migrate the config allowing users to start using new configuration/features etc. With the pipeline config under version control how is that going to be done? I suppose we can have logic in pipeline config plugins to understand versions, but that means to use new features the user will have to manually update the version as we don't want (or want?) Go making commits, to the repos mentioned here.

<cruise>
   <config-repo plugin="my-config-repo-parser">
     <!-- source is defined using the same format that materials use -->
     <git url="" branch="" ></git>
   </config-repo>
   <config-repo plugin="my-config-repo-parser">
     <hg url="" ></hg>
   </config-repo>
  <!-- the rest of cruise config, nothing new -->
</cruise>

tomzo commented 9 years ago

How are we handling config migrations?

I do not see any other way than for the developer of the parsing plugin to be responsible for maintaining the schema and upgrades in the configuration repos. I think it is up to that plugin developer whether the plugin will be capable of returning a sensible configuration part to Go, or whether it will throw and Go must treat the current config part as invalid until a commit with a fix/upgrade is introduced. Which leads to the problem of handling an invalid config part properly.

...allowing users to start using new configuration/features etc.

Can you provide an example of new feature that requires updating pipeline or environment config?

tomzo commented 9 years ago

we don't want (or want?) Go making commits, to the repos mentioned here.

I wouldn't exclude that option. Editing configuration in a repo via the UI might be possible later. Go would commit on the user's request. With what I had planned so far it will be possible to track exactly where each part of the config comes from, so it is possible to tell where modification(s) should be committed.

zabil commented 9 years ago

I think it is up to that plugin developer if it will be capable of returning a sensible configuration part to the Go or if it will throw and Go must treat current config part as invalid until a commit with fix/upgrade is introduced. Which leads to the problem of handling invalid config part properly.

This is handled automatically now. If it is to be manual I guess we need to provide specs to allow users to migrate. Potentially frustrating? The expectation is to adopt the current schema.

Can you provide an example of new feature that requires updating pipeline or environment config?

Meant adding support for something new in the configuration e.g. authorization for templates. Although it does not involve changing the pipeline the plugins need to be aware about new configuration tags (if it's converting from a specific format).

arvindsv commented 9 years ago

This is handled automatically now. If it is to be manual I guess we need to provide specs to allow users to migrate. Potentially frustrating?

It can be manual (user), or it can be automatic (plugin author). Either way, Go cannot do this any more, because the format of the definition of the pipeline is no longer in Go's control. I mean, it need not be XML any more. So, it'll be an invalid config, unless it is changed.

What we can do, from a Go config migration side is to try and provide good defaults, so that a config does not necessarily become invalid automatically. That's the best I can see us being able to do. What do you all think?

[I need to respond to the rest. I will soon]

arvindsv commented 9 years ago

First part of answer:

Moreover rerunning a pipeline on older material revision C2 will use pipeline configuration from that old revision.

This causes a behavioral inconsistency. If you rerun a pipeline which is defined in the main config, then it will rerun with latest config. However, if you rerun a pipeline defined in a repository, it will rerun with old config. My worry is that this means that there are now two different kinds of pipelines. During build creation, Go will now need to know that this is a pipeline that is defined in a repo (I see you've taken care of it in the RemoteConfig concept) and will have to go to the plugin and ask it to give a version of the config that is potentially old.

The rationale for the decision to use the latest version of the config in Go, as against an old version, is migration. Taking a config from an older version of Go, and trying to use it with a newer version of Go, without migration will cause problems. This might need more thought. The core of Go will not be able to work with multiple versions of an object. I hope that makes sense. If version 75 introduces a new field in the config called "blah", then the API between the config repo plugin and the Go server will include "blah" as a field. Now, if the user reruns a version 70 build, then the plugin has to somehow migrate the config from version 70 to 75, and put in a "blah" field in the response.

Thoughts?

... so no more revalidation of xsd schema with 1000 pipelines.

Unless the user happens to have a 1000 pipelines defined in the old XML format (which remains valid).

Remote config concept

Does it make sense to change all the current configs to have the extra information? In a sense, the existing config classes have the same information, but it is implicit (provider = Go, GitMaterialConfig = Go's XML file repo, etc). I'm thinking of consistency of the objects rather than extending through inheritance. These seem to be extensions to add behavior (if I want to be pedantic, I could argue that it violates LSP).

Merged configs

As far as I understand, it's a composite. I'll probably have to see code to see how this would work in this case. To me, there is a lot of code which expects an EnvironmentConfig to represent one environment. To make it represent many, so that the existing code will continue to work seamlessly, will be tough. I'd rather let the core know that there are many environments (in memory, objects) rather than trying to hide that. It might just be too tough.

It should be enough to be able to track that the environment was defined in repoX, which is handled by pluginY and that anything related to that should be asked to that plugin.

However, I'm willing to wait and see how this works. My gut says it'll be more complicated than what I say above. But, I could be wrong.

I think this comment of mine continues, when I see the proposed config-api changes. My intuition is that keeping the changes here as small as possible will reduce work. MergeEnvironmentConfig is a new class, and along with it, it will bring new behavior all over (I know it will pretend to be an EnvironmentConfig). However, if we keep everything as EnvironmentConfig at this level (with possible extra information), it might be easier.

config-server

Looks ok to me.

server

Instead there should be methods that return smaller parts of configuration. Only as much as they really need.

I agree.

The rest of the server section looks good. My earlier question about version + migrations comes back here. How is an old version of the config on the filesystem handled by the plugin, when asking for a specified version?

Handling merges - Pipelines

I assume that the ones in the main config are the initial list of pipelines, groups and environments into which the ones from the repos are added (if they're valid).

How are we handling atomicity? You mention that a wrong pipeline config from the repo can cause the config to be invalid. Will environments be added already? I'd assume a whole new version of the repo will either be valid and added, or invalid and not merged into the main config?

arvindsv commented 9 years ago

Work

Don't forget that there is Rails code which needs to list all the pipelines, so that they can be edited, etc. So, some of the services are used there as well.

Considerations: Why treat config repos as scm material?

We could use the pollers. My only issue with that was that the commit seen by the pollers would get into the DB. But, I suppose that's ok, since the same commit might be needed by a code commit (as you mention in the "reasons" part).

So, I agree with you. It looks like there is no benefit.

Considerations: Missing valid config problem

My vote is for dropping a whole partial config (the whole <config-repo> section) when it is invalid. The others are too much work, for too little benefit. As long as we make it visible somehow that the latest commit is invalid, then it's fine.

However, there is a bit of a corner case:

Everything is fine.
Invalid commit into config-repo.
Server is restarted.

As soon as it comes up again, what does it do? If it tries to use the latest config from the <config-repo>, then it is invalid. Does it not show those pipelines at all? What does it do when it is not restarted? It continues to show old pipelines from the last valid commit? If so, then the concept of a good, valid config, should exist.

If a new valid config is always stored (with the merged and remote parts in the main config), that would solve the restart question. However, I feel that it is a lot of work, again for too little benefit.

I see it not showing the pipelines, if the config is invalid when it (re)starts. That's ok, IMHO.
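
The "drop the whole partial config when it is invalid" option could look roughly like this. `DropInvalidPartsSketch` is a hypothetical helper; a part is modelled as a list of pipeline names, with null standing for "could not be cloned or parsed", and validity is reduced to the case-insensitive name conflict rule.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class DropInvalidPartsSketch {
    static final class Result {
        final List<String> pipelines = new ArrayList<>();
        final List<String> droppedRepos = new ArrayList<>();
    }

    // mainPipelines: pipelines from the main cruise-config.xml, assumed valid.
    // parts: config repo id -> pipelines at its latest commit, or null when loading failed.
    static Result merge(List<String> mainPipelines, Map<String, List<String>> parts) {
        Result result = new Result();
        Set<String> names = new HashSet<>();
        for (String pipeline : mainPipelines) {
            names.add(pipeline.toLowerCase(Locale.ROOT));
            result.pipelines.add(pipeline);
        }
        for (Map.Entry<String, List<String>> part : parts.entrySet()) {
            List<String> candidate = part.getValue();
            // Validate against a copy so a rejected part leaves no trace behind.
            Set<String> trial = new HashSet<>(names);
            boolean valid = candidate != null;
            if (valid) {
                for (String pipeline : candidate) {
                    valid &= trial.add(pipeline.toLowerCase(Locale.ROOT));
                }
            }
            if (!valid) {
                // The whole config repo is dropped and reported, never half of it.
                result.droppedRepos.add(part.getKey());
                continue;
            }
            names = trial;
            result.pipelines.addAll(candidate);
        }
        return result;
    }
}
```

Because nothing is persisted in this sketch, a restarted server would simply not show pipelines from a repo whose latest commit is invalid, which matches the behaviour described above.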

arvindsv commented 9 years ago

Actually, adding to what I said just now, I see @mrmanc had asked whether allowing definition of pipelines and dependencies across config repos is necessary. It's a good question. I wouldn't mind not allowing it. It would reduce some complexity. What do you think?

Even you, @tomzo, said:

Well, as a user I am not going to create such dependencies. They are obviously more error prone.

arvindsv commented 9 years ago

Questions: Design: Reuse of MaterialConfig, CruiseConfig, EnvironmentConfig

I think you can re-use them if you want. You can even convert it to an interface, if you want. As long as everything else works, I don't see any reason to say no.

However, as I mentioned earlier, I would consider extending PipelineConfig and EnvironmentConfig, to introduce information about where they came from, instead of creating sub-types (or siblings in an interface), since they don't seem to have such different behavior. To me, it seems like they would be in an inheritance relationship, just so that they can share the interface, with one of them having extra information, and potentially extra behavior.

Questions: General on Go: how to instantiate proper plugin class based on name I provided in config

You don't need to instantiate a class. All of that is done by the framework when the plugin is loaded. All you need to do is to send it a message as I mention below. It will be sent to the plugin, and you'll get a response. The plugin will have a class with the @Extension annotation and that's where this message will be sent.

You can use the AbstractExtension. Here is an example using the NotificationExtension. The getNotificationsOfInterestFor() method is used to send a message (JSON string) to the plugin represented by "pluginId", to ask it which notifications it would be interested in receiving. The request body and params sent to it are empty. There is an expected response from the plugin (a JSON string), which is parsed and converted into a list of strings.

So how do I talk to it from a service in server?

Take a look at PluginNotificationService and how it uses a registry, which finally uses the NotificationExtension class mentioned above.

who is interpreting all @Config like annotations? e.g. @ConfigTag("environment"), @ConfigAttribute(value = NAME_FIELD, optional = false), @ConfigSubtag are these used by magical xml loader and writer?

Yes, they are. It makes me sad and angry how it does that. I don't like to swear, but looking at it, I sometimes do. Too much reflection, in my opinion.

how do I run unit and integration tests for config-api, config-server and server on the command line?

Do this at the top level:

./bn clean cruise:prepare

Once that is done, you should be able to use mvn test in the corresponding directories. Let me know if you have trouble with this.

[I think I've answered all the questions. I'll edit this final comment with some small changes for the last question, if I need to]
