gocd / gocd

GoCD - Continuous Delivery server main repository
https://www.gocd.org
Apache License 2.0
7.05k stars 973 forks source link

support whitelist for SCM changes #436

Closed srinivasupadhya closed 7 years ago

srinivasupadhya commented 9 years ago

on similar lines of blacklist feature but with opposite functionality.

baskaran-md commented 9 years ago

+1 for the feature request.

pospispa commented 8 years ago

+1 for the feature request

scottmuc commented 8 years ago

+1 !!!

martinpapez commented 8 years ago

+1 for the feature request

mainar commented 8 years ago

+1

zabil commented 8 years ago

Can someone make a case for scenarios that'll require this? Will help with context and design when someone picks this up.

scottmuc commented 8 years ago

@zabil I'm looking at doing this. Just downloaded the Vagrant box on the weekend! This is something our project needs big time. We'll see how I do considering I'm pushing past post-technical ;-)

Here's my use case:

We have a repository called runtime-stacks. Inside of it we have a directory that creates debian packages and docker images. Here's an example of what we have:

base
jdk (contains jdk 6,7,8 and jruby)
ruby (contains 1.9, 2.0, and 2.1)
aws-cli (contains all the stuff necessary to perform AWS commands)
fpm (a light container that just has fpm)
go-1.4 (contains cross compiler)
...
and many more

We do not want all of these rebuilt because a single change happens. If I make a change in base, I want the base image to change and our pipeline will kick off the pipelines for the other runtimes. If I make a change to jdk, I only want the jdks to rebuild.

We have another repository that contains the automation to create debian packages and docker images for our core infrastructure apps. Inside it contains:

ldap-server
apt-repo
dns-server
npm-registry
go-cd
jenkins
base-ami
...
and others

Whenever a change happens in one of those directories, we want to rebuild the image and deploy it in a pipeline.

Once the apt-repo poller works for custom repositories we'll have a pretty slick workflow for infrastructure changes (eg: a new base debian package should trigger nearly everything to rebuild). For that we'll be looking at elastic agent automation.

We have all of this working in Jenkins, but the pipelines are terrible. We're hoping our elastic slave code can be ported to work with go.

zabil commented 8 years ago

@scottmuc :metal:

awarddev commented 8 years ago

+1

alexouzounis commented 8 years ago

hey guys, Any progress on this ? Would love to help if someone has already started work.

lenucksi commented 8 years ago

:+1: That would really be useful.

@zabil Could you help with pointing out what one would possibly need to look at to add this? A initial pointer into the source would be appreciated.

chrislovsund commented 8 years ago

+1

sjeri commented 8 years ago

+1

arvindsv commented 8 years ago

If you're looking to work on this, you probably want to start with this and its tests.

Of course, there will have to be a config migration, when you decide how the whitelist should be, in the config. You'd need to decide whether you want to allow both blacklist and whitelist, and if so, what the behavior should be. Finally, a UI change somewhere around here.

lcs777 commented 8 years ago

+1

agsmorodin commented 8 years ago

Isn't it possible somehow to use blacklist for this purpose? Like blacklist everything except whitelisted directory? Or maybe someone found a solution for this?

scottmuc commented 8 years ago

@agsmorodin yes, but it's painful. In my scenario, every time I added a directory to the repository, every other pipeline needed to update their blacklists. I even wrote a shell script to do this

#!/usr/bin/env bash

set -e

included_dir=$1

if [ -z ${included_dir} ]; then
  echo please specify the path you want to include
  echo
  exit 0
fi

pathes=( $(ls . | grep -v ${included_dir}) )

for i in "${pathes[@]}"; do
  if [ -d ${i} ]; then
    echo "${i}/**/*"
  else
    echo $i
  fi
done | xargs | sed 's/ /,/g'

You would then copy+paste the result into the materials blacklist field.

chrislovsund commented 8 years ago

We have also resorted to using the blacklist, but as @scottmuc is saying, its a lot of maintenance. :( We have a script that generates the xml, then copy paste into xml config.

argha-c commented 8 years ago

This is a much-needed feature for our organisation at this point. There are hundreds of builds and we prefer mono-repos for keeping related concerns in one codebase. However, there are multiple independent components/services in a single repo with their own pipeline. Not having this feature means we are forced to blacklist (n-1) directories for each of the 'n' components. This is cumbersome and needs change whenever a new component is added.

dajulia3 commented 8 years ago

+1 on this feature request!

lobsterdore commented 8 years ago

I am in need of this feature as well, I've got a repo that builds a large amount of apps, at the moment commits to this repo kick builds pipelines for each app. It would be good to have a whitelist to limit which builds get kicked off, as mentioned above a blacklist is simply too much maintenance.

ravenscar commented 8 years ago

I'm actually at a loss to imagine how people use the blacklist with any non-trivial repository. Does go expect us to have a different git repo for every single pipeline which uses git?

It seems that if the blacklist was left as a regex instead of the half-baked globbing code that it is now it would be a lot more useable, e.g. we could use a negative lookaround in the regex to essentially form a whitelist.

It seems that either allowing an actual regex for the blacklist or allowing a checkbox to negate the results of the blacklist matching would solve the problem most people here are having.

arvindsv commented 8 years ago

It seems that either allowing an actual regex for the blacklist or allowing a checkbox to negate the results of the blacklist matching would solve the problem most people here are having.

I don't mean to be very rude, but as mentioned in other tickets, this is still open-source, and still accepting PRs. :)

Here's what needs to be done, for anyone interested in picking this up.

I agree it is important and would be valuable, but no one, as far as I know, is working on it right now.

ravenscar commented 8 years ago

Sure, also not trying to be rude although re-reading my reply I can see how it may seem that way. I've spent half a day looking at the code in this area, searching for a workaround, and now that I can't see one I'm looking at doing a fix next week.

However it makes me think that maybe go might not be suitable for complex projects within the same repo. I'm wondering how others cope with having 50 pipelines with blacklist rules, then somebody adds a new folder in the project root. Do they really update all 50 sets of rules?

I assumed I was missing some obvious feature on the material level for SCM such as a base directory to watch for changes within. It seems that feature doesn't exist.

lenucksi commented 8 years ago

@ravenscar sounds good, I am eager to see your solution. I have a whitelist variant lying around however proper UI and test integration of the aforementioned checkbox is not done yet.

donaldguy commented 8 years ago

Our monorepo is a little less strictly subdired than @scottmuc's ; Here is my incredibly naive and likely to be badly performant for large repos whitelist to blacklist convertor

require 'set'
paths = Set.new(Dir.glob('*'))

ARGV.each do |whitelisted_path|
  wl_path_components = whitelisted_path.split('/')
  0.upto(wl_path_components.length-2) do |i|
    paths.delete(wl_path_components[0..i].join('/'))
    paths += Dir.glob("#{wl_path_components[0..i].join('/')}/*")
  end
  paths.delete(whitelisted_path)
end

puts (paths.map do |f|
  if File.directory? f
    "#{f}/**/*"
  else
    f
  end
end.join(','))
lenucksi commented 8 years ago

It turns out that allowing the server to be entirely whitelist or entirely blacklist by VM parameter is sufficiently easy and lies around here in a working condition. It needs a few tests and clean-up to the changesets plus some non-code related stuff. Maybe even passing through entire regexes might be possible, however, I did not look at that closely.

As for the GUI parts: It turns out the integration of a checkbox within the Ruby/Rails parts is basically doable, however passing through the value to the relevant object might be a bit more fuddling around the code.

ravenscar commented 8 years ago

@lenucksi

I also tried changing a server from blacklist to whitelist by flipping the return value of Modifications.shouldBeIgnoredByFilterIn() but unfortunately this is a server wide change (e.g. all scm materials become a whitelist, no blacklist). It's simple enough to do but requires us to build a custom version every time we wish to upgrade, which is a pain.

Regexes aren't easily possible as the code that does the globbing escapes all non-alpha characters except for * and /. So it's likely allowing regexes will break existing builds.

I decided to add an additional boolean to ScmMaterialConfig allowing the filter to be inverted on a per-material basis, essentially turning the blacklist into a whitelist for a specific material, defaulting to a blacklist and therefore not breaking compatibility with existing installs. What I've done is add a checkbox under the blacklist filter in the material's config: image

I wanted to get this done sooner but I was hospitalised needing a knee-reconstruction four weeks ago, so I had difficulty getting around to it.

lenucksi commented 8 years ago

👍 Very eager to try this out!

FelipeZini commented 7 years ago

+1