Closed madhurranjan closed 4 years ago
:+1: It would be great if the schedule api returned something to reference the pipeline run.
At the moment, we are passing in a guid as a param to the scheduled build, and then polling until we find a build with that guid in it. Very hacky, but it actually works very well.
I remember something similar was present earlier
I don't think it was ever there. I agree it would be nice. The reason I think it was never there is that the schedule pipeline API calls the material update subsystem, which goes away and tries to poll materials. Once that finishes, the pipeline gets scheduled.
The API call does not wait for all this, and so, it cannot know the scheduled pipeline details. At this point, I'd like to drag in @sriki77 (kicking and screaming) into this conversation and ask him to implement this, since he knows how. :) It's way past the time he should contribute.
@matt-richardson
we are passing in a guid as a param to the scheduled build, and then polling until we find a build with that guid in it.
Hello Matt,
I have the same problem as everyone with #990, and I am trying to work around it by doing the same thing you are via the API. I would love to hear how you are accomplishing that!
Here is what I have found so far:
If I set an environment variable:
If I set a parameter:
So if you can elaborate on:
That’d be extremely helpful! thanks ;)
Hi @drmuey
We trigger the schedule API, passing the variable as you have suggested above.
Then we poll the pipeline stages API; for each stage, we get the job detail, and then fetch the console.log artifact for that job. Then we check that log for the guid that we passed in as the variable.
Long, painful, but it works.
It would be super nice if the schedule api gave back an id so that we could poll on that, but I suspect that would involve some architectural changes..
Hope that helps!
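A rough Python sketch of that workaround (the server URL, pipeline name, and variable name TRIGGER_TAG here are placeholders, not the exact ones anyone in this thread uses):

```python
import uuid
import requests

BASE = "https://go-server"  # placeholder GoCD server URL


def trigger_with_tag(pipeline, session=requests):
    """Schedule a pipeline, passing a unique tag in as a trigger variable."""
    tag = str(uuid.uuid4())
    session.post(f"{BASE}/go/api/pipelines/{pipeline}/schedule",
                 data={"variables[TRIGGER_TAG]": tag})
    return tag


def log_contains_tag(console_log_text, tag):
    """Check a downloaded console.log artifact for the tag we passed in."""
    return tag in console_log_text
```

After triggering, you poll the stages/jobs, download each job's console.log artifact, and call `log_contains_tag` on it until you find the instance you started.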
@matt-richardson thanks, it does. I think the polling I'll end up with will be looking through pipeline history for materials with the SHA we know about; once we have the counter for that, we have an exact build to act on from that point on.
This needs someone to work on it. Given the way it is implemented now, this can't be added easily, since it is asynchronous. I think the way to do this is to return a guid with a 201/202 response, and then provide an API to check the status of that guid. When the pipeline is triggered (or fails to trigger), the status will give information about the build number, etc. and can be used to do further operations. As I said, needs someone to try and contribute this change. It's not planned at this point.
@arvindsv What would it take to include the environment variables that were sent to the scheduler in the history data? If you can point me in the right direction, per your offer in #1417, I'll see about a pull request to do it. Should this be a different issue?
That would be very simple and very robust because it allows:
e.g.
if ( build.scheduled_with{'variables[MyUniqTag]'} == tag_i_am_looking_for ) { # tag_i_am_looking_for is 'Derp101' in our example
    return build.counter; # this stops the looping
}
@drmuey: Sorry for the delay. I'll respond tomorrow.
@drmuey: I took a look at making that change. I don't think that is the "right way"™ :) To me, that seems like changing the history API to provide environment variables, as a bit of a hack. If we decide to put environment variables there, there are many other pieces of information which might be relevant too.
Having said that, if we decide to go ahead with something like that, ideally it should be as simple as adding a line like this here:
@variables = Hash[buildCause.getVariables().map {|env_var| [env_var.getName(), env_var.getDisplayValue()]}]
However, since the history call can be potentially non-performant, it has been carefully created to load only what is required to create that JSON. In this case, environment variables didn't make the cut. So, they're not loaded from DB at all.
If you trace calls starting from here, through here, through here, through here and finally here, you'll see that, though material revisions (commits/pipelines which led to this build) are loaded onto the build cause, the environment variables are not. An extra call will need to be made to load environment variables, probably via this method. Once that is done, the earlier change I mentioned (the one in build_cause_api_model.rb) should work.
I think a better way (since we're talking about fixing it) is how asynchronous APIs are usually handled:
This is not extremely hard, by the way.
@arvindsv that would be awesome! (and glad to hear its not difficult)
I'd be happy to help though my lack of familiarity w/ the architecture would probably hinder you, perhaps I can donate for the effort in some way though? Let me know the best way I can help you, perhaps some free-as-in-:beer: donations ;)
… as a bit of a hack. If we decide to put environment variables there, there are many other pieces of information which might be relevant too. … However, since the history call can be potentially non-performant, it has been carefully created to load only what is required to create that JSON. In this case, environment variables didn't make the cut. So, they're not loaded from DB at all.
If the ideal version is too much at this time what would you think of this:
For performance, non-hackiness, and simplicity of change: what if we just included one variable, say MY_EXTERNAL_ID, that could be passed in at schedule time (variables[MY_EXTERNAL_ID]=whatever) for anyone wanting to be able to find the build they started in history?
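For illustration, the proposed trigger call might look like this in Python (the server URL and pipeline name are placeholders; the variables[NAME] form-field style is the one already used elsewhere in this thread):

```python
import requests


def build_schedule_payload(external_id):
    """Form payload for the schedule call, tagging the build so it can
    be found in history later (MY_EXTERNAL_ID is the proposed variable)."""
    return {"variables[MY_EXTERNAL_ID]": external_id}


def schedule(base_url, pipeline, external_id, auth=None):
    """Trigger a pipeline, passing the external id as a trigger variable."""
    return requests.post(f"{base_url}/go/api/pipelines/{pipeline}/schedule",
                         data=build_schedule_payload(external_id), auth=auth)
```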
My $0.01 is that finding a build by a variable being passed in is pretty roundabout, and a more correct version would be to update Go to return a URL based on a UUID or some other identifier. As a user, this is what I would want and expect from the API, anyway.
Graham
@grahamc agreed, but keep in mind that that is what we have to do now anyway, except it requires a custom stage/job and downloading and parsing of the log file (which can take days to start, even if it's a simple echo job) to find our variable “tag”.
I suggested the single-variable compromise because I suspect a simple, targeted one-line change is more likely to get done than an overhaul of the scheduling API.
Either way is way better than the current state, which is über “round about” ;). If you have any ideas we've overlooked, feel free to chime in!
Another tack would be to add the thing we want to look for to the label but ATM pipeline labels:
Perhaps a feature allowing use of environment vars in pipeline labels would be a simpler way to solve this problem?
I'd be happy to help though my lack of familiarity w/ the architecture would probably hinder you, perhaps I can donate for the effort in some way though? Let me know the best way I can help you, perhaps some free-as-in-:beer: donations ;)
@drmuey: :beer: is easy to come by. :watch: (umm, time. Not a watch) is not, unfortunately. :) If you're willing to learn the architecture and take a stab at developing this, I'll try and find someone from the team to help you initially and when you get stuck. If no one is around, I can usually help. But, me trying to do this myself (especially given lots of :beers:) is not going to work. I am already caught up in too many different things.
@arvindsv lol, I totally get that ;) I'm in the same boat but let me see what I can sort out
Another approach would be to support tags in the schedule API: POST …/schedule
w/ materials[my_git_material_nam]=v1.2.3.4
.
That should mean that we can find the build number by revision (which is reliable and comparitvely simple). It would also have the boon that we wouldn't have to have a job that set up the repo before doing anything else.
Would that be a thing go.cd would accept via pull request?
I saw some discussion on it but no issue, should I create one or use this one?
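As a sketch, the schedule call with a material pinned to a known revision might look like this (the material and pipeline names are placeholders, and the materials[NAME] form-field style is the one suggested above):

```python
import requests


def build_material_payload(material_name, revision):
    """Form payload pinning a material to a specific revision/tag, so the
    resulting build can later be found in history by that revision."""
    return {f"materials[{material_name}]": revision}


def schedule_at_revision(base_url, pipeline, material_name, revision, auth=None):
    """Trigger a pipeline with one of its materials pinned to a revision."""
    return requests.post(f"{base_url}/go/api/pipelines/{pipeline}/schedule",
                         data=build_material_payload(material_name, revision),
                         auth=auth)
```

The history lookup then becomes a search for an instance whose material revision matches the tag you passed in.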
Update to “Another tack would be to add the thing we want to look for to the label but ATM pipeline labels”:
Decide on a special environment var like EXTERNAL_ID, or perhaps GO_EXTERNAL_ID to ensure it does not clobber anyone's current use of EXTERNAL_ID (YAGNI?).
Add names.add("env.EXTERNAL_ID"); to gocd/config/config-api/src/com/thoughtworks/go/config/PipelineConfig.java, before or after line 477's names.add("COUNT");
Then a pipeline would:
- add EXTERNAL_ID to its Env Var list (not needed if it's a GO_ one, but a GO_ one can't be passed in to the schedule API, right? nice to avoid complication)
- use a label template like 1.2.${COUNT} (${env.EXTERNAL_ID})
Which would mean:
Would this simpler, less invasive, lightweight change be an acceptable alternative approach?
For us it'd still require the use of jobs to check out the right tag first, but if “Another approach would be to support tags in the schedule API” is too complicated to do very soon, this would make a nice, simple alternative.
I think doing something special just for this (including using tags and external_id) feels hacky. Thoughts about the two approaches:
I'm very happy you're looking at code. :) @juhijariwala has offered to help you, give you some context and show you around the code. I wouldn't mind if you picked a more general version of your approach 2 (handling all environment variables) or the approach I mentioned earlier (return a token-URL to check). I prefer the token-URL approach, since it is direct and solves the problem, whereas the environment variables in a pipeline approach is indirect, and allows you to handle your problem through a different channel (parsing an unrelated API response). But, as I said, either one works for me.
A few more thoughts on this:
The original request is for a mechanism that returns the label of the pipeline, so the user can query based on that label to get the newly created pipeline. Unless I'm interpreting the thread incorrectly, what is actually useful in the end is to know the URL of the pipeline.
I think @arvindsv's original suggestion makes the most sense:
I think the way to do this is to return a guid with a 201/202 response,
and then provide an API to check the status of that guid. When the
pipeline is triggered (or fails to trigger), the status will give
information about the build number, etc. and can be used to do further
operations.
The rest of the techniques here are just patches over the inability to tie a triggering to a pipeline. Rather than special-casing a single variable (EXTERNAL_ID), I would rather see full support for environment variables in labels. That said, I'm still insisting that the desired feature here is actually to return the URL of the build.
I think a lot of great feature suggestions have come out of this thread, but I worry about implementing many of these just as a means to an end. I would rather see each of these features being wanted outright before implementing them.
Here is some code I would like to use:
import requests
import time

scheduling = requests.post('server/go/api/my_pipeline/schedule')
check_url = scheduling.headers['Location']

pipeline_url = None
while pipeline_url is None:
    status = requests.get(check_url)
    if status.status_code == 200:
        pipeline_url = status.headers['Location']
    else:
        time.sleep(int(status.headers['Retry-After']))

my_pipeline_data = requests.get(pipeline_url)
@grahamc you're right, it's not necessarily the label we want per se, but rather the id (AKA the COUNT). With the id we can determine any number of URLs and whatnot. Of course, if we had a Location header (or the label, etc.) to parse, then we could determine the counter for that build. I don't really care about the mechanism to find the id (unless it's incredibly delayed or fragile, like the console.log parsing approach I tried); I just need the end result for the go.cd API to be of any use.
The other solutions I proffered were just ideas that were smaller in scope, in the hopes that they'd be easier to see through to fruition than The Right Way, which will likely take much longer.
Right. When the URL to the pipeline is ultimately returned, other URLs can be easily returned, if they aren't already in the newer API definitions.
Running into this need now too. I really just need something back from the schedule POST that gives me something to poll. Messing with VARs, or scanning logs isn't really a solution.
@grahamc is spot on. Is there any update on this front?
Pleading ignorance... but I'm wondering why the schedule POST can't return the next-in-line build number (pipelines/:pipeline/instance/#) with the 202. If the schedule was accepted, then it should be available, right? Why not wait for the 'materials loop' to finish before returning the 202?
@grahamc has the right idea. I'm very interested in seeing this feature implemented. Any updates on this? It has been almost a year.
This is a complete horror. The schedule API must return something useful to access the pipeline result.
This issue has been automatically marked as stale because it has not had activity in the last 90 days.
If you can still reproduce this error on the master branch using a local development environment or on the latest GoCD release, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.
Hi,
Let's say I've instantiated a pipeline via the POST call. The response should tell me the pipeline label that has been kickstarted, and I should be able to query on that pipeline label. I remember something similar was present earlier. Can you tell me if this requirement can be met with the current list of APIs?
Thanks