KSP-CKAN / CKAN

The Comprehensive Kerbal Archive Network
https://forum.kerbalspaceprogram.com/index.php?/topic/197082-*

CKAN CI - Jenkins #1247

Closed techman83 closed 4 years ago

techman83 commented 9 years ago

The host that has the original Jenkins infrastructure on it has gone AWOL. No idea what happened to it, and I have no way to really tell right at the moment. ( @hakan42 @AlexanderDzhoganov if you're around.. halp!)

We have infrastructure on AWS that just needs a little more configuration and it should be good to go, which I'll get onto today. (I was at a big open data competition over the weekend, so didn't get a chance to look into it).

Many thanks to @dbent for stepping in as a pseudo Jenkins - your amazing work has kept the updates flowing.

Things to be completed before this is considered closed:

techman83 commented 9 years ago

Well, there is a new Jenkins... which, although it isn't configured to, is commenting on ALL THE THINGS. I've no idea why.

But it appears to be building things, so in theory it will start passing new builds. It doesn't appear to be updating statuses yet.

hakan42 commented 9 years ago

Out of curiosity, what is the address of the ALL NEW JENKINZ ?

pjf commented 9 years ago

@techman83: Updating statuses is done with some sort of webhook magic. If for some reason you don't have the permissions to set that on the GitHub side of things, then let me know and I can likely do so.

pjf commented 9 years ago

(Also, you rock; thank you!)

techman83 commented 9 years ago

@hakan42 -> http://52.10.41.70/ @pjf - I've used my account to add the OAuth token for the webhooks until I get a chance to sort out the bot's access.

It's probably just missing some configuration as I've whittled away most of the errors in the logs. I've no idea what the go is with the whitelist yet, but I was pretty worn out from GovHack yesterday.

There was a bug in the docker stuff which caused a herd of yaks to descend upon me yesterday: jenkinsci/docker-plugin#262, which appears to have a release coming out soon.

I was intending to look at the old CI box to grab the config info, but it's still MIA.

techman83 commented 9 years ago

Latest version of the docker plugin sorts the issues with it. It now spawns and terminates the containers on demand. I've also turned off the whitelist requirement.

We run as a local user without sudo access, so in theory if a malicious person submitted a dodgy pull request it wouldn't be able to break out of the container. But I'm no docker expert.

techman83 commented 9 years ago

So I think the GitHub Pull Request builder has a bug in it too!

INFO: NetKAN #42 main build action completed: SUCCESS
Jul 08, 2015 8:42:56 AM hudson.model.listeners.RunListener report
WARNING: RunListener failed
java.lang.NoClassDefFoundError: hudson/tasks/test/TestObject
    at org.jenkinsci.plugins.ghprb.manager.factory.GhprbBuildManagerFactoryUtil.getBuildManager(GhprbBuildManagerFactoryUtil.java:37)
    at org.jenkinsci.plugins.ghprb.extensions.status.GhprbSimpleStatus.onBuildComplete(GhprbSimpleStatus.java:155)
    at org.jenkinsci.plugins.ghprb.GhprbBuilds.onCompleted(GhprbBuilds.java:145)
    at org.jenkinsci.plugins.ghprb.GhprbBuildListener.onCompleted(GhprbBuildListener.java:27)
    at org.jenkinsci.plugins.ghprb.GhprbBuildListener.onCompleted(GhprbBuildListener.java:12)
    at hudson.model.listeners.RunListener.fireCompleted(RunListener.java:201)
    at hudson.model.Run.execute(Run.java:1786)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:381)
Caused by: java.lang.ClassNotFoundException: hudson.tasks.test.TestObject
    at jenkins.util.AntClassLoader.findClassInComponents(AntClassLoader.java:1376)
    at jenkins.util.AntClassLoader.findClass(AntClassLoader.java:1326)
    at jenkins.util.AntClassLoader.loadClass(AntClassLoader.java:1079)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 10 more

techman83 commented 9 years ago

Ok Jenkins is working now. Auto spawning docker containers as required and updating statuses on github.

@hakan42 there were a number of other jobs that were configured to run periodically, do you know what they were doing?

hakan42 commented 9 years ago

The periodic ones were checking sarbian's Jenkins and GitHub for pre-release versions. Basically, they launched netkan with branches of the NetKAN-dev repo.

techman83 commented 9 years ago

@hakan42 - Thanks. Is this something that needs to be re-configured, or are they not widely used? You should still have access to the new Jenkins.

hakan42 commented 9 years ago

Let me check during the weekend whether I can re-create them on the new Jenkins. I'll also coordinate with sarbian if and when we need support from his side as well.

techman83 commented 9 years ago

Alright thanks. I'll be about sporadically if things need changing on the backend.

sarbian commented 9 years ago

So, do you or I have to do something to bring the MJ dev repo back? Currently it is not reporting the latest version (and my Jenkins reports broken builds, since the build fails when trying to contact your old Jenkins).

hakan42 commented 9 years ago

Sorry, did not get around to doing anything here, unexpected amounts of Real Life making hacking impossible :)

I'll try my hand now this weekend, I guess I have to give @sarbian a new url to notify as well.

hakan42 commented 9 years ago

@sarbian , to make your MechJeb-Dev build work again, could you please replace the host name in the "HTTP Request" build step from http://ci.ksp-ckan.org/ to http://52.10.41.70/ ? You can test the example I have at https://ksp.sarbian.com/jenkins/job/Hakan-CaptainsLog/

In the meantime, I will take care that the other receiving jobs on the new Jenkins are properly set up.
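
For anyone else with jobs still pointing at the old CI host, the swap described above can be sketched as a small helper. This is only an illustration: the two host names come from this thread, but the function name `rewrite_ci_url` and the job path in the example are made up, and the real "HTTP Request" step may use a different path.

```shell
#!/bin/bash
# Host names taken from the comment above.
OLD_HOST="http://ci.ksp-ckan.org/"
NEW_HOST="http://52.10.41.70/"

# Print the URL with the old CI host prefix swapped for the new one.
# URLs that do not start with the old host are printed unchanged.
rewrite_ci_url() {
    local url="$1"
    case "$url" in
        "$OLD_HOST"*) printf '%s\n' "${NEW_HOST}${url#"$OLD_HOST"}" ;;
        *)            printf '%s\n' "$url" ;;
    esac
}

# Hypothetical job path, for illustration only.
rewrite_ci_url "http://ci.ksp-ckan.org/job/example-job/build"
# prints: http://52.10.41.70/job/example-job/build
```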

techman83 commented 9 years ago

@hakan42 - I've not stumbled across the magic that makes '#rebuild' do a thing. Can you point me in the direction of the option/plugin that triggers that?

hakan42 commented 9 years ago

@techman83 sorry, no real clue. Could you check whether there is a push notification from github coming in if you add a comment (any comment, actually) to a PR?

techman83 commented 9 years ago

@hakan42 - It's definitely getting comments + pr notifications.

Aug 04, 2015 7:55:31 PM org.jenkinsci.plugins.ghprb.GhprbWebHook handleWebHook
INFO: Got payload event: issue_comment

@Postremus - I vaguely recall you mentioning Jenkins experience.
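
A rough way to confirm which webhook events are reaching Jenkins is to count the "Got payload event" lines per event type. The sketch below assumes the log format shown in the excerpt above; the function name `count_payload_events` and the log path in the usage comment are assumptions, not part of the actual setup.

```shell
#!/bin/bash
# Count webhook payload events per type in a Jenkins log.
# Matches lines of the form "INFO: Got payload event: <type>".
count_payload_events() {
    local log_file="$1"
    grep -o 'Got payload event: [a-z_]*' "$log_file" \
        | awk '{ print $NF }' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Hypothetical usage; the log location depends on the install:
# count_payload_events /var/log/jenkins/jenkins.log
```

If the `issue_comment` count goes up after posting a '#rebuild' comment on a PR, the notification is arriving, and the missing piece is more likely on the Jenkins side than in GitHub's delivery.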

Postremus commented 9 years ago

Will take a look this evening.


techman83 commented 9 years ago

Docker backup is done via the following script:

#!/bin/bash
# Back up the Jenkins Docker image to S3 unless this image ID is already stored there.

IMAGE="jenkins-1"
# Look up the ID of this image's "latest" tag only, not of every image tagged "latest".
ID=$(docker images -q "${IMAGE}:latest")

if [ -z "$(aws s3 ls "s3://ckan-ci/docker/${ID}.tgz" | awk '{ print $4 }')" ]; then
    echo "Backup doesn't exist"
    echo "Backing up $IMAGE:$ID"
    docker save "$IMAGE" | gzip -c > "/tmp/${ID}.tgz"
    aws s3 cp "/tmp/${ID}.tgz" "s3://ckan-ci/docker/${ID}.tgz"
    rm -f "/tmp/${ID}.tgz"
else
    echo "Image already uploaded"
fi

Jenkins is configured to use ThinBackup, keeping 7 days of rolling backups, which are synced to S3 daily:

# backup jenkins to S3
00 3    * * *   root    aws s3 sync /var/ksp-ckan/jenkins/ s3://ckan-ci/jenkins --quiet

Only minor changes to the developer doco. The biggest thing is that the docker slaves are spawned on demand, so they don't need restarting manually, and the host shouldn't run out of storage, as the backups are set to be removed when older than 7 days.
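
To sanity-check that the 7-day retention is actually taking effect, something like the following could list any backup entries that have outlived it. This is a sketch, not part of the setup described above: the function name `list_stale_backups` is made up, and the backup directory has to be supplied by whoever runs it, since the ThinBackup path is not given in this thread.

```shell
#!/bin/bash
# List top-level entries in a backup directory that are older than N days.
# Note: find's -mtime +N drops fractional days, so an entry is listed only
# once it is at least N+1 full days old.
list_stale_backups() {
    local backup_dir="$1"
    local max_age_days="${2:-7}"
    find "$backup_dir" -mindepth 1 -maxdepth 1 -mtime +"$max_age_days" -print
}

# Hypothetical usage; substitute the real ThinBackup directory:
# list_stale_backups /path/to/thinbackup 7
```

An empty result means nothing older than the retention window is lying around; anything listed is a candidate for why the disk is filling up.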

politas commented 8 years ago

Has this been all sorted, now?

techman83 commented 8 years ago

Yeah, it's pretty solid. A human broke it earlier. 😏 @techman83

HebaruSan commented 4 years ago

Has this been all sorted, now?

Yeah, it's pretty solid.

Sorted means finished. 🏁