dd010101 / vyos-jenkins

Instructions on how to build your own vyos package mirror for stable branches (1.3 equuleus/1.4 sagitta) with Jenkins (for ISO build)

Stage "amd64" skipped due to when conditional #8

Closed GurliGebis closed 1 month ago

GurliGebis commented 1 month ago

I'm trying to do the setup, as part of making the install scripts.

Every build seems to skip amd64, with the message Stage "amd64" skipped due to when conditional

The Built-In Node is configured with the labels: Docker ec2_amd64 docker. The system environment variables are:

ARM64_BUILD_DISABLED=true
CUSTOM_BUILD_CHECK_DISABLED=true
DEV_PACKAGES_VYOS_NET_HOST=jenkins@172.17.17.17

Here is the console output for building vyos-1x for the sagitta branch:

Branch indexing
 > git rev-parse --resolve-git-dir /var/lib/jenkins/caches/git-a63153b0a42b565fb090841c8b0b4cd0/.git # timeout=10
Setting origin to https://github.com/vyos/vyos-1x.git
 > git config remote.origin.url https://github.com/vyos/vyos-1x.git # timeout=10
Fetching origin...
Fetching upstream changes from origin
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git config --get remote.origin.url # timeout=10
 > git fetch --tags --force --progress -- origin +refs/heads/*:refs/remotes/origin/* # timeout=10
Seen branch in repository origin/crux
Seen branch in repository origin/current
Seen branch in repository origin/equuleus
Seen branch in repository origin/feature/T6399-workflows-codeowner-sagitta
Seen branch in repository origin/feature/T6399-workflows-codeowners-equuleus
Seen branch in repository origin/mergify/bp/sagitta/pr-3532
Seen branch in repository origin/sagitta
Seen 7 remote branches
Obtained Jenkinsfile from 0bada0f998c551f1b53686de3e93a6de8fd84d37
Loading library vyos-build@current
Attempting to resolve current from remote references...
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git ls-remote -h -- https://github.com/dd010101/vyos-build.git # timeout=10
Found match: refs/heads/current revision 4e1cbb2fc29a4050c27fd4c19d287a13b28b3259
The recommended git tool is: git
No credentials specified
Cloning the remote Git repository
Cloning with configured refspecs honoured and without tags
Cloning repository https://github.com/dd010101/vyos-build.git
 > git init /var/lib/jenkins/workspace/vyos-1x_sagitta@libs/28aa1d0271a549162013077de4e453731169b3c81b3abd6a4e0ca4a64986f3e8 # timeout=10
Fetching upstream changes from https://github.com/dd010101/vyos-build.git
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git fetch --no-tags --force --progress -- https://github.com/dd010101/vyos-build.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/dd010101/vyos-build.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
Avoid second fetch
Checking out Revision 4e1cbb2fc29a4050c27fd4c19d287a13b28b3259 (current)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4e1cbb2fc29a4050c27fd4c19d287a13b28b3259 # timeout=10
Commit message: "Merge branch vyos:current into current"
First time build. Skipping changelog.
[Pipeline] Start of Pipeline
[Pipeline] timeout
Timeout set to expire in 4 hr 0 min
[Pipeline] {
[Pipeline] timestamps
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Define Agent)
[Pipeline] node
22:24:59  Still waiting to schedule task
22:24:59  Waiting for next available executor
22:29:33  Running on [Jenkins](http://172.16.15.96:8080/computer/(built-in)/) in /var/lib/jenkins/workspace/vyos-1x_sagitta
[Pipeline] {
[Pipeline] checkout
22:29:35  The recommended git tool is: git
22:29:35  No credentials specified
22:29:35  Cloning the remote Git repository
22:29:35  Cloning repository https://github.com/vyos/vyos-1x.git
22:29:35   > git init /var/lib/jenkins/workspace/vyos-1x_sagitta # timeout=10
22:29:35  Fetching upstream changes from https://github.com/vyos/vyos-1x.git
22:29:35   > git --version # timeout=10
22:29:35   > git --version # 'git version 2.39.2'
22:29:35   > git fetch --tags --force --progress -- https://github.com/vyos/vyos-1x.git +refs/heads/*:refs/remotes/origin/* # timeout=10
22:29:39   > git config remote.origin.url https://github.com/vyos/vyos-1x.git # timeout=10
22:29:39   > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
22:29:39  Avoid second fetch
22:29:39  Checking out Revision 0bada0f998c551f1b53686de3e93a6de8fd84d37 (sagitta)
22:29:39   > git config core.sparsecheckout # timeout=10
22:29:39   > git checkout -f 0bada0f998c551f1b53686de3e93a6de8fd84d37 # timeout=10
22:29:39  Commit message: "Merge pull request #3544 from vyos/mergify/bp/sagitta/pr-3541"
22:29:39  First time build. Skipping changelog.
[Pipeline] withEnv
[Pipeline] {
[Pipeline] echo
22:29:42  Warning, empty changelog. Probably because this is the first build.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
Stage "Define Agent" skipped due to when conditional
[Pipeline] getContext
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Build Code)
[Pipeline] echo
22:29:51  Warning, empty changelog. Probably because this is the first build.
Stage "Build Code" skipped due to when conditional
[Pipeline] getContext
[Pipeline] parallel
[Pipeline] { (Branch: amd64)
[Pipeline] { (Branch: arm64)
[Pipeline] stage
[Pipeline] { (amd64)
[Pipeline] stage
[Pipeline] { (arm64)
Stage "amd64" skipped due to when conditional
[Pipeline] getContext
[Pipeline] }
Stage "arm64" skipped due to when conditional
[Pipeline] getContext
[Pipeline] }
[Pipeline] // stage
[Pipeline] // stage
[Pipeline] }
[Pipeline] }
[Pipeline] // parallel
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Finalize)
[Pipeline] node
22:30:26  Still waiting to schedule task
22:30:26  Waiting for next available executor
22:36:05  Running on [Jenkins](http://172.16.15.96:8080/computer/(built-in)/) in /var/lib/jenkins/workspace/vyos-1x_sagitta
[Pipeline] {
[Pipeline] checkout
22:36:07  The recommended git tool is: git
22:36:07  No credentials specified
22:36:07   > git rev-parse --resolve-git-dir /var/lib/jenkins/workspace/vyos-1x_sagitta/.git # timeout=10
22:36:07  Fetching changes from the remote Git repository
22:36:07   > git config remote.origin.url https://github.com/vyos/vyos-1x.git # timeout=10
22:36:07  Fetching upstream changes from https://github.com/vyos/vyos-1x.git
22:36:07   > git --version # timeout=10
22:36:07   > git --version # 'git version 2.39.2'
22:36:07   > git fetch --tags --force --progress -- https://github.com/vyos/vyos-1x.git +refs/heads/*:refs/remotes/origin/* # timeout=10
22:36:07  Checking out Revision 0bada0f998c551f1b53686de3e93a6de8fd84d37 (sagitta)
22:36:07   > git config core.sparsecheckout # timeout=10
22:36:07   > git checkout -f 0bada0f998c551f1b53686de3e93a6de8fd84d37 # timeout=10
22:36:07  Commit message: "Merge pull request #3544 from vyos/mergify/bp/sagitta/pr-3541"
[Pipeline] withEnv
[Pipeline] {
[Pipeline] echo
22:36:09  Warning, empty changelog. Probably because this is the first build.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
Stage "Finalize" skipped due to when conditional
[Pipeline] getContext
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] End of Pipeline
Finished: SUCCESS
GurliGebis commented 1 month ago

Seems like this happens when the jobs are being seeded for the first time. Triggering the seed-jobs.sh build script seems to get them to build.

Is this to be expected?

dd010101 commented 1 month ago

Right, the first line gives you a hint about what is going on: Branch indexing.

The build you're seeing isn't the actual build. Multibranch Pipelines work by discovering branches from the repository. When the job is first defined, Jenkins has no idea what branches the repository has, so it triggers a build that does only branch indexing and automatically skips all steps. This is required since without branch information Jenkins doesn't even know what to build.

That's why it's not possible to build jobs right away when you define them - there isn't anything to build yet. After the first branch indexing completes, you can trigger an actual build of what Jenkins discovered.

I don't like how Jenkins presents branch indexing as a build, since that's just confusing, but that's what it does. When you look at the build in Jenkins it says it's branch indexing, so you need to check what you are actually looking at.

GurliGebis commented 1 month ago

Indeed you are right 😃 Maybe I shouldn't mess around with those things when I'm tired 😃

Thanks.

Btw. do you know if env vars and stuff can be configured using the REST API? That way, we can automate even more of the initial setup.

dd010101 commented 1 month ago

The documentation for the Jenkins REST API is kind of hard to come by. I started just by searching for examples from other people who did make it work, but that doesn't necessarily get you what you want specifically - it gives you an idea of how to do it generally. There is generic information on how the API works - https://www.jenkins.io/doc/book/using/remote-access-api/.

But there is no public reference where you would see all the endpoints or the ways you can interact with them. Instead, what you should use is Jenkins itself. If you go to your Jenkins and add /api/ to the end of the URL where you are, then you get some information. This only works in some places - generally on pages that represent some entity, like a job, a node, ...

For example, if you are interested in the built-in node, you navigate in Jenkins to the built-in node's API page: http://JENKINS_HOST:8080/manage/computer/(built-in)/api/xml.

I didn't find an example or a way to change the system configuration though. Maybe it's somewhere, but I didn't find it.

Perhaps for these kinds of tasks the REST API isn't the right tool...

I did find the system configuration in internal files in /var/lib/jenkins:

- /var/lib/jenkins/config.xml (general configuration)
- /var/lib/jenkins/credentials.xml (SSH key)
- /var/lib/jenkins/org.jenkinsci.plugins.workflow.libs.GlobalLibraries.xml (pipeline library)

...and there are others.

Thus another, perhaps easier, way would be to shut down Jenkins and modify those .xml files as needed, then start Jenkins already configured. I wouldn't just replace those files with some examples - that seems like a bad idea, since it could break things (due to a different environment/version). Instead, I would suggest making a copy, modifying the copy, and if that works, putting the modified copy back in /var/lib/jenkins. It would be nice to leave the original somewhere, for example as /var/lib/jenkins/config.xml.orig, in case the process goes wrong.
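
As a hedged sketch of that backup-and-revert idea (the file names are the ones listed above; the function names and flow are my own assumption, not the project's actual tooling):

```shell
#!/bin/sh
# Sketch: keep a pristine .orig copy of each Jenkins config file before
# editing, and allow restoring it if the modification goes wrong.
backup_jenkins_config() {
    home=$1
    for f in config.xml credentials.xml \
        org.jenkinsci.plugins.workflow.libs.GlobalLibraries.xml; do
        # only back up on the first run, so .orig stays pristine
        if [ -f "$home/$f" ] && [ ! -f "$home/$f.orig" ]; then
            cp "$home/$f" "$home/$f.orig"
        fi
    done
}

restore_jenkins_config() {
    home=$1
    for f in "$home"/*.orig; do
        [ -f "$f" ] || continue
        cp "$f" "${f%.orig}"   # put the original back in place
    done
}
```

Usage would be: stop Jenkins, run `backup_jenkins_config /var/lib/jenkins`, edit the live files, start Jenkins, and call `restore_jenkins_config /var/lib/jenkins` if anything breaks.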

dd010101 commented 1 month ago

I did try to manually modify /var/lib/jenkins/config.xml by adding another environment variable the way described above, and it works - what I added, Jenkins treats as its own. The format is a bit funny - for example:

<tree-map>
  <default>
    <comparator class="java.lang.String$CaseInsensitiveComparator" reference="../../../../../../views/listView/jobNames/comparator"/>
  </default>
  <int>3</int>
  <string>ARM64_BUILD_DISABLED</string>
  <string>true</string>
  <string>CUSTOM_BUILD_CHECK_DISABLED</string>
  <string>true</string>
  <string>DEV_PACKAGES_VYOS_NET_HOST</string>
  <string>jenkins@172.17.17.17</string>
</tree-map>

There is no structure for the variables; instead there is a counter and a series of <string> elements - twice as many as the counter value, since each variable is a name/value pair... Weird, but it works: you add another variable by adding two <string> elements and incrementing the counter by 1. Most of the settings are represented as a proper structure though.
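
To illustrate that counter-plus-strings format, here is a hedged sketch that appends one variable the way described above (a crude sed approach assuming GNU sed and a single <int> element in the fragment; xmlstarlet would be more robust):

```shell
#!/bin/sh
# Hypothetical helper: add one name/value pair to the <tree-map> fragment
# shown above by appending two <string> elements and bumping the <int>
# counter by one.
add_env_var() {
    file=$1 name=$2 value=$3
    # read the current counter value
    count=$(sed -n 's|.*<int>\([0-9]*\)</int>.*|\1|p' "$file")
    sed -i \
        -e "s|<int>$count</int>|<int>$((count + 1))</int>|" \
        -e "s|</tree-map>|  <string>$name</string>\n  <string>$value</string>\n</tree-map>|" \
        "$file"
}
```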

GurliGebis commented 1 month ago

Not sure I am a fan of that, since the risk of breaking it is too big.

I got my setup running with all packages building 🙂 Now I just need to script the last part, and then test it all on a clean VM.

dd010101 commented 1 month ago

Jenkins also has a CLI client - https://www.jenkins.io/doc/book/managing/cli/ - for remote control.

You can get it from your Jenkins:

wget http://172.17.17.17:8080/jnlpJars/jenkins-cli.jar

Then use it with the usual user/token as you would the REST API:

java -jar jenkins-cli.jar -http -s http://172.17.17.17:8080 -auth "$JENKINS_USER:$JENKINS_TOKEN" help

This is what I get for available actions:

  add-job-to-view
    Adds jobs to view.
  build
    Builds a job, and optionally waits until its completion.
  cancel-quiet-down
    Cancel the effect of the "quiet-down" command.
  clear-queue
    Clears the build queue.
  connect-node
    Reconnect to a node(s)
  console
    Retrieves console output of a build.
  copy-job
    Copies a job.
  create-credentials-by-xml
    Create Credential by XML
  create-credentials-domain-by-xml
    Create Credentials Domain by XML
  create-job
    Creates a new job by reading stdin as a configuration XML file.
  create-node
    Creates a new node by reading stdin as a XML configuration.
  create-view
    Creates a new view by reading stdin as a XML configuration.
  declarative-linter
    Validate a Jenkinsfile containing a Declarative Pipeline
  delete-builds
    Deletes build record(s).
  delete-credentials
    Delete a Credential
  delete-credentials-domain
    Delete a Credentials Domain
  delete-job
    Deletes job(s).
  delete-node
    Deletes node(s)
  delete-view
    Deletes view(s).
  disable-job
    Disables a job.
  disable-plugin
    Disable one or more installed plugins.
  disconnect-node
    Disconnects from a node.
  enable-job
    Enables a job.
  enable-plugin
    Enables one or more installed plugins transitively.
  get-credentials-as-xml
    Get a Credentials as XML (secrets redacted)
  get-credentials-domain-as-xml
    Get a Credentials Domain as XML
  get-gradle
    List available gradle installations
  get-job
    Dumps the job definition XML to stdout.
  get-node
    Dumps the node definition XML to stdout.
  get-view
    Dumps the view definition XML to stdout.
  groovy
    Executes the specified Groovy script.
  groovysh
    Runs an interactive groovy shell.
  help
    Lists all the available commands or a detailed description of single command.
  import-credentials-as-xml
    Import credentials as XML. The output of "list-credentials-as-xml" can be used as input here as is, the only needed change is to set the actual Secrets which are redacted in the output.
  install-plugin
    Installs a plugin either from a file, an URL, or from update center.
  keep-build
    Mark the build to keep the build forever.
  list-changes
    Dumps the changelog for the specified build(s).
  list-credentials
    Lists the Credentials in a specific Store
  list-credentials-as-xml
    Export credentials as XML. The output of this command can be used as input for "import-credentials-as-xml" as is, the only needed change is to set the actual Secrets which are redacted in the output.
  list-credentials-context-resolvers
    List Credentials Context Resolvers
  list-credentials-providers
    List Credentials Providers
  list-jobs
    Lists all jobs in a specific view or item group.
  list-plugins
    Outputs a list of installed plugins.
  mail
    Reads stdin and sends that out as an e-mail.
  offline-node
    Stop using a node for performing builds temporarily, until the next "online-node" command.
  online-node
    Resume using a node for performing builds, to cancel out the earlier "offline-node" command.
  quiet-down
Quiet down Jenkins, in preparation for a restart. Don't start any builds.
  reload-configuration
    Discard all the loaded data in memory and reload everything from file system. Useful when you modified config files directly on disk.
  reload-job
    Reload job(s)
  remove-job-from-view
    Removes jobs from view.
  replay-pipeline
    Replay a Pipeline build with edited script taken from standard input
  restart
    Restart Jenkins.
  restart-from-stage
    Restart a completed Declarative Pipeline build from a given stage.
  safe-restart
Safe Restart Jenkins. Don't start any builds.
  safe-shutdown
    Puts Jenkins into the quiet mode, wait for existing builds to be completed, and then shut down Jenkins.
  session-id
    Outputs the session ID, which changes every time Jenkins restarts.
  set-build-description
    Sets the description of a build.
  set-build-display-name
    Sets the displayName of a build.
  shutdown
    Immediately shuts down Jenkins server.
  stop-builds
    Stop all running builds for job(s)
  update-credentials-by-xml
    Update Credentials by XML
  update-credentials-domain-by-xml
    Update Credentials Domain by XML
  update-job
    Updates the job definition XML from stdin. The opposite of the get-job command.
  update-node
    Updates the node definition XML from stdin. The opposite of the get-node command.
  update-view
    Updates the view definition XML from stdin. The opposite of the get-view command.
  version
    Outputs the current version.
  wait-node-offline
    Wait for a node to become offline.
  wait-node-online
    Wait for a node to become online.
  who-am-i
    Reports your credential and permissions.

install-plugin is interesting - there is an example of how to use it here: https://gist.github.com/basmussen/8182784

I don't see anything in the CLI commands that would be related to system configuration, environment variables, pipeline plugins, ...

  reload-configuration
    Discard all the loaded data in memory and reload everything from file system. Useful when you modified config files directly on disk.

What? Modifying those internal files is the official way? 😮

If you compare what is in

/var/lib/jenkins/jobs/dropbear/config.xml

and what the REST API or CLI returns:

java -jar jenkins-cli.jar -http -s http://172.17.17.17:8080 -auth "$JENKINS_USER:$JENKINS_TOKEN" get-job dropbear

It's the same config.xml! 😄

Jenkins has 3 different methods of remote control:

- CLI with the JAR
- CLI over SSH (Jenkins runs its own internal SSH server)
- REST API

CLI over JAR is supposed to be the easiest way.

Yet it seems like all 3 of those options are the same thing. The API/CLI gives you a nicer interface for some actions (like build), but if you want to create a job, for example, then you need to upload the XML in the same format as what is on disk - so you just pipe config.xml into create-job/update-job:

java -jar jenkins-cli.jar create-job NAME
Creates a new job by reading stdin as a configuration XML file.
 NAME : Name of the job to create

That's what the REST API does as well. We can't escape the XML that is on disk, it seems...
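
That round-trip can be sketched as a small wrapper (the host, user and token are the placeholder values used earlier in this thread, not verified; the function name is my own):

```shell
#!/bin/sh
# Hypothetical wrapper: the CLI reads and writes the same config.xml that
# Jenkins keeps on disk, so job management is just piping XML around.
jenkins_cli() {
    java -jar jenkins-cli.jar -http \
        -s "${JENKINS_URL:-http://172.17.17.17:8080}" \
        -auth "$JENKINS_USER:$JENKINS_TOKEN" "$@"
}

# Dump a job definition, edit it, push it back:
#   jenkins_cli get-job dropbear > dropbear.xml
#   ...edit dropbear.xml...
#   jenkins_cli update-job dropbear < dropbear.xml
```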

GurliGebis commented 1 month ago

I can try and install a clean Jenkins, save the config, make the changes and then diff the config files - maybe the changes can be done in a safe way.

It's worth a try at least 😊 If nothing else, having a well-written guide for making the changes is also okay.

dd010101 commented 1 month ago

You don't necessarily need to create a diff. I would suggest reading the XML of an already configured Jenkins - you will understand where everything is, and what it means, from the values you see. You iterate until you find all the values you want to change, as in the manual guide. Then you have a collection of XML snippets you want to insert or modify in specific XML files.

If you do the manipulation of the XML with proper tools, like xmlstarlet, then you reduce the chance of a malformed result. There is always a chance that a future Jenkins will differ so much that the modification would result in a broken configuration. That's why the manipulation script should make a backup of every file it modifies, so there is an easy way to call some kind of revert command that restores the original files to the state before modification - then the user can follow the manual guide as a workaround.

Plugins require installation - it can be done via CLI JAR or REST API.

I think this way it should be possible to automate the Jenkins configuration in a relatively safe manner, with a fallback to the manual way if something goes wrong.

GurliGebis commented 1 month ago

Okay 😊 I'll see what I can do when I get time to look at it again this weekend (hopefully).

GurliGebis commented 1 month ago

A small update - so far I have everything automated; only the import of the projects and triggering them is missing. And then a complete test on a clean Debian machine.

dd010101 commented 1 month ago

Looks good, you've done a lot of work!

The seeding of jobs and the first build requires waiting for Jenkins to finish branch discovery - do you plan to automate this as well? There would need to be some spinning loop that checks whether Jenkins still has unfinished runs. I know that if you trigger a build immediately, the runs fail because the jobs don't know their branches yet. It would be nice to just queue the jobs instead of waiting and then triggering - but that doesn't seem to be possible? If you have enough executors, they will try to build before another executor finishes discovery, I think.

My original script did build right after job creation, and that didn't work. Maybe some static sleep would work? Like 10 seconds to let the branch indexing runs initialize, and then maybe Jenkins would queue the actual builds? I didn't try that.

dd010101 commented 1 month ago

I added another job that needs additional Jenkins configuration. Additional jobs/packages are automatically handled by the seed-jobs.sh/jobs.json script, but this one requires an additional Jenkins environment variable for configuration. See Configure environment variables and add global vyos-build Jenkins library for CUSTOM_DOCKER_REPO.

GurliGebis commented 1 month ago

I plan on that, yes. Also, I need to find a way to not schedule everything at once, since the jobs time out if everything is queued at once for the first build.

dd010101 commented 1 month ago

What kind of timeouts do you see? Like the runs expire before they run? There is also the possibility that a run expires in the middle of its steps. I know there is a 240-minute limit for a job to complete its build - is this what causes the issue, or is it another timeout?

I didn't see such an issue. I'm using 8 executors and a fairly performant system though, because I didn't want to wait extra for testing purposes - perhaps that's why.

It would be good to handle this on the Jenkins side - so it just waits, even a day? There is also the possibility to send 10 jobs, wait for a free queue, then send another 10, but we don't know if 10 is too much - there is also the option to send them one by one, but that is very slow. The script would need to wait in this case, and that's not good for users - it would be better if the script exited and Jenkins did the work in the background. Running the script itself in the background is not great either, since then there isn't clear feedback.

GurliGebis commented 1 month ago

They expire before the 120 minutes (I think it is), because there are only a few build agents, and building everything at once makes the jobs wait too long (since they all complete one step, then go back into the queue for the next step).

Anyway, I just tried building the container images - it fails when it tries to branch index them:

Branch indexing
 > git rev-parse --resolve-git-dir /var/lib/jenkins/caches/git-014f40594d2ff9ee882fdf7dc8a7fdfe/.git # timeout=10
Setting origin to https://github.com/dd010101/vyos-missing.git
 > git config remote.origin.url https://github.com/dd010101/vyos-missing.git # timeout=10
Fetching origin...
Fetching upstream changes from origin
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git config --get remote.origin.url # timeout=10
 > git fetch --tags --force --progress -- origin +refs/heads/*:refs/remotes/origin/* # timeout=10
Seen branch in repository origin/current
Seen branch in repository origin/equuleus
Seen branch in repository origin/sagitta
Seen 3 remote branches
Obtained packages/vyos-build-container/Jenkinsfile from 71d739ce597a77e8d2a4be3022f08f36bb7a4ab4
ERROR: Could not find any definition of libraries [vyos-build@sagitta]
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
WorkflowScript: Loading libraries failed

1 error

    at org.codehaus.groovy.control.ErrorCollector.failIfErrors(ErrorCollector.java:309)
    at org.codehaus.groovy.control.CompilationUnit.applyToPrimaryClassNodes(CompilationUnit.java:1107)
    at org.codehaus.groovy.control.CompilationUnit.doPhaseOperation(CompilationUnit.java:624)
    at org.codehaus.groovy.control.CompilationUnit.processPhaseOperations(CompilationUnit.java:602)
    at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:579)
    at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:323)
    at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:293)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox$Scope.parse(GroovySandbox.java:163)
    at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.doParse(CpsGroovyShell.java:190)
    at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.reparse(CpsGroovyShell.java:175)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.parseScript(CpsFlowExecution.java:635)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.start(CpsFlowExecution.java:581)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:335)
    at hudson.model.ResourceController.execute(ResourceController.java:101)
    at hudson.model.Executor.run(Executor.java:442)
Finished: FAILURE
GurliGebis commented 1 month ago

(I did add the needed env var, but that doesn't seem to have anything to do with it).

GurliGebis commented 1 month ago

My guess would be that it clones the vyos-missing repo, but the vyos-build library is defined in the vyos-build repo.

dd010101 commented 1 month ago

Do you have the vyos-build Jenkins pipeline library configured in the System settings? The env var is used later - your run fails before anything runs. Do other sagitta jobs run fine? The library is used by all sagitta jobs.

Started by user [root](http://:8080/user/root)
 > git rev-parse --resolve-git-dir /var/lib/jenkins/caches/git-014f40594d2ff9ee882fdf7dc8a7fdfe/.git # timeout=10
Setting origin to https://github.com/dd010101/vyos-missing.git
 > git config remote.origin.url https://github.com/dd010101/vyos-missing.git # timeout=10
Fetching origin...
Fetching upstream changes from origin
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git config --get remote.origin.url # timeout=10
 > git fetch --tags --force --progress -- origin +refs/heads/*:refs/remotes/origin/* # timeout=10
Seen branch in repository origin/current
Seen branch in repository origin/equuleus
Seen branch in repository origin/sagitta
Seen 3 remote branches
Obtained packages/vyos-build-container/Jenkinsfile from 7e01b3ff48cc8fc69dc1eb0522afedecadf13d70
Loading library vyos-build@sagitta
Attempting to resolve sagitta from remote references...
 > git --version # timeout=10
 > git --version # 'git version 2.39.2'
 > git ls-remote -h -- https://github.com/dd010101/vyos-build.git # timeout=10
Found match: refs/heads/sagitta revision f52e36e619f32e5c7cc536b9398687c8061cb269

Here is a snippet from a successful run.

GurliGebis commented 1 month ago

I will take a look once I get back to my machine later tonight. But most likely not 🙂 I plan on setting up that job first, and once it has built the images, it will provision the other jobs.

dd010101 commented 1 month ago

It's described in Configure environment variables and add global vyos-build Jenkins library - Global Pipeline Libraries -> Add. I would suggest not bothering with this job from the start - running the first build in bash is easier and gives more feedback. Then the job can do maintenance.

dd010101 commented 1 month ago

They expire before the 120 minutes (I think it is), because there are only a few build agents, and building everything at once makes the jobs wait too long (since they all complete one step, then go back into the queue for the next step).

Here is the 240-minute timeout, and I don't see any other. It would be great if you posted the build log of a job that fails because of the timeout, to see if there are clues about which timeout it is. It's annoying how the jobs do the first step and then pause before all jobs do the first step, but that's how the scripts are written.

GurliGebis commented 1 month ago

Having Jenkins build the container images is actually a simplification of the setup 🙂 Waiting for that build is easy - just sleep a while and check if the images are there yet. I will try with the clean setup once I get that far, and if it fails, I will post some logs.

I will be back later with more info (or tomorrow, depending on how long it takes).

dd010101 commented 1 month ago

Waiting for that build is easy, just sleep a while and check if the images are there yet.

Like docker images | grep? 🤔 Using the existing bash script would be easier - but you'd like to do it better! 👍

BTW: curl -Ss -g --fail-with-body "$JENKINS_URL/computer/api/json" | jq .computer[0].idle gives "true" when all is done and "false" when something is running - could come in handy if you haven't looked at it yet.
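
Building on that one-liner, a waiting loop could look like this (a sketch assuming jq is available; the function names and the 30-second interval are my own choices, and this checks all executors rather than just the first):

```shell
#!/bin/sh
# Sketch: poll the Jenkins computer API until every executor reports idle.
all_idle() {
    # prints "true" only if every entry in .computer[] has idle == true
    jq -r '[.computer[].idle] | all'
}

wait_for_idle() {
    # $1 is the Jenkins base URL, e.g. http://172.17.17.17:8080
    until curl -Ss -g --fail-with-body "$1/computer/api/json" \
            | all_idle | grep -qx true; do
        sleep 30
    done
}
```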

GurliGebis commented 1 month ago

nice, I'll keep that one in mind.

Btw. you might want to call docker buildx prune -f when you remove the old images. Otherwise the build finishes instantly, using the buildx cache (so no new image is ever built).

GurliGebis commented 1 month ago

I got it working btw. - I had forgotten to set the library in my install script - now it does that as well.

dd010101 commented 1 month ago

Btw. you might want to call docker buildx prune -f when you remove the old images.

Very good point.

If Docker doesn't build the image every time, that isn't a problem in itself. Docker checks each step, builds only the changed parts, combines the pieces, and then calculates the hash of the resulting image (layer) - in your case an image (layer) already exists with the same hash, so there is no need to create another one, since it would be an identical copy anyway - that's expected behavior.

The cache will cause issues eventually though. Steps like apt install something or wget example.com/something-latest.tar.gz don't change in the Dockerfile for a long time, but their result will change at some point - so if we use the cache long enough, such parts become more and more out of date as time goes on. Docker doesn't have the ability to disable the cache for such variable steps, and the cache doesn't expire either. That's why we can't use the cache for long-term builds - it's great when you're running builds one after another for development purposes, but if we want to build the image in the future, then we can't use the cache.

GurliGebis commented 1 month ago

Yep 🙂

I think I spotted an issue. First we remove the old image, and then we build the new one.

If the build fails for some reason, we are left with no image, and all the other projects will be unable to build until it is fixed.

Would it be better to store the current image id in a variable, build the image, and if that goes OK, remove the old image? That way, we never remove the only image there is, and things keep working. This also ensures other projects can build in parallel with this project.
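
That ordering could be sketched like this (the image tag, build context, and function names are illustrative placeholders, not the project's actual values):

```shell
#!/bin/sh
# Hypothetical flow: only remove the previous image after the new build
# succeeded, so a failed build never leaves us without an image.
should_remove_old() {
    # remove the pre-build image id only when it exists and differs from
    # the freshly built id (an unchanged rebuild reuses the same id)
    old=$1 new=$2
    [ -n "$old" ] && [ "$old" != "$new" ]
}

rebuild_image() {
    tag=$1 context=$2
    old_id=$(docker images -q "$tag")          # remember the current image
    docker build --no-cache -t "$tag" "$context" || return 1
    new_id=$(docker images -q "$tag")
    if should_remove_old "$old_id" "$new_id"; then
        docker image rm "$old_id"              # cleanup only after success
    fi
}
```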

dd010101 commented 1 month ago

There is another error I made. The vyos-build-container job can't be in the vyos-missing repository - then Jenkins watches whether the script in vyos-build-container changed, which is pointless; we need to watch vyos-build/docker instead.

Simple script with so many mistakes!

GurliGebis commented 1 month ago

But it keeps getting better with every bugfix 🙂

dd010101 commented 1 month ago

The location and the cleanup after a successful build & push should be fixed now. The vyos-build-container now lives in https://github.com/dd010101/vyos-build.git - the rest stays the same, and jobs.json is updated.

GurliGebis commented 1 month ago

Nice, I'll update my setup and try it again later today 🙂

GurliGebis commented 1 month ago

Status for today - I have automated the setup of the vyos-build-container job, and I know how to make the script wait for it and print the status - but that's a job for sometime during the weekend.

dd010101 commented 1 month ago

I will close this one since we have issue #20 to continue in.