Open boazetsec opened 2 years ago
Thanks for submitting. Where are you fetching the libraries from (GitHub, GitLab, etc.)? If IOPS are looking good, then the only place this could be happening is the network.
I wonder if many concurrent requests to clone the same repository are resulting in rate limiting.
JTE uses Jenkins APIs to fetch from the remote repository so there's not a ton that can be done here unless JTE implements some custom caching.
An alternative approach that might help would be to package your libraries into a stand-alone Jenkins plugin so that you're not pulling from the remote repository every time.
A release of the Gradle JTE Plugin is pending
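To make the "custom caching" idea above concrete, here is a minimal sketch of a single-flight cache: when many jobs ask for the same repository and ref at once, only the first caller performs the fetch and the rest wait for its result. This is not how JTE or the Jenkins APIs work today. `SingleFlightCache` and the `fetch` callable are hypothetical names used purely for illustration, and Python is used only to keep the sketch short.

```python
import threading
from typing import Callable, Dict, Tuple


class SingleFlightCache:
    """Fetch each (repo, ref) pair at most once, even under concurrent callers.

    Hypothetical illustration only; not part of JTE or Jenkins.
    """

    def __init__(self, fetch: Callable[[str, str], str]) -> None:
        self._fetch = fetch
        self._guard = threading.Lock()  # protects creation of per-key locks
        self._locks: Dict[Tuple[str, str], threading.Lock] = {}
        self._results: Dict[Tuple[str, str], str] = {}

    def get(self, repo: str, ref: str) -> str:
        key = (repo, ref)
        # Create (or look up) the lock dedicated to this repo/ref pair.
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Only the first caller for this key performs the fetch;
            # later callers block on the lock, then read the cached result.
            if key not in self._results:
                self._results[key] = self._fetch(repo, ref)
            return self._results[key]
```

With a cache like this, 20 jobs requesting the same library ref would trigger one remote fetch instead of 20. A real implementation would also need invalidation when the ref moves, which this sketch leaves out.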
@steven-terrana Thanks for the reply. Regarding IOPS and throughput, we are pretty safe and are utilizing only a small percentage of what is available. We thought it might be related to rate limiting, as you suggested (we are using Bitbucket Cloud). If we hit the rate limit, is there any indication of that? Is there a retry mechanism implemented? Also, is it one API call per job to fetch the Git repo of the JTE library? If so, I don't think it would cause us to reach the rate limit in this case.
As for the network, we didn't see anything maxing out there, but once it happens again I'll double-check.
I am also facing a similar issue, where loading the libraries and configuration takes around 2 minutes for a single job. I am using Bitbucket Cloud and loading around 3 configurations and 10 libraries. Any suggestions?
I also faced this issue. Occasionally the load part takes 10-20 mins.
Hi @steven-terrana, thanks for mentioning the Gradle plugin. The Gradle JTE plugin looks good, but it will require a Jenkins restart whenever we upgrade the libraries, because we will have to update the plugin. It's a good option if your libraries have settled, but it may not be preferable if you are still maturing the libraries to meet the demands of different projects. As a different option, could the templating engine plugin download all the libraries together in one go from the remote repository instead of making multiple remote calls? Just throwing out an idea.
I'm getting the same issue. When I start builds for 1000 services concurrently, the JTE step that clones the libraries is very slow and leaves me stuck for 30-60 minutes, which costs me a lot of time. While that is happening, I see many steps fetching and cloning the repo from SCM (self-hosted GitLab):
git fetch --tags --force --progress --prune -- origin +refs/heads/stable:refs/remotes/origin/stable
/bin/sh /mnt/data-0/jenkins/caches/git-xxxxxxxxxxxxxxxx@tmp/jenkins-gitclient-xxxxxxx.sh-copy -o SendEnv=GIT_PROTOCOL git@gitlab.example.com git-upload-pack 'xxxxx/template-jenkins-xxxx.git'
ssh -i /mnt/data-0/jenkins/caches/git-xxxxxxxxxxxxxxxx@tmp/jenkins-gitclient-xxxxxxxx.key -l git -o StrictHostKeyChecking=accept-new -o HashKnownHosts=yes -o SendEnv=GIT_PROTOCOL git@gitlab.example.com git-upload-pack 'xxxxx/template-jenkins-xxxx.git'
Any ideas that could help?
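One way to check whether the SCM server itself slows down under concurrent requests, independent of Jenkins, is to time the same Git command run in parallel from the Jenkins host. The sketch below is only an assumption about how one might measure this; `time_command` and `time_concurrent` are hypothetical helper names, and the command to run is supplied by the caller.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def time_command(cmd: Sequence[str]) -> float:
    """Run one command to completion and return its wall-clock duration in seconds."""
    start = time.monotonic()
    subprocess.run(
        cmd,
        check=True,  # raise if the command fails, so failures are not timed silently
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.monotonic() - start


def time_concurrent(cmd: Sequence[str], workers: int) -> List[float]:
    """Run `workers` copies of the command in parallel and return each duration."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: time_command(cmd), range(workers)))
```

For example, calling `time_concurrent(["git", "ls-remote", "--heads", repo_url], workers=20)` with your own repository URL as `repo_url`, and comparing the durations against a run with `workers=1`, would show whether the remote responds more slowly as concurrency grows.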
Same issue here.
Jenkins Version
Jenkins 2.346.1
JTE Version
2.4
Bug Description
We're experiencing the same issue as well. We've noticed this happens mainly when multiple JTE jobs (10-20) are running at the same time: when the jobs are executed, they get stuck in the JTE library load step for around 10-15 minutes, which is a major pain. I'm trying to understand where the bottleneck is but haven't found it yet. Disk utilization seems normal in terms of IOPS and throughput. Is there some common resource on the Jenkins server that all the jobs use, so that they end up waiting for each other to finish? I'd appreciate the help, as we are neck-deep in our JTE implementation and this is becoming a serious issue for us. Thanks!
Relevant log output
No response
Steps to Reproduce
Execute multiple JTE jobs at the same time