Closed MarkEWaite closed 1 year ago
Nodejs dist can’t be made private as it will break builds as far as I know.
npm is likely the same depending on if it’s caching the npm binary or dependencies, (binary is the one that would cause problems)
How are npm and Node.js used with Artifactory? The infra team does not really know.
Every CI build for a plugin that uses node / npm and core will download it from the mirror: https://github.com/jenkinsci/plugin-pom/blob/d749834a565493e17df4598e4d394749ff51dd00/pom.xml#L131-L132
Thanks for the explanation @timja !
As pointed out by @lemeurherve, we should look at putting this usage behind ACP, as we do for Maven, to decrease the amount of data downloaded from JFrog
Timing proposal discussed during the weekly meeting:
jgit brownout: Friday 2 June
maven-repo1 brownout: 5 or 6 June
Update: moved the brownout tests.
Proposal (ping @MarkEWaite @smerle33 @lemeurherve for voting +1 or -1 to this message)
jgit: Thursday 8 June at 12h30 UTC (@MarkEWaite we can do it as we have a 1:1 at this time)
maven-repo1: Monday 12 June 12h30 if (and only if) the jgit test is successful.
If we get a majority of votes, I'll open a status page and send an email to the developers
+1, in case a written vote is needed rather than an emoji
Thanks folks! I need review (and approval if ok) on https://github.com/jenkins-infra/status/pull/310 then, before I send the email
+1 from me
email thread for jgit brownout: https://groups.google.com/g/jenkinsci-dev
Update:
@jtnord suggested an alternate approach that can reduce bandwidth use without requiring changes to new releases of pom files and updates to those pom files. The proposal does not implement all the requested changes, but it implements the most significant bandwidth reductions.
The Maven root pom defines Maven central as a repository. We cache Maven central on https://repo.jenkins-ci.org under the virtual repository https://repo.jenkins-ci.org/public . If we password protect that cached copy of Maven central, then Jenkins core and Jenkins plugin builds will automatically fall back to requesting the artifacts from Maven central directly.
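For reference, the central repository that this fallback relies on is declared in Maven's built-in super POM. The excerpt below is a sketch from memory of the Maven 3.x super POM, assuming that "Maven root pom" above refers to it; the exact content may differ between Maven versions.

```xml
<!-- Sketch of the repository declared by Maven's super POM (assumption: Maven 3.x). -->
<repositories>
  <repository>
    <!-- The id "central" is what mirrorOf patterns in settings.xml match against. -->
    <id>central</id>
    <name>Central Repository</name>
    <url>https://repo.maven.apache.org/maven2</url>
    <snapshots>
      <!-- Only release artifacts are requested from central. -->
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```

Every build inherits this declaration unless a mirror setting redirects the central id elsewhere, which is why the mirror configuration matters for the fallback.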
@jtnord noted that we can approximate a password protected Maven central cache with the mirrorOf configuration in the settings.xml file of the current user. Specifically, I've added the following entries to my ~/.m2/settings.xml and confirmed that I can build Jenkins core and more than 20 Jenkins plugins:
<mirrors>
  <mirror>
    <id>repo.jenkins-ci.org</id>
    <url>https://repo.jenkins-ci.org/releases/</url>
    <mirrorOf>!repo1,!central,!incrementals,!jcenter-cache.jenkins-ci.org,!jgit-repository,*</mirrorOf>
  </mirror>
</mirrors>
That mirror definition declares that https://repo.jenkins-ci.org/releases is a mirror of all repositories except repositories with the identifiers repo1, central, incrementals, jcenter-cache.jenkins-ci.org, and jgit-repository.
I included a Jenkins profile in my ~/.m2/settings.xml file to define those repositories:
<profiles>
  <profile>
    <id>jenkins</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <repositories>
      <repository>
        <id>repo.jenkins-ci.org</id>
        <url>https://repo.jenkins-ci.org/releases/</url>
      </repository>
      <repository>
        <id>jcenter-cache.jenkins-ci.org</id>
        <url>https://repo.jenkins-ci.org/jcenter-cache/</url>
      </repository>
      <repository>
        <id>jgit-repository</id>
        <url>https://repo.eclipse.org/content/groups/releases/</url>
      </repository>
    </repositories>
    <pluginRepositories>
      <pluginRepository>
        <id>repo.jenkins-ci.org</id>
        <url>https://repo.jenkins-ci.org/releases/</url>
      </pluginRepository>
    </pluginRepositories>
  </profile>
</profiles>
When I build Jenkins core and Jenkins plugins, Maven downloads the Jenkins release artifacts (like jenkins.war) from the releases repository and it downloads artifacts that are not in the Jenkins release artifacts repository from the jcenter-cache repository (for jcrypt 1.0.0), the Eclipse JGit repository (for JGit), or Maven central (the fallback). If we moved jcrypt 1.0.0 from jcenter-cache to the Jenkins releases repository, we could remove it from that list. As far as I can tell, it is the only component pulled from jcenter-cache by Jenkins core or the 20+ plugins that I tested.
The changes that I was testing are broader than what was proposed by @jtnord, because they limit my access to only the releases repository, not to the other 10+ repositories that are included in the Jenkins public virtual repository.
This needs further discussion after I return from vacation, but it looks quite promising that we might be able to make significant and immediate progress with a single change.
This won't block access to other cached repositories in the Jenkins public virtual repository, but it would block access to the most heavily used cache, Maven central.
Following the recommendations of @jtnord and @daniel-beck, we have to plan a 1 hour "brownout" whose goal would be to test the removal of the Maven Central mirrored repository from repo.jenkins-ci.org/public (instead of adding password protection).
The following steps would be needed:
Before the brownout
central from the mirrorOf directive to let Maven directly download artefacts from central when not found in repo.jenkins-ci.org/public - https://github.com/jenkins-infra/jenkins-infra/pull/2992
During the brownout
rm -rf <nginx cache dir> on each
no-transfer-progress option in https://github.com/jenkins-infra/pipeline-library/blob/8ed423572b7ade54219dcc6472a120a82c5c1145/vars/infra.groovy#L164 to allow us to diagnose the artefact downloads in build logs
central repository from the public Virtual Repository
End of brownout
no-transfer-progress option
Release of a plugin
Manually with MRP, or CD-enabled via GHA, or both?
we will only check the prep stage
I think you need a full-test since PCT runs will be running Maven with the tested plugin’s versioned POM, which may behave differently BOM builds.
=> I've updated my comment above. Thanks!
Outside the brownout plan, here are my notes from yesterday's exchange with @jtnord @MarkEWaite @daniel-beck @lemeurherve @smerle33:
Reminder by @jtnord that putting a repository list in pom (or parent pom) is discouraged: https://blog.sonatype.com/2009/02/why-putting-repositories-in-your-poms-is-a-bad-idea/
If the brownout shows that builds are not failing and are downloading central artefacts, we'll want to persist this setup.
One of the core issues left is that most of the mirrored repositories should be in central (so downloads would be done through central instead of Jenkins Artifactory)
jgit is an ideal candidate as it seems to have been included in central: we should be able to plan to remove it from our mirrors (ref. https://search.maven.org/artifact/org.eclipse.jgit/org.eclipse.jgit-parent/6.6.0.202305301015-r/pom)
javanet2 IIUC) which upstream is gone. Infra team want to ask JFrog for help in converting these repositories into hosted repositories.
Another potential question: we could start using ACP in the GitHub Actions builds (CD process)
Another long term question raised by @jtnord: publish everything to central?
we could start using ACP in the GitHub Actions builds (CD process)
We could, though these should not account for that much load to begin with. Is it worth the bother?
publish everything to central?
FTR OSSRH-93431
You did not do anything yet, did you? https://github.com/jenkinsci/bom/pull/2297/checks?check_run_id=15440825146
Could not transfer artifact com.github.jnr:jnr-x86asm:jar:1.0.2 from/to do-proxy (https://repo.do.jenkins.io/public/): status code: 403, reason phrase: Forbidden (403)
No changes have been applied to artifactory or to the proxies related to the bandwidth reduction project as far as I know. I may have missed a change to the artifact caching proxy that is preparatory for next week's brownout.
@lemeurherve would know for sure
Seems to have been a transient issue.
The job returns HTTP 404 in Jenkins because the PR was merged, so I can't analyse it.
But this message is unexpected: repo.do.jenkins.io is the ACP instance running inside DigitalOcean Cloud. BOM builds are expected to run in AWS, which makes me worry a bit.
Do you remember if it was the prep stage which failed?
Yes, just prep, since this PR was not full-test.
Announcement done. Next step: update settings.xml and prepare the ACP cache cleanup script
Update: thanks @lemeurherve for https://github.com/jenkins-infra/kubernetes-management/pull/4217 (which is needed to test https://infra.ci.jenkins.io/job/reports/job/backend-extension-indexer/job/master/55/console)
First brownout finished; we'll share feedback here
Brownout proceeded as outlined in the earlier comment https://github.com/jenkins-infra/helpdesk/issues/3599#issuecomment-1653881431.
repo-1 caches were removed from the public artifact repository on repo.jenkins-ci.org.
The --no-transfer-progress argument was removed from the buildPlugin script so that we could see download sources and progress.
The plugin bill of materials build failed when it was interrupted by a new commit. The interruption was unrelated to the brownout. There was an unexpected error message that needs more investigation. The message was:
Unexpected messages were also seen in local builds that match the messages in the bom build. Needs more investigation.
Local release of pollscm plugin with mvn release:prepare release:perform failed unexpectedly for Mark Waite. The hpi file was delivered to repo.jenkins-ci.org but the pom file was not delivered there. We're not sure if the failure is due to a local configuration problem on Mark Waite's computer or some other issue. The build message was:
Can https://github.com/jenkins-infra/pipeline-library/commit/67463678e9ba6375db0d442890a9a3df9190affe be reverted? It is making build logs quite verbose.
Done by https://github.com/jenkins-infra/pipeline-library/commit/1c849640d0fe9f8ae5095d1b08e8360efee8815d + cleared up the library cache on ci.jenkins.io.
The backend extension indexer job appears to have stopped working about the same time we performed the brownout. We'll need to investigate further to understand the root cause of the failure.
Update: as part of https://github.com/jenkins-infra/helpdesk/issues/3737 which reports the dreaded error
Unresolveable build extension: Plugin org.jenkins-ci.tools:maven-hpi-plugin:XXX or one of its dependencies could not be resolved: The following artifacts could not be resolved: org.jenkins-ci.tools:maven-hpi-plugin:jar:XXX (absent): Could not find artifact org.jenkins-ci.tools:maven-hpi-plugin:jar:XXX in <some artifact caching proxy ID> (<some artifact caching proxy URL>) -> [Help 2]
when building on ci.jenkins.io. But:
settings.xml (contributor laptop for instance) then it works
pluginRepository pointing to repo.jenkins-ci.org/public in the failing project's pom.xml then it works
It looks like using a settings.xml (like ci.jenkins.io agents) with a mirrorOf directive (whose purpose is to catch all download requests and send them to the caching proxy) does not behave as expected with Maven plugin resolution.
But this file can be tuned to auto-enable a custom profile which defines the proper plugin repository resolution system (in the context of ci.jenkins.io). I was successful in a local attempt with the following block added to my settings.xml (without it, I reproduce the error):
<profiles>
  <profile>
    <id>jenkins-infra-plugin-repositories</id>
    <pluginRepositories>
      <pluginRepository>
        <id>repo.jenkins-ci.org</id>
        <url>https://repo.jenkins-ci.org/public/</url>
      </pluginRepository>
      <pluginRepository>
        <snapshots>
          <enabled>false</enabled>
        </snapshots>
        <id>incrementals</id>
        <url>https://repo.jenkins-ci.org/incrementals/</url>
      </pluginRepository>
      <pluginRepository>
        <id>central</id>
        <url>https://repo.maven.apache.org/maven2</url>
      </pluginRepository>
    </pluginRepositories>
  </profile>
</profiles>
<activeProfiles>
  <activeProfile>jenkins-infra-plugin-repositories</activeProfile>
</activeProfiles>
My proposal is to add this code block (only for pluginRepository, as we have not seen any other error) to the existing settings.xml file on ci.jenkins.io.
WDYT @MarkEWaite @jnord @basil @daniel-beck ? Does it make sense and is it a valid assessment considering we want to remove the "Maven Central" mirrors from repo.jenkins-ci.org?
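For context, the catch-all mirror described above could look like the following sketch. The mirror id and URL are borrowed from a proxy log line quoted elsewhere in this thread, and the mirrorOf pattern that lets central bypass the proxy is an assumption; the actual settings.xml on ci.jenkins.io may differ.

```xml
<!-- Hypothetical sketch, not the real ci.jenkins.io configuration. -->
<settings>
  <mirrors>
    <mirror>
      <!-- Assumed id and URL of one artifact caching proxy (ACP) instance. -->
      <id>azure-proxy</id>
      <url>https://repo.azure.jenkins.io/public/</url>
      <!-- "*" routes every repository id through the proxy;
           "!central" exempts the id "central" so Maven contacts it directly. -->
      <mirrorOf>*,!central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With a pattern like this, any repository or pluginRepository whose id is not exempted is served through the proxy URL, whatever URL it declares itself.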
I've seen that error a few times and chased it away by disabling the artifact caching proxy. I suspect the artifact caching proxy is still caching some results from the last "experiment" and this problem would go away if the cache was cleared.
That was my initial thought, but I tried without cache and still had the error. However, cleaning up the cache is on the item list for the upcoming brownout anyway!
The cache cleanup removes the following error though: [INFO] Artifact <some dependency path> is present in the local repository, but cached from a remote repository ID that is unavailable in current build context, verifying that is downloadable from [<list of repositories which order changes between different builds>]
Ah yes, that is the error I was thinking of, not the "Unresolveable build extension" one.
Profile seems like a pragmatic solution.
Curious why this doesn't work though. Are plugin repos special wrt mirrors? Are we not mirroring plugins' special meta-metadata.xml file?
Thanks for the quick reply folks!
Are we not mirroring plugins' special meta-metadata.xml file?
I did not know about this file. The ACP configuration does not cache the maven-metadata.xml and directly serves it from Artifactory (https://github.com/jenkins-infra/helm-charts/blob/a38df338a0801899d7f54b78a62793042c5dc62d/charts/artifact-caching-proxy/templates/nginx-proxy-configmap.yaml#L57-L60) but for meta-metadata.xml I believe it should be present. Worth digging into.
Curious why this doesn't work though. Are plugin repos special wrt mirrors?
Same here: I don't understand the root cause and it is a bit frustrating. I have a gut feeling it is related to the mirrorOf directive
but for meta-metadata.xml I believe it should be present. Worth digging into.
Oops, typo. I meant maven-metadata.xml, but it's on the group level, not (just) the artifact level: https://repo.jenkins-ci.org/releases/org/jenkins-ci/tools/
Perhaps that makes a difference?
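For readers unfamiliar with group-level metadata: Maven uses a maven-metadata.xml at the groupId level to map plugin prefixes to artifact ids. A minimal sketch of such a file follows, with the name and prefix values given as illustrative assumptions rather than copied from repo.jenkins-ci.org.

```xml
<!-- Illustrative group-level maven-metadata.xml; values are assumptions. -->
<metadata>
  <plugins>
    <plugin>
      <!-- Human-readable plugin name (assumed). -->
      <name>Maven HPI Plugin</name>
      <!-- Prefix used on the command line, as in "mvn hpi:run" (assumed). -->
      <prefix>hpi</prefix>
      <artifactId>maven-hpi-plugin</artifactId>
    </plugin>
  </plugins>
</metadata>
```

Whether this file plays any role in the "Unresolveable build extension" error is not established in this thread.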
In relation with the bandwidth issue, here is the "next steps" proposal:
Update:
settings.xml file deployed to ci.jenkins.io and infra.ci.jenkins.io.
Update: proposed timeline for the next brownout https://github.com/jenkins-infra/status/pull/370
Brownout started:
*repo1* mirrors from repo.jenkins-ci.org/public
Next steps: testing builds
Brownout is finished: closed status.jenkins.io and sent an email on the mailing list.
TL;DR: results are really good. We only have one last build issue in https://github.com/jenkinsci/maven-hpi-plugin/pull/529#issuecomment-1705463280 but it does not block the plan as it is only a matter of integration tests and settings.xml
(edit) Detailed report of what was done and tested during the brownout:
Artifactory: Removed repo1 and maven-repo1 mirror repositories from the public virtual repository
"ACP" (Artifact Caching Proxy) inside the Jenkins infrastructure: cleaned up the cache for each replica
eks-public in AWS EKS, doks-public in Digital Ocean and publick8s in AKS)
rm -rf /data/nginx-cache/* then delete the pod to force the nginx process to be restarted
Tested the following jobs on ci.jenkins.io:
jenkins-infra-test-plugin)
nexus-platform-plugin)
[WARNING] Error resolving project artifact: The following artifacts could not be resolved: com.sun:tools:pom:1.8.0 (absent): Could not find artifact com.sun:tools:pom:1.8.0 in azure-proxy (https://repo.azure.jenkins.io/public/) for project com.sun:tools:jar:1.8.0
Tested the following jobs on trusted.ci.jenkins.io:
Artifactory: Added back repo1 and maven-repo1 mirror repositories in the public virtual repository
"ACP" (Artifact Caching Proxy) inside the Jenkins infrastructure: cleaned up the cache for each replica
Update: next steps:
repo1 and maven-repo1 as private mirrors
Work on https://github.com/jenkinsci/maven-hpi-plugin/pull/537 is finished and merged thanks to the help of @basil.
It is a fix to allow running integration tests by opting out of ACP for this project (only).
Background work is needed for a clear fix (issue to track this in https://github.com/jenkinsci/maven-hpi-plugin/issues/541).
Next step to close this issue: let's wait for @MarkEWaite's analysis of the logs provided by JFrog and confirm with them that the new bandwidth usage is fine for them.
Done. Log file format changed and we've decided to not spend the effort to adapt to the changed log file format. Thanks to all!
Service(s)
Artifactory
Summary
JFrog has asked us to reduce the outbound bandwidth used by https://repo.jenkins-ci.org . One of the ideas being explored is to make several of the repository mirrors private. We need to test that by announcing and executing a series of time limited tests that temporarily make the repository mirrors private and assess the impact on Jenkins developers.
The proposed sequence of repositories to make private include:
The repositories in the list include a mix of large and small repositories. Some are known to be used for Jenkins development; for others, it is less clear whether they are used for Jenkins development.
Implementation plan
Announce the series of functionality reduction tests with each lasting for a relatively brief period (1 hour). Announce in
During the functionality reduction tests, we will specifically assess impact on