jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

Remove `jcenter` and `oss.sonatype.org-releases` repositories from `public` virtual repository; reconfigure Atlassian remote repositories #3842

Closed MarkEWaite closed 11 months ago

MarkEWaite commented 1 year ago

Service(s)

Artifactory

Summary

The data transfer volume from https://repo.jenkins-ci.org is still unexpectedly high. Log file analysis for a recent 5-day period shows that the top repositories, ranked by the sum of the sizes of the requested files, are:

| Repository | Sum of GB requested |
| --- | ---: |
| releases | 2380 |
| jcenter-cache | 1787 |
| incrementals | 298 |
| jcenter | 98 |
| nodejs-dist-cache | 78 |
| public | 26 |
| jgit-cache | 14 |
| oss.sonatype.org-releases-cache | 8 |
| snapshots | 8 |
| npm-dist-cache | 5 |
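
To put the figures above in proportion, here is a small sketch that computes how much of the listed volume comes from the two jcenter repositories. The numbers are copied from the list; the percentage is relative to the sum of these ten repositories only, not to all traffic, and the helper name `share_of_total` is made up for illustration.

```python
# Sum of GB requested per repository, copied from the list above.
requested_gb = {
    "releases": 2380,
    "jcenter-cache": 1787,
    "incrementals": 298,
    "jcenter": 98,
    "nodejs-dist-cache": 78,
    "public": 26,
    "jgit-cache": 14,
    "oss.sonatype.org-releases-cache": 8,
    "snapshots": 8,
    "npm-dist-cache": 5,
}


def share_of_total(repos, totals=requested_gb):
    """Return the percentage of the listed total that was requested from `repos`."""
    total = sum(totals.values())
    return 100.0 * sum(totals[name] for name in repos) / total


top10_total_gb = sum(requested_gb.values())
jcenter_share = share_of_total(["jcenter", "jcenter-cache"])

print(f"top-10 total: {top10_total_gb} GB")
print(f"jcenter + jcenter-cache: {jcenter_share:.1f}% of the top-10 total")
```

By this rough measure, the two jcenter repositories account for about two fifths of the volume in the list.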

JFrog has explained that even though we removed Apache Maven Central from the public repository, the contents of Apache Maven Central still exist as copies in the jcenter repository. We also need to remove the jcenter repository and the jcenter-cache repository from the definition of the public repository so that binaries from Apache Maven Central will be resolved from Apache Maven Central instead of being resolved from our repository.
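
As a rough illustration of that change, the sketch below drops the two jcenter entries from a virtual repository's member list while keeping the order of the remaining members. The member list is hypothetical (the real definition of `public` lives in the Artifactory configuration), and the JSON shape with a `repositories` field is an assumption about the repository configuration format, not something confirmed in this issue.

```python
import json


def without_members(members, to_remove):
    """Return `members` in their original order, minus any name in `to_remove`."""
    removed = set(to_remove)
    return [name for name in members if name not in removed]


# Hypothetical member list for the `public` virtual repository.
current_members = ["releases", "snapshots", "jcenter", "jcenter-cache", "incrementals"]

new_members = without_members(current_members, ["jcenter", "jcenter-cache"])

# Assumed payload shape: a partial virtual repository configuration whose
# `repositories` field lists the repositories that `public` aggregates.
payload = json.dumps({"key": "public", "rclass": "virtual", "repositories": new_members})
print(payload)
```

Preserving the order of the remaining members is deliberate, on the assumption that the order of a virtual repository's members affects where an artifact is resolved from.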

When I tried some initial experiments removing jcenter several months ago, there were cases where it appeared that we depend on artifacts that are only in jcenter, and possibly only in our cache of jcenter. Unfortunately, my memory is weak and I cannot find any notes that reference that experiment.

I think that we should take the following steps:

Reproduction steps

Steps that I took to analyze the requests:

  1. Request Artifactory log files from JFrog
  2. Update my artfiactory-sql fork to handle the new format of Artifactory log files
  3. Upload the log files to a sqlite3 database
  4. Identify the top repositories with the SQL query `select repo,sum(size_bytes)/1024/1024/1024 as SUM_GB from logs group by repo order by SUM_GB desc limit 10;`
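
The last two steps can be sketched with Python's built-in `sqlite3` module. This is a minimal example, not the actual tooling: the table name `logs` and the columns `repo` and `size_bytes` come from the query above, while the in-memory database and the sample rows are invented for illustration.

```python
import sqlite3

GIB = 1024 ** 3

# Invented sample rows: (repository, size of the requested file in bytes).
sample_rows = [
    ("releases", 3 * GIB),
    ("releases", 2 * GIB),
    ("jcenter-cache", 4 * GIB),
    ("incrementals", 1 * GIB),
    ("incrementals", GIB // 2),
]

connection = sqlite3.connect(":memory:")
connection.execute("create table logs (repo text, size_bytes integer)")
connection.executemany("insert into logs (repo, size_bytes) values (?, ?)", sample_rows)

# Same query as in step 4. With integer columns SQLite performs integer
# division, so each total is truncated to whole GB.
query = (
    "select repo, sum(size_bytes)/1024/1024/1024 as SUM_GB "
    "from logs group by repo order by SUM_GB desc limit 10;"
)
rows = connection.execute(query).fetchall()
connection.close()

for repo, sum_gb in rows:
    print(repo, sum_gb)
```

Because of the integer division, the 1.5 GB requested from `incrementals` in the sample data is reported as 1. Dividing by `1024.0` instead would keep the fractional part if that precision matters.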

dduportal commented 11 months ago
  • Removed jcenter-cache and atlassian-cache from both the Any Remote and Anything permission schemes so that they cannot be reached anonymously (outbound bandwidth should therefore decrease drastically)

Thanks very much. Should the same change be applied to oss.sonatype.org-releases since it also contains a copy of Apache Maven Central?

The repository oss.sonatype.org-releases is no longer available anonymously: it requires authentication.
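
One way to spot-check this kind of change is to request a repository URL without credentials and look at the HTTP status. The sketch below assumes that a 401 or 403 response means anonymous access is denied; a server configured to hide unauthorized resources might answer 404 instead. Both helper names are hypothetical.

```python
import urllib.error
import urllib.request


def anonymous_status(url, timeout=10.0):
    """Send an unauthenticated HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx and 5xx responses; the status is still useful here.
        return err.code


def requires_authentication(status):
    """Assumption: 401 and 403 indicate that anonymous access is denied."""
    return status in (401, 403)
```

For example, `requires_authentication(anonymous_status("https://repo.jenkins-ci.org/oss.sonatype.org-releases/"))` would be expected to return `True` after the change, assuming the repository is served under that path.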

dduportal commented 11 months ago

Update: