pulp / pulpcore

Pulp 3 pulpcore package https://pypi.org/project/pulpcore/
GNU General Public License v2.0

Sync creates publication but no new repository version #3450

Open fao89 opened 2 years ago

fao89 commented 2 years ago

Author: mgoddard (mgoddard)

Redmine Issue: 9651, https://pulp.plan.io/issues/9651


I have a nightly job that uses Ansible Squeezer modules to synchronise, publish and distribute some repositories. Every few days I hit an error like this:

Found multiple matches for publication ({'repository_version': '/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/

I have verified that this is the case. There is one publication created after the last successful sync, and another created today for the same version.

 pulp rpm publication list --repository-version /pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/
[
  {
    "pulp_href": "/pulp/api/v3/publications/rpm/rpm/21680308-1fc4-4bea-a5fc-1e3c609533f1/",
    "pulp_created": "2021-12-21T02:31:44.243390Z",
    "repository_version": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
    "repository": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
    "metadata_checksum_type": "unknown",
    "package_checksum_type": "unknown",
    "gpgcheck": 0,
    "repo_gpgcheck": 1,
    "sqlite_metadata": true
  },
  {
    "pulp_href": "/pulp/api/v3/publications/rpm/rpm/4a0c6a75-bb86-4b96-bb6c-6d0f08763847/",
    "pulp_created": "2021-12-15T02:31:07.260722Z",
    "repository_version": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
    "repository": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
    "metadata_checksum_type": "unknown",
    "package_checksum_type": "unknown",
    "gpgcheck": 0,
    "repo_gpgcheck": 1,
    "sqlite_metadata": true
  }
]
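
For anyone wanting to catch this before the Squeezer module trips over it, here is a minimal sketch that scans an already-parsed publication listing (such as the JSON above) for repository versions with more than one publication. The function names are made up for illustration. It only relies on the `repository_version`, `pulp_created` and `pulp_href` fields shown in the listing, and it assumes the timestamps always have the microsecond/`Z` form seen there.

```python
from collections import defaultdict
from datetime import datetime


def parse_created(stamp):
    # Assumes the UTC form seen in the listing, e.g. 2021-12-21T02:31:44.243390Z
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def find_duplicate_publications(publications):
    """Return {repository_version: [publications, oldest first]} for every
    repository version that has more than one publication."""
    by_version = defaultdict(list)
    for pub in publications:
        by_version[pub["repository_version"]].append(pub)
    return {
        version: sorted(pubs, key=lambda p: parse_created(p["pulp_created"]))
        for version, pubs in by_version.items()
        if len(pubs) > 1
    }
```

Fed the two records above, this would report versions/4/ with the 2021-12-15 publication first and the 2021-12-21 one second.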

I checked the sync task from today, and it completed successfully. However, it lists the new publication as a created resource, but no new repo version.

  {
    "pulp_href": "/pulp/api/v3/tasks/5e132510-89cb-4224-9966-d1f22d49a4e1/",
    "pulp_created": "2021-12-21T02:30:53.723988Z",
    "state": "completed",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid": "0a4dc729907842aaa5ba9605e418cdd4",
    "started_at": "2021-12-21T02:30:53.801940Z",
    "finished_at": "2021-12-21T02:31:44.897653Z",
    "error": null,
    "worker": "/pulp/api/v3/workers/605f92b7-9b71-4039-a3de-0af017d86651/",
    "parent_task": null,
    "child_tasks": [],
    "task_group": null,
    "progress_reports": [
      {
        "message": "Downloading Metadata Files",
        "code": "sync.downloading.metadata",
        "state": "completed",
        "total": null,
        "done": 8,
        "suffix": null
      },
      {
        "message": "Downloading Artifacts",
        "code": "sync.downloading.artifacts",
        "state": "completed",
        "total": null,
        "done": 297,
        "suffix": null
      },
      {
        "message": "Associating Content",
        "code": "associating.content",
        "state": "completed",
        "total": null,
        "done": 0,
        "suffix": null
      },
      {
        "message": "Parsed Packages",
        "code": "sync.parsing.packages",
        "state": "completed",
        "total": null,
        "done": 299,
        "suffix": null
      },
      {
        "message": "Un-Associating Content",
        "code": "unassociating.content",
        "state": "completed",
        "total": null,
        "done": 0,
        "suffix": null
      }
    ],
    "created_resources": [
      "/pulp/api/v3/publications/rpm/rpm/21680308-1fc4-4bea-a5fc-1e3c609533f1/"
    ],
    "reserved_resources_record": [
      "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
      "shared:/pulp/api/v3/remotes/rpm/rpm/7b6bc03e-787e-4266-ba33-425c4f9e540b/"
    ]
  },

Comparing with another sync task, I see a repository version listed in the created_resources instead.
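
A rough sketch of that comparison, working on a task record that has already been parsed from JSON. The helper name and the href pattern are assumptions based only on the hrefs visible in this issue, not on anything Pulp documents. It splits `created_resources` into repository versions and publications, and pulls out the "Associating Content" / "Un-Associating Content" counters from `progress_reports`.

```python
import re

# Assumed shape of a repository version href, taken from the examples in
# this issue: .../repositories/<plugin>/<type>/<uuid>/versions/<number>/
VERSION_HREF = re.compile(r"/repositories/.+/versions/\d+/$")


def summarize_sync_task(task):
    """Summarise what a sync task created and how much content it touched."""
    created = task.get("created_resources", [])
    counts = {r["code"]: r["done"] for r in task.get("progress_reports", [])}
    return {
        "repository_versions": [h for h in created if VERSION_HREF.search(h)],
        "publications": [h for h in created if "/publications/" in h],
        "associated": counts.get("associating.content"),
        "unassociated": counts.get("unassociating.content"),
    }
```

For the task above this gives one publication, no repository versions, and 0 for both counters, which matches what I see by eye.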

Here is one of the affected repos:

{
  "pulp_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
  "pulp_created": "2021-11-19T13:21:20.971989Z",
  "versions_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/",
  "pulp_labels": {},
  "latest_version_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
  "name": "CentOS Stream 8 - NFV OpenvSwitch",
  "description": null,
  "retain_repo_versions": null,
  "remote": null,
  "autopublish": false,
  "metadata_signing_service": null,
  "retain_package_versions": 0,
  "metadata_checksum_type": null,
  "package_checksum_type": null,
  "gpgcheck": 0,
  "repo_gpgcheck": 0,
  "sqlite_metadata": false
}

And the corresponding remote:

  {
    "pulp_href": "/pulp/api/v3/remotes/rpm/rpm/7b6bc03e-787e-4266-ba33-425c4f9e540b/",
    "pulp_created": "2021-11-19T13:21:41.147452Z",
    "name": "CentOS Stream 8 - NFV OpenvSwitch-remote",
    "url": "http://mirrorlist.centos.org/?release=8-stream&arch=x86_64&repo=nfv-openvswitch-2",
    "ca_cert": null,
    "client_cert": null,
    "tls_validation": true,
    "proxy_url": null,
    "pulp_labels": {},
    "pulp_last_updated": "2021-11-19T13:21:41.147492Z",
    "download_concurrency": null,
    "max_retries": null,
    "policy": "immediate",
    "total_timeout": null,
    "connect_timeout": null,
    "sock_connect_timeout": null,
    "sock_read_timeout": null,
    "headers": null,
    "rate_limit": null,
    "sles_auth_token": null
  },

I'm using policy: immediate and sync_policy: mirror_complete when syncing.

Versions:

    {
      "component": "core",
      "version": "3.16.0"
    },
    {
      "component": "rpm",
      "version": "3.16.1"
    },
    {
      "component": "file",
      "version": "1.10.1"
    },
    {
      "component": "deb",
      "version": "2.16.0"
    },
    {
      "component": "container",
      "version": "2.9.0"
    },
    {
      "component": "certguard",
      "version": "1.5.1"
    }

dralley commented 2 years ago

@markgoddard This isn't necessarily a bug. It could just be that some minor detail about the metadata changed without actually changing the content of the repo. In mirror_complete mode, it'll download the new metadata and create a publication with that, but there won't be a new repository version.

I can't immediately rule out that it's a bug, but this is fairly likely to be the reason why. You'd have to compare the metadata before and after.

I'll keep it open though, because I just had an idea for improving the heuristic. If you change some detail on the remote, for example switching it from on_demand to immediate mode, it'll trigger a full sync just as if the metadata had changed. It needs to do that in order to download the packages, but in doing so it'll probably also trigger a new publication, which it doesn't need to do.

Probably not relevant to your problem but still something we could improve.

markgoddard commented 2 years ago

@dralley thanks for the explanation. I suppose that makes sense, assuming that no metadata is stored in the repository version.

So that leads us to a second issue, which is that squeezer is unable to uniquely identify publications when there is more than one for a given repository version. See https://github.com/pulp/squeezer/pull/85#pullrequestreview-803557737

cc @mdellweg

markgoddard commented 2 years ago

This also means my workaround of simply deleting the new publication is sub-optimal. Should I instead point existing distributions to the new publication before deleting the old one?
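
To make the ordering concrete, here is a sketch of how that alternative could be planned from parsed JSON records. It only computes a list of steps and does not call Pulp. The function name and the tuple format are invented for illustration, and it assumes each distribution record carries a `publication` href field. Whether repointing is actually the recommended approach is still the open question.

```python
from datetime import datetime


def plan_publication_cleanup(publications, distributions):
    """Given all publications of ONE repository version, plan to repoint
    distributions to the newest publication, then delete the older ones."""
    ordered = sorted(
        publications,
        # Assumes timestamps like 2021-12-21T02:31:44.243390Z
        key=lambda p: datetime.strptime(p["pulp_created"], "%Y-%m-%dT%H:%M:%S.%fZ"),
    )
    if len(ordered) < 2:
        return []  # nothing to clean up
    newest = ordered[-1]["pulp_href"]
    stale = [p["pulp_href"] for p in ordered[:-1]]
    plan = []
    # Repoint first, so no distribution references a publication being deleted.
    for dist in distributions:
        if dist.get("publication") in stale:
            plan.append(("repoint", dist["pulp_href"], newest))
    plan.extend(("delete", href) for href in stale)
    return plan
```

Each step would then need to be carried out with whatever client is in use (Squeezer, pulp-cli, or the REST API directly).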

dralley commented 1 year ago

Marking as triage-needed to take another look

ipanova commented 1 year ago

https://discourse.pulpproject.org/t/design-question-why-do-publications-not-have-a-unique-name/300/5