briantist / galactory

An Ansible Galaxy proxy for Artifactory
GNU General Public License v3.0
33 stars 7 forks source link

theforeman.foreman collection could not be installed when using Galactory as a proxy. #131

Open mshonichev opened 9 months ago

mshonichev commented 9 months ago

Hi there again!

We've found another fancy issue in our setup of Galactory as a proxy / inner source for Ansible collections.

Fast forward to root cause: storing collection metadata in the Artifactory file properties was a nice idea, however for some collections, specifically theforeman.foreman, metadata is such huge in size that it hits the Artifactory property value limit and thus is failed to properly deserialize.

Reproducer:

Starting galaxy collection install process Process install dependency map Starting collection install process Downloading http://galaxy.local/download/cisco-nxos-5.2.1.tar.gz to /Users/m.shonichev/.ansible/tmp/ansible-local-54036f2c1i_ng/tmpi7sor7ba/cisco-nxos-5.2.1-vjgrss63 Installing 'cisco.nxos:5.2.1' to '/home/m.shonichev/.collections/ansible_collections/cisco/nxos' cisco.nxos:5.2.1 was installed successfully


and some collections don't:

```shell
$ ansible-galaxy collection install theforeman.foreman:==3.15.0
Starting galaxy collection install process
Process install dependency map
ERROR! Error when getting available collection versions for theforeman.foreman from cmd_arg (http://galaxy.local/api) (HTTP Code: 500, Message: INTERNAL SERVER ERROR Code: Unknown)

container logs reveal stack trace:

INFO:galactory:{'X-Request-Id': '7dbd5b2cc0c40aeda98dd1f83fad97b9', 'X-Real-Ip': '10.0.0.2', 'X-Forwarded-For': '10.0.0.2', 'X-Forwarded-Host': 'galaxy.local', 'X-Forwarded-Port': '8080', 'X-Forwarded-Proto': 'http', 'X-Forwarded-Scheme': 'http', 'X-Scheme': 'http', 'Accept-Encoding': 'identity', 'User-Agent': 'ansible-galaxy/2.13.6 (Linux; python:3.10.13)', 'Accept': 'application/json, */*'}
Mon, Dec 11 2023 1:59:33 pm
INFO:galactory:Cache miss: http://galaxy.local/api/v3/collections/theforeman/foreman/
Mon, Dec 11 2023 1:59:33 pm
ERROR:galactory:Exception on /api/v3/collections/theforeman/foreman/ [GET]
Mon, Dec 11 2023 1:59:33 pm
Traceback (most recent call last):
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/flask/app.py", line 2190, in wsgi_app
Mon, Dec 11 2023 1:59:33 pm
    response = self.full_dispatch_request()
Mon, Dec 11 2023 1:59:33 pm
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/flask/app.py", line 1486, in full_dispatch_request
Mon, Dec 11 2023 1:59:33 pm
    rv = self.handle_user_exception(e)
Mon, Dec 11 2023 1:59:33 pm
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/flask/app.py", line 1484, in full_dispatch_request
Mon, Dec 11 2023 1:59:33 pm
    rv = self.dispatch_request()
Mon, Dec 11 2023 1:59:33 pm
         ^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/flask/app.py", line 1469, in dispatch_request
Mon, Dec 11 2023 1:59:33 pm
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
Mon, Dec 11 2023 1:59:33 pm
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/galactory/api/v3/collections.py", line 103, in collection
Mon, Dec 11 2023 1:59:33 pm
    colcol = CollectionCollection.from_collections(discover_collections(repo=repository, namespace=namespace, name=collection))
Mon, Dec 11 2023 1:59:33 pm
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/galactory/models.py", line 185, in from_collections
Mon, Dec 11 2023 1:59:33 pm
    for collection in collections:
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/galactory/utilities.py", line 120, in discover_collections
Mon, Dec 11 2023 1:59:33 pm
    coldata = CollectionData.from_artifactory_path(path=p, properties=props, stat=info)
Mon, Dec 11 2023 1:59:33 pm
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/venv/lib/python3.11/site-packages/galactory/models.py", line 34, in from_artifactory_path
Mon, Dec 11 2023 1:59:33 pm
    collection_info = json.loads(properties['collection_info'][0])
Mon, Dec 11 2023 1:59:33 pm
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
Mon, Dec 11 2023 1:59:33 pm
    return _default_decoder.decode(s)
Mon, Dec 11 2023 1:59:33 pm
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
Mon, Dec 11 2023 1:59:33 pm
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
Mon, Dec 11 2023 1:59:33 pm
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
Mon, Dec 11 2023 1:59:33 pm
    obj, end = self.scan_once(s, idx)
Mon, Dec 11 2023 1:59:33 pm
               ^^^^^^^^^^^^^^^^^^^^^^
Mon, Dec 11 2023 1:59:33 pm
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 2398 (char 2397)

The upstream collection was successfully downloaded from https://galaxy.ansible.com and stored in the Artifactory repository.

However, following are the properties of file (actually a tail of):

Screenshot 2023-12-11 at 14 56 46

Apparently, The Foreman team decides to write down all the seven tribes of relatives (cats included!) as a collection authors list and while Galactory tries to put collection_info into file properties, an 2400 chars limit was hit and JSON was corrupted.

Collection meta: https://github.com/theforeman/foreman-ansible-modules/blob/a69d83fb8681ca80878d8143d27594eb72ee295d/galaxy.yml#L4

From the Artifactory docs:

Property keys are limited up to 255 characters and property values are limited up to 2,400 characters. Using properties with values over this limit might cause backend issues.

I'm not sure that limit can be raised up easily, at least documentation is obscure about. Also I'm not sure if removing metadata from properties would not break the Galactory codebase to pieces.

So, currently we had to add upstream galaxy as a backup server as a workaround:

[ansible]
collections_path = .collections

[galaxy]
server_list = galactory, upstream
ignore_certs=true

[galaxy_server.galactory]
url=http://galaxy.local:8080

[galaxy_server.upstream]
url=https://galaxy.ansible.com
briantist commented 9 months ago

@mshonichev thank you again for such a detailed report, I really appreciate it.

For my own reference, here's where the 2400 character limit is described:


I'm glad the second upstream in ansible.cfg has worked so far.

I'll think on how to address this issue, perhaps with short and long-term options.

Probably the first thing I'm going to change is separating out parts of the collection info into separate properties; at least, the pieces that glactory itself needs. It's not a full solution necessarily in that a single property could still end up > 2400 chars, but that would be an even more extreme edge case. This is a breaking change, but it has an advantage of making native Artifactory searches more useful by adding more individual properties that could be used in searches.

A safe but messy approach is splitting collection info into multiple properties by length, essentially paginating it within the artifactory properties. This would be backward compatible with existing uploaded collections, though it wouldn't fix existing broken collections, which should be ok since they can be deleted and re-uploaded or re-proxied. Feels ugly though...

mshonichev commented 8 months ago

I did not realise that Galactory uses Artifactory as a search engine aside to storage backend. That's actually pretty cool. But yeah, if all we need is a lookup by name / version, storing whole blob in props is a bit awkward.

IMO, splitting props to pages is cringe. It adds a whole lot of cases when meta changes - e.g. authors added/removed.

Here, suppose we need to find a collection by its author. Let's break down backend storage in two layers, first is meta storage layer, second is actual collection tar.gz storage layer.

A first layer might be smth like

   _props/
      by_author/
        j.doe.json
        b.nigh.json
        ....

each individual <author.json> has props set to author name and email, so we can lookup by jf rt s query. And contents of json is a list of collection references, e.g.

 [
   {
    'theforeman.foreman': {'3.15.0': '/actual/path/to/collection/3.15.tar.gz'}
   }
]

So, to get the actual archive out of search results you'll need a second query.

It would too hit hard on corner cases to support such a structure, but at least it could be introduced without b/w compatibility break.

Remember, that 'broken' collection with bloated meta was never actually uploaded, so 'broken' props need no fix.

We just start up a whole new life aside, adding a few new search capabilities, and all previously uploaded files need no rehaul.