ansible / ansible-builder

An Ansible execution environment builder

community.vmware fails to install #613

Closed antwacky closed 11 months ago

antwacky commented 11 months ago

I'm getting the below error when adding community.vmware to requirements.yml:

Running command:
  docker build -f context/Dockerfile -t blah.com/awx/awx-ee:0.6.5 context
...showing last 20 lines of output...
------
 > [galaxy 5/5] RUN ANSIBLE_GALAXY_DISABLE_GPG_VERIFY=1 ansible-galaxy collection install  -r requirements.yml --collections-path "/usr/share/ansible/collections":
0.855 Starting galaxy collection install process
0.855 Process install dependency map
2.412 [WARNING]: Skipping Galaxy server https://galaxy.ansible.com/api/. Got an
2.412 unexpected error when getting available versions of collection
2.412 community.vmware: '/api/v3/plugin/ansible/content/published/collections/index/c
2.412 ommunity/vmware/versions/'
2.412 to see the full traceback, use -vvv
2.412 ERROR! Unexpected Exception, this is probably a bug: '/api/v3/plugin/ansible/content/published/collections/index/community/vmware/versions/'
------
Dockerfile:36
--------------------
  34 |
  35 |     RUN ansible-galaxy role install $ANSIBLE_GALAXY_CLI_ROLE_OPTS -r requirements.yml --roles-path "/usr/share/ansible/roles"
  36 | >>> RUN ANSIBLE_GALAXY_DISABLE_GPG_VERIFY=1 ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path "/usr/share/ansible/collections"
  37 |
  38 |     # Builder build stage
--------------------
ERROR: failed to solve: process "/bin/sh -c ANSIBLE_GALAXY_DISABLE_GPG_VERIFY=1 ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path \"/usr/share/ansible/collections\"" did not complete successfully: exit code: 250

An error occurred (rc=1), see output line(s) above for details.

Running ansible-galaxy collection install community.vmware manually from the command line completes successfully.

I've tried different versions of ansible-core and ansible-builder. Combinations tried:

ansible-core==2.14.0, 2.13.3, 2.15.0 with ansible-builder==3.0.0, 1.2.0

What am I doing wrong?

Akasurde commented 11 months ago

@antwacky Thanks for reporting this issue. Can you please provide the EE file used?

kurokobo commented 11 months ago

ansible-core==2.14.0, 2.13.3, 2.15.0

Could you try the latest release of 2.13.x, 2.14.x, or 2.15.x?

Since the move to the next-generation Ansible Galaxy, there have been many reports of problems installing collections, both in the ansible/awx repositories and on the forums.

Many seem to be resolved by using a newer ansible-core.

antwacky commented 11 months ago

Thanks for getting back to me, I've tried the latest ansible-core:

Package                   Version
------------------------- --------
ansible-builder           3.0.0
ansible-core              2.15.4

Which unfortunately fails with the same error. Using ansible-galaxy within the same venv locally installs successfully.

Here's my EE file:

---
version: 1
dependencies:
  galaxy: requirements.yml
  python: requirements.txt
  system: bindep.txt
additional_build_steps:
  append:
    - RUN alternatives --set python /usr/bin/python3
    - COPY --from=quay.io/project-receptor/receptor:1.0.0a2 /usr/bin/receptor /usr/bin/receptor
    - RUN mkdir -p /var/run/receptor
    - COPY --from=hashicorp/terraform:latest /bin/terraform /usr/bin/terraform
    - ADD run.sh /run.sh
    - CMD /run.sh
    - RUN git lfs install

Requirements:

---
collections:
  - name: amazon.aws
  - name: ansible.netcommon
  - name: ansible.posix
  - name: ansible.windows
  - name: awx.awx
  - name: azure.azcollection
  - name: community.aws
  - name: community.docker
  - name: community.general
  - name: google.cloud
  - name: kubernetes.core
  - name: openstack.cloud
  - name: ovirt.ovirt
  - name: redhatinsights.insights
  - name: theforeman.foreman
  - name: community.hashi_vault
  - name: community.vmware

ekarlso commented 11 months ago

I am getting the same kind of error, except that mine fails on community.general:

Starting galaxy collection install process
Process install dependency map
ERROR! Unexpected Exception, this is probably a bug: find_matches() got an unexpected keyword argument 'identifier'                                                                          
the full traceback was:

Traceback (most recent call last):
  File "/usr/local/bin/ansible-galaxy", line 128, in <module>
    exit_code = cli.run()
  File "/usr/local/lib/python3.8/site-packages/ansible/cli/galaxy.py", line 567, in run
    return context.CLIARGS['func']()
  File "/usr/local/lib/python3.8/site-packages/ansible/cli/galaxy.py", line 86, in method_wrapper                                                                                            
    return wrapped_method(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/ansible/cli/galaxy.py", line 1201, in execute_install                                                                                         
    self._execute_install_collection(
  File "/usr/local/lib/python3.8/site-packages/ansible/cli/galaxy.py", line 1228, in _execute_install_collection                                                                             
    install_collections(
  File "/usr/local/lib/python3.8/site-packages/ansible/galaxy/collection/__init__.py", line 513, in install_collections                                                                      
    dependency_map = _resolve_depenency_map(
  File "/usr/local/lib/python3.8/site-packages/ansible/galaxy/collection/__init__.py", line 1327, in _resolve_depenency_map                                                                  
    return collection_dep_resolver.resolve(
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 546, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 397, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 148, in _add_to_criteria                                                                                       
    matches = self._p.find_matches(
TypeError: find_matches() got an unexpected keyword argument 'identifier'
Error: building at STEP "RUN ANSIBLE_GALAXY_DISABLE_GPG_VERIFY=1 ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path "/usr/share/ansible/collections"": while running runtime: exit status 250

ekarlso commented 11 months ago

@kurokobo any idea on this?

kurokobo commented 11 months ago

@antwacky

Thanks for getting back to me, I've tried the latest ansible-core:

Package                   Version
------------------------- --------
ansible-builder           3.0.0
ansible-core              2.15.4

Which unfortunately fails with the same error. Using ansible-galaxy within the same venv locally installs successfully.

Ah, sorry for not being clear. My recommendation is to upgrade Ansible inside the EE, not your local Ansible. During the build, ansible-galaxy is invoked inside the container, so the ansible-core in your local environment is not required and is never used when running the ansible-builder command.

---
version: 1

If you use the version 1 syntax for execution-environment.yml without specifying EE_BASE_IMAGE, quay.io/ansible/ansible-runner:latest is used as the base image. That image is no longer maintained and contains an old Ansible (2.12.5.post0) that does not work with the new Ansible Galaxy.

$ docker run --rm -it quay.io/ansible/ansible-runner:latest ansible --version
ansible [core 2.12.5.post0]
  config file = None
  configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.8/site-packages/ansible
  ansible collection location = /home/runner/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.8.12 (default, Sep 21 2021, 00:10:52) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)]
  jinja version = 2.10.3
  libyaml = True

I can reproduce your issue on my side and can confirm that upgrading Ansible inside the EE solves it.

You can specify the ansible-core version by using the version 3 syntax for execution-environment.yml, so I recommend migrating your execution-environment.yml to version 3.

---
version: 3
images:
  base_image:
    name: quay.io/ansible/ansible-runner:latest
dependencies:
  ansible_core:
    package_pip: ansible-core==2.12.10     # 👈👈👈
...
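
For a definition like the version 1 file posted earlier in this thread, the migration can be sketched in Python. This is a minimal illustration, not ansible-builder's own tooling: `migrate_v1_to_v3` is a hypothetical helper, and the mapping of `prepend`/`append` to `prepend_final`/`append_final` is an assumption that should be checked against the ansible-builder porting guide. Only `version`, `images.base_image.name`, and `dependencies.ansible_core.package_pip` are taken from the example above.

```python
import json


def migrate_v1_to_v3(v1, base_image, ansible_core_pin):
    """Build a version 3 definition dict from a version 1 definition dict.

    Hypothetical helper for illustration only.
    """
    # Carry over the existing galaxy/python/system dependency files.
    deps = dict(v1.get("dependencies", {}))
    # Version 3 lets the definition pin ansible-core inside the image.
    deps["ansible_core"] = {"package_pip": ansible_core_pin}

    v3 = {
        "version": 3,
        "images": {"base_image": {"name": base_image}},
        "dependencies": deps,
    }

    # Assumption: v1 prepend/append steps correspond to the v3
    # prepend_final/append_final keys.
    steps = v1.get("additional_build_steps", {})
    renamed = {}
    for old_key, new_key in (("prepend", "prepend_final"), ("append", "append_final")):
        if old_key in steps:
            renamed[new_key] = steps[old_key]
    if renamed:
        v3["additional_build_steps"] = renamed
    return v3


if __name__ == "__main__":
    v1_definition = {
        "version": 1,
        "dependencies": {
            "galaxy": "requirements.yml",
            "python": "requirements.txt",
            "system": "bindep.txt",
        },
        "additional_build_steps": {"append": ["RUN git lfs install"]},
    }
    v3_definition = migrate_v1_to_v3(
        v1_definition,
        base_image="quay.io/ansible/ansible-runner:latest",
        ansible_core_pin="ansible-core==2.12.10",
    )
    print(json.dumps(v3_definition, indent=2))
```

The result is printed as JSON; a YAML library such as PyYAML could write it out as execution-environment.yml instead.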

If you don't want to migrate your file, a temporary workaround is to add pip install -U ansible-core==2.12.10 before the ansible-galaxy steps in your context/Dockerfile:

# Galaxy build stage
...
RUN pip install -U ansible-core==2.12.10     # 👈👈👈
RUN ansible-galaxy role install $ANSIBLE_GALAXY_CLI_ROLE_OPTS -r requirements.yml --roles-path "/usr/share/ansible/roles"
RUN ANSIBLE_GALAXY_DISABLE_GPG_VERIFY=1 ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path "/usr/share/ansible/collections"
...
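
The manual Dockerfile edit can also be scripted. What follows is a minimal sketch, assuming the build context has already been generated and is then built directly with docker build (as in the command in the first comment). `add_core_upgrade` is a hypothetical helper name, and it only looks for the first RUN line that mentions ansible-galaxy.

```python
def add_core_upgrade(dockerfile_text, pin="ansible-core==2.12.10"):
    """Insert a pip upgrade step before the first ansible-galaxy RUN step.

    Hypothetical helper for illustration only. Returns the text unchanged
    if the upgrade step is already present.
    """
    upgrade = f"RUN pip install -U {pin}"
    if upgrade in dockerfile_text:
        return dockerfile_text

    out = []
    inserted = False
    for line in dockerfile_text.splitlines():
        is_galaxy_step = line.lstrip().startswith("RUN") and "ansible-galaxy" in line
        if is_galaxy_step and not inserted:
            # The upgrade must run before ansible-galaxy talks to the server.
            out.append(upgrade)
            inserted = True
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "# Galaxy build stage\n"
        "RUN ansible-galaxy role install -r requirements.yml\n"
        "RUN ansible-galaxy collection install -r requirements.yml\n"
    )
    print(add_core_upgrade(sample), end="")
```

In practice the input would be the contents of context/Dockerfile, and the returned text would be written back to the same file before running docker build.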

Or specify source: https://old-galaxy.ansible.com for each collection in your requirements.yml:

---
collections:
  ...
  - name: community.general
    source: https://old-galaxy.ansible.com     # 👈👈👈
  ...
  - name: community.vmware
    source: https://old-galaxy.ansible.com     # 👈👈👈
  ...
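
With many collections listed, adding the `source` key by hand is tedious. The sketch below does it as a plain text transform, assuming a requirements.yml that contains only a `collections:` list of `- name:` entries like the one posted in this thread. `add_source` is a hypothetical helper, and it does no real YAML parsing, so anything more complex should go through a YAML library.

```python
import re

OLD_GALAXY = "https://old-galaxy.ansible.com"


def add_source(requirements_text, source=OLD_GALAXY):
    """Add a `source:` key under every `- name:` entry that lacks one.

    Hypothetical helper for illustration only.
    """
    lines = requirements_text.splitlines()
    out = []
    for index, line in enumerate(lines):
        out.append(line)
        match = re.match(r"^(\s*)- name:\s*\S+", line)
        if not match:
            continue
        # Other keys of the same list item sit two columns past the dash.
        key_indent = match.group(1) + "  "
        has_source = False
        for following in lines[index + 1:]:
            if not following.startswith(key_indent):
                break  # reached the next list item or the end of the list
            if following.strip().startswith("source:"):
                has_source = True
        if not has_source:
            out.append(f"{key_indent}source: {source}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "---\n"
        "collections:\n"
        "  - name: community.general\n"
        "  - name: community.vmware\n"
    )
    print(add_source(sample), end="")
```

Entries that already declare a `source` are left alone, so the transform can be run more than once without duplicating keys.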

In any case, this is not a bug in Ansible Builder, so if you have further questions, please ask on the Ansible Community Forum.

@ekarlso The same idea applies to you. Try upgrading Ansible in the EE 😃

antwacky commented 11 months ago

Thanks, I've gotten past that now by changing the EE file to v3 as recommended.

I have another error now:

#29 84.99   × python setup.py egg_info did not run successfully.
#29 84.99   │ exit code: 1
#29 84.99   ╰─> [23 lines of output]
#29 84.99       Traceback (most recent call last):
#29 84.99         File "/tmp/pip-install-b90y06_7/pycurl_8bf5fcd0301f4c5d8688bc709d2605be/setup.py", line 229, in configure_unix
#29 84.99           p = subprocess.Popen((self.curl_config(), '--version'),
#29 84.99         File "/usr/lib64/python3.8/subprocess.py", line 858, in __init__
#29 84.99           self._execute_child(args, executable, preexec_fn, close_fds,
#29 84.99         File "/usr/lib64/python3.8/subprocess.py", line 1704, in _execute_child
#29 84.99           raise child_exception_type(errno_num, err_msg, err_filename)
#29 84.99       FileNotFoundError: [Errno 2] No such file or directory: 'curl-config'
#29 84.99
#29 84.99       During handling of the above exception, another exception occurred:
#29 84.99
#29 84.99       Traceback (most recent call last):
#29 84.99         File "<string>", line 2, in <module>
#29 84.99         File "<pip-setuptools-caller>", line 34, in <module>
#29 84.99         File "/tmp/pip-install-b90y06_7/pycurl_8bf5fcd0301f4c5d8688bc709d2605be/setup.py", line 970, in <module>
#29 84.99           ext = get_extension(sys.argv, split_extension_source=split_extension_source)
#29 84.99         File "/tmp/pip-install-b90y06_7/pycurl_8bf5fcd0301f4c5d8688bc709d2605be/setup.py", line 634, in get_extension
#29 84.99           ext_config = ExtensionConfiguration(argv)
#29 84.99         File "/tmp/pip-install-b90y06_7/pycurl_8bf5fcd0301f4c5d8688bc709d2605be/setup.py", line 93, in __init__
#29 84.99           self.configure()
#29 84.99         File "/tmp/pip-install-b90y06_7/pycurl_8bf5fcd0301f4c5d8688bc709d2605be/setup.py", line 234, in configure_unix
#29 84.99           raise ConfigurationError(msg)
#29 84.99       __main__.ConfigurationError: Could not run curl-config: [Errno 2] No such file or directory: 'curl-config'
#29 84.99       [end of output]
#29 84.99
#29 84.99   note: This error originates from a subprocess, and is likely not a problem with pip.
#29 84.99 error: metadata-generation-failed

Some googling says I need a couple of packages, so I added them to bindep.txt like this:

libcurl-devel [platform:rpm compile]
openssl-devel [platform:rpm compile]

However I still get the same error. Any ideas? Thanks.

antwacky commented 11 months ago

For completeness, my full bindep.txt is:

libcurl-devel [platform:rpm compile]
openssl-devel [platform:rpm compile]
python38-devel [platform:rpm compile]
subversion [platform:rpm]
subversion [platform:dpkg]
git-lfs [platform:rpm]

Akasurde commented 11 months ago

@antwacky Please use a different base image, since quay.io/ansible/ansible-runner:latest is no longer updated. For example:

    #   - quay.io/rockylinux/rockylinux:9
    #   - quay.io/centos/centos:stream9
    #   - registry.fedoraproject.org/fedora:38

Take a look at https://ansible.readthedocs.io/projects/builder/en/latest/definition/#options for more information.

kurokobo commented 11 months ago

@antwacky The EE file for awx-ee, which uses quay.io/centos/centos:stream9 as its base image, may be a good starting point for you: https://github.com/ansible/awx-ee/blob/devel/execution-environment.yml

antwacky commented 11 months ago

Thanks, that seems to be working. I'm now running into another problem: the pip dependency resolver is taking a very long time for each package. I left it running overnight and it seems to have eventually stopped.

Is there a way to speed this up?

#29 1270.9 INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking
#29 1271.0   Downloading kubernetes-12.0.1-py2.py3-none-any.whl (1.7 MB)
#29 1362.6   Downloading kubernetes-12.0.0-py3-none-any.whl (1.7 MB)
#29 1455.8 INFO: pip is looking at multiple versions of google-cloud-storage to determine which version is compatible with other requirements. This could take a while.
#29 1455.8 Collecting google_cloud_storage
#29 1455.8   Downloading google_cloud_storage-2.10.0-py2.py3-none-any.whl (114 kB)

I see the note about adding stricter constraints. Should I add the constraints manually to requirements.txt?

antwacky commented 11 months ago

I've slowly started manually adding the versions that the dependency resolver worked out to requirements.txt, getting further each time. I guess I'll close this, as it's down to pip rather than builder.

Thanks for your help.
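
The manual pinning described above can be partly automated. The sketch below assumes a build log in which pip eventually prints its usual "Successfully installed name-version ..." summary line, and a requirements.txt made of bare package names. `pins_from_pip_log` and `pin_requirements` are hypothetical helpers, and the versions in the sample are illustrative, not a tested combination.

```python
import re


def normalize(name):
    """Compare package names case-insensitively, treating - _ . alike."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pins_from_pip_log(log_text):
    """Collect {name: version} from pip's 'Successfully installed' lines."""
    pins = {}
    for line in log_text.splitlines():
        _, marker, rest = line.partition("Successfully installed ")
        if not marker:
            continue
        for item in rest.split():
            # Each item looks like name-version; names may contain hyphens.
            name, sep, version = item.rpartition("-")
            if sep and version[:1].isdigit():
                pins[name] = version
    return pins


def pin_requirements(requirements_text, pins):
    """Append ==version to bare requirement names that pip resolved."""
    resolved = {normalize(name): version for name, version in pins.items()}
    out = []
    for line in requirements_text.splitlines():
        name = line.strip()
        version = resolved.get(normalize(name))
        # Comments, blank lines, and lines with a specifier are kept as-is.
        if version and re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", name):
            out.append(f"{name}=={version}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample_log = "#29 1500.1 Successfully installed google-cloud-storage-2.10.0 kubernetes-12.0.1\n"
    sample_requirements = "google_cloud_storage\nkubernetes\n"
    print(pin_requirements(sample_requirements, pins_from_pip_log(sample_log)), end="")
```

Once one build completes, feeding its log through these helpers gives later builds exact versions to install, which should leave the resolver far less to backtrack over.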