ansible / ansible

add: new option GALAXY_COLLECTIONS_INSTALL_PATH #83943

Open mrjk opened 2 weeks ago

mrjk commented 2 weeks ago
SUMMARY

Add a new option, GALAXY_COLLECTIONS_INSTALL_PATH, that specifies the install path for collections instead of using COLLECTIONS_INSTALL_PATH.

ISSUE TYPE
ADDITIONAL INFORMATION

For example, on a local workstation, while developing collections in ~/prj/ansible_collections, you can safely run ansible-galaxy collection install --force without fear of overwriting your ongoing work in ~/prj/ansible_collections. Example configuration with environment variables:

export ANSIBLE_COLLECTIONS_PATH=~/prj/ansible_collections:~/.ansible/collections:/usr/share/ansible/collections
export ANSIBLE_GALAXY_COLLECTIONS_INSTALL_PATH=~/.ansible/collections

Without this setup, running ansible-galaxy collection install --force would overwrite ~/prj/ansible_collections, which is ... sad if you have not pushed your work.

No output change, but this can be tested this way:

ansible-galaxy collection install --force -r requirements.yml
bcoca commented 2 weeks ago

or use -p?

mrjk commented 2 weeks ago

@bcoca yep, but if you forget it, you may erase your local work.

Maybe I could ask how Ansible collection developers work on a collection that requires internal collections and community collections at the same time? How do you use ansible-galaxy to get a nice development workflow that supports both generic Galaxy dependencies and in-progress development?

On my side, I tried symlinks, but ansible-galaxy crashes if it meets a symlink. Overriding the COLLECTION_PATH environment variable may destroy the local environment. What are the other alternatives?

ansibot commented 2 weeks ago

The test ansible-test sanity --test pep8 failed with 1 error:

lib/ansible/cli/galaxy.py:713:5: E125: continuation line with same indent as next logical line

bcoca commented 2 weeks ago

Even the change you ask for does not completely avoid clobbering/masking collections during development. The best way is to come up with a convention and follow it.

There is no 'ONE' workflow. Personally, I use git to create the workspace for collections, symlink it to ~/.ansible/collections or adjacent to a test play, and only use ansible-galaxy to install 'production' level collections.
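
A minimal sketch of the symlink layout described in this comment, using temporary directories as stand-ins for the git workspace and for ~/.ansible/collections so it can run anywhere. The names my_namespace and my_collection are hypothetical, and this only illustrates the directory layout, not how ansible-galaxy treats the link.

```shell
# Stand-ins so the sketch can run anywhere; in practice these would be
# a git checkout and ~/.ansible/collections.
workspace="$(mktemp -d)"
collections_root="$(mktemp -d)"

# Hypothetical collection my_namespace.my_collection under development.
mkdir -p "$workspace/my_collection"
mkdir -p "$collections_root/ansible_collections/my_namespace"

# Expose the working copy to Ansible through a symlink instead of installing it.
ln -s "$workspace/my_collection" \
  "$collections_root/ansible_collections/my_namespace/my_collection"

# Show the resulting link.
ls -l "$collections_root/ansible_collections/my_namespace"
```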

mrjk commented 2 weeks ago

@bcoca I don't know if this is the right place to discuss that, but this is the whole point of this PR. I have to admit this is not the best patch I've done, but after reading the Ansible source code, it was the quickest way to achieve what I wanted. The real problem is more about development workflow than anything else.

I tried many "conventions" without being very happy. The symlink approach was the first one I tried, but when ansible-galaxy meets a symlink, it crashes miserably (or I could make a PR to fix this bug and see whether my initial workflow can work with symlinks). Still, it seems very hackish to me in the end. Also, I need something simple enough to be easily communicated to my team; a convention is totally acceptable, but not a 10-page convention that must explain all the edge cases and hacks. It should be one or two commands at most, easy to run and reproduce.

To sum up my question: how do you develop local collections without fearing data loss from the ansible-galaxy collection install --force command? How do you manage WIP local collections and upstream collections at the same time?

This is the ultimate question to answer, and maybe there is no easy solution, but I think it is super important because no one talks about it, not even the official Ansible documentation.

Implementation notes:

bcoca commented 2 weeks ago

This PR is equivalent to adding an extra path to the existing config at run time, so it does nothing to prevent clobbering beyond the convention of not adding the path when you don't want to clobber. That is already the case with the existing configuration options.

I'm not sure what the 'symlink' bug or crash is. If you are experiencing this, please open a ticket with a reproducer and full error output.

s-hertel commented 1 week ago

ansible-galaxy only ever installs to the first path in the COLLECTIONS_PATHS config (or -p if it's provided). For example, if I have amazon.aws installed at /bar/ansible_collections/amazon/aws, and COLLECTIONS_PATHS is configured as /foo:/bar, installing amazon.aws with --force will not clobber the existing install. Instead, it will install the new collection at /foo/ansible_collections/amazon/aws, and leave /bar/ansible_collections/amazon/aws alone.
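
The rule described above can be sketched in plain shell. This is only an illustration of the path arithmetic, not ansible-galaxy's actual code; install_target is a hypothetical helper, and it assumes each entry in the list is the parent of an ansible_collections directory, as in the /foo:/bar example.

```shell
# Hypothetical helper: given a colon-separated path list and a
# namespace.name, print where an install would land per the rule above
# (first entry only).
install_target() {
  paths="$1"
  fqcn="$2"
  first="${paths%%:*}"   # keep everything before the first colon
  printf '%s/ansible_collections/%s/%s\n' \
    "$first" "${fqcn%%.*}" "${fqcn#*.}"
}

install_target "/foo:/bar" "amazon.aws"   # prints /foo/ansible_collections/amazon/aws
```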

I don't think adding a new configuration option will solve the element of human error. The new configuration option doesn't add new functionality except skipping missing paths, which imo conflicts with expectation since ansible-galaxy creates the first path in COLLECTIONS_PATHS when it doesn't exist. I imagine that the list command would also need to consider GALAXY_COLLECTIONS_INSTALL_PATH since it may contain paths not in COLLECTIONS_PATHS.

The existing solution is to ensure the first path in COLLECTIONS_PATHS is safe for ansible-galaxy to use, and use -p to intentionally install to any of the normally read-only paths. I have collections installed in all of my COLLECTIONS_PATHS, PYTHONPATH, as well as playbook adjacent collections. I keep stuff in development in playbook-adjacent paths since I don't use ansible-galaxy to manage these at all (but could be managed, using -p).
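
As a sketch of that convention: put a location that is safe to overwrite first, and reach the others only through an explicit -p. The paths are illustrative, and the ansible-galaxy invocations are left as comments so the snippet runs without Ansible installed.

```shell
# A galaxy-managed, safe-to-overwrite location goes first; the
# remaining entries are treated as read-only by convention.
export ANSIBLE_COLLECTIONS_PATH="$HOME/.ansible/collections:/usr/share/ansible/collections"

# Routine installs land in the first entry:
#   ansible-galaxy collection install --force -r requirements.yml
#
# A deliberate install into another location uses -p (illustrative path):
#   ansible-galaxy collection install -p /usr/share/ansible/collections amazon.aws

# Show which entry routine installs would use.
printf 'default install root: %s\n' "${ANSIBLE_COLLECTIONS_PATH%%:*}"
```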

Please follow up on the symlinks issue; I'd fix that and add tests to prevent a regression.

mrjk commented 6 days ago

Thank you @s-hertel, I feel you described the problem better than I did. I also agree with your point, this is probably the best way to handle this problem.

On your first paragraph, you said: "Instead, it will install the new collection at /foo/ansible_collections/amazon/aws, and leave /bar/ansible_collections/amazon/aws alone." This is the issue I hit when tinkering around, and this was the best I could get. But I feel this solution was a bit clunky, and this is why I came up with this PR.

Finally, I also tried to play with PYTHONPATH; it was a promising solution, but I discovered that putting modules in PYTHONPATH gave different results than putting them in COLLECTIONS_PATHS (especially in the file path lookup mechanisms, and maybe other things?). I'm also wondering whether it solves the above problem (about loading order). I tried this solution a few weeks ago; maybe I should try it again, since you mention this could be achieved this way. I'll try it when I have some spare time.

My solution is definitely not the best, but I feel this "workflow" issue should be addressed at some point, or at least mentioned in the Ansible documentation.

I'll make a bug report for the symlink issue I had.

s-hertel commented 5 days ago

> I also agree with your point, this is probably the best way to handle this problem.

I'm not sure we're on the same page. I think adding a third way to configure the install path makes the UX worse, it's not simplifying things. The implementation of the new option also makes it easier to clobber an existing install, because configuring a fresh install path (that doesn't yet exist) will be skipped instead of created.

Maybe adding a --dry-run/--check-mode option instead could help, so the changes that would be made by ansible-galaxy can be validated before carrying through with it.
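
No such option exists in this thread's context; it is only a suggestion. In the meantime, a manual pre-flight check could look roughly like the sketch below. find_installed is a hypothetical helper that lists every place a collection already exists in a colon-separated path list, and it assumes each entry is the parent of an ansible_collections directory.

```shell
# Hypothetical helper: print each existing install of namespace.name
# found under the entries of a colon-separated path list.
# The body runs in a subshell so the IFS and globbing changes do not leak.
find_installed() (
  paths="$1"
  fqcn="$2"
  namespace="${fqcn%%.*}"
  name="${fqcn#*.}"
  IFS=:
  set -f   # do not glob-expand path entries
  for root in $paths; do
    candidate="$root/ansible_collections/$namespace/$name"
    if [ -d "$candidate" ]; then
      printf '%s\n' "$candidate"
    fi
  done
)

# Example: check before running an install with --force.
find_installed "${ANSIBLE_COLLECTIONS_PATH:-}" amazon.aws
```

If the first line printed is under a development directory, that is a sign the first path entry is not safe for ansible-galaxy to manage.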

> This is the issue I hit when tinkering around, and this was the best I could get. But I feel this solution was a bit clunky, and this is why I came up with this PR.

This allows people to pip install the Ansible package and also manage some other collections with ansible-galaxy without breaking their install.

> Finally, I also tried to play with PYTHONPATH, ...

If you pip install the Ansible package, the collections included in the package are installed to the PYTHONPATH. It has less priority than the COLLECTIONS_PATH, and has its own toggle https://docs.ansible.com/ansible/latest/reference_appendices/config.html#collections-scan-sys-path. It doesn't make sense to put collections here unless you plan on managing them with a Python package manager instead of ansible-galaxy.
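
For reference, a sketch of flipping that toggle from the environment. It assumes ANSIBLE_COLLECTIONS_SCAN_SYS_PATH is the environment-variable form of the setting documented at the link above; check the linked page for the exact name in your version.

```shell
# Assumption: this is the environment form of the
# collections_scan_sys_path setting from the linked documentation.
# Setting it to false stops Ansible from looking for collections
# on the Python path.
export ANSIBLE_COLLECTIONS_SCAN_SYS_PATH=false

# With Ansible installed, the effective value can be inspected with:
#   ansible-config dump | grep COLLECTIONS_SCAN_SYS_PATH
```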