ansible-community / community-topics

[Moved to Ansible Forum] Discussions for Ansible Community Meetings
https://docs.ansible.com/ansible/devel/community/steering/community_steering_committee.html#community-topics-triage
GNU General Public License v3.0
35 stars 9 forks source link

How to mark private plugins in a collection (with a proposal and implementation) #154

Closed felixfontein closed 4 months ago

felixfontein commented 1 year ago

Summary

While working on community.sops EE support (https://github.com/ansible-collections/community.sops/pull/98) I came around the need to have a 'private' filter plugin for a role in a collection. (Collection roles cannot have included plugins, as opposed to standalone roles, they have to be part of the collection itself. And thus are shown both by ansible-doc --list --type filter and show up on the collection's docsite plugin index.) There are various reasons for plugins in a collection to be private, and this topic isn't about discussing these, but about how this could be done :)

My first idea was to have a documentation flag private: true to mark plugins as private, which could be used by ansible-doc --list (it shows short_description, so it looks at the documentation anyway), antsibull-docs, and other tools. Unfortunately this turned out to be more tricky than expected: ansible-doc uses some special code to quickly read short_description from plugins which is mainly working well because every plugin documentation has short_description, and it usually shows up before the plugin's options. A similar trick won't work for private, which would mean that either ansible-doc --list won't care about private plugins (and still show them), or that listing plugins becomes more inefficient (which has a very low chance of being merged IMO).

Another idea is to classify plugins as private by their filename, or more precisely plugin name (the distinction is mainly important for test and filter plugins, where you have multiple plugins per file). The most obvious convention is using a underscore (_) prefix, as this is also the classic Python convention for marking something as private. Unfortunately, this convention is already partially used. Both ansible-doc --list and ansible-test sanity --test validate-modules interpret the leading underscore as "this plugin is deprecated", and validate-modules interprets it the same way as whether the plugin would have been depecated in meta/runtime.yml. All other parts of ansible-core ignore this underscore, including looking at the plugin docs with ansible-doc will show no indication that the module is deprecated (if you don't add the regular deprecation information to the plugin's documentation as well).

So my proposal would be:

What this discussion should be about:

  1. What do you think of the above proposal?
  2. If you don't like the leading underscore part of it, can you think of a better way to do this?
  3. Are you aware of other tools that have to be adjusted accordingly? (Maybe ansible-navigator, AH, ...?)
  4. Are you aware of other tools that already interpret a leading underscore (as deprecated, private, ...)?
felixfontein commented 1 year ago

Implementations of my proposal:

maxamillion commented 1 year ago

+1 for the _ prefix in the filename marking is as private.

briantist commented 1 year ago

Definite +1 on having a way to have private content.

It's a shame about not being able to use private: true, I like that much better. Leading underscore is ugly and I don't really like this property being part of a name.

maxamillion commented 1 year ago

Let me clarify for posterity: +1 for the idea of being able to mark things private no matter the implementation method

Also +1 for the _ prefix as mentioned above. I like it purely because underscores have special meaning in python and I'm used to that. Also, it makes it obvious at first glance when looking at the file listing of a Collection without having to open any files or anything to determine the state of the file. YMMV.

mariolenz commented 1 year ago

I think this is a good idea.

I also think that using _ as a prefix for private modules makes sense. Most collections are implemented in Python, and there a leading _ also indicates something is private (although it's not enforced in Python afaik). So it would feel kind of "natural" to most collection developers.

Andersson007 commented 1 year ago

+1 for the _ prefix in the filename for private stuff

briantist commented 1 year ago

I'll also clarify why I'm not so keen on the underscore _ prefix.

While Ansible is written in Python, and most modules are as well, a user of Ansible does not necessarily, and should not, need to understand Python conventions to use it. The convention has no specific semantic meaning that's obvious in the context of Ansible.

In this way we would be adding meta meaning to a plugin or module's name, where there is no such implicit special meaning in these names generally.

This can be a difficult point for us to grok since we're all pretty familiar with Python and use it so much to write ansible content. We should be careful about equating Ansible and Python, and about assuming Ansible users are Python programmers.


The one exception is our existing convention for module_utils to be private by starting with an underscore. I also don't feel great about that one but it makes more sense because in plugins and in python-based modules, the utils are imported as python packages (it's more awkward in PowerShell module utils but that ship has sailed).

gotmax23 commented 1 year ago

From today's community meeting:

#info There's some agreement that there should be a standardized way to mark content in a collection as private. If we proceed, the next steps are to decide whether to use the _ prefix approach or a metadata based approach, discuss this with core, and get the changes into ansible-core and antsibull-docs to hide private plugins from ansible-doc -l and the docsite.

felixfontein commented 1 year ago

Based on review feedback in https://github.com/ansible/ansible/pull/79218 I created a new PR for ansible-core in https://github.com/ansible/ansible/pull/79370, which allows to mark plugins as private in meta/runtime.yml.

russoz commented 1 year ago

Howdy! I absolutely like the idea of having private plugins, and I like the idea of using _ in the name to indicate that. The one thing that bugs me in that solution is that it creates two different semantics for the same "idiom" within the context of Ansible: in core, it means deprecation, whilst in collections it means private. I don't like these ambiguities. For most of us working with Ansible for a while, that will be a little annoyance that has to be remembered, but for newcomers that could generate confusion and frustration.

That said, I would prefer the metadata way.

felixfontein commented 1 year ago

https://github.com/ansible-community/antsibull-docs/pull/65 implements the metadata solution for antsibull-docs. (Will work also with older ansible-core versions.)

felixfontein commented 1 year ago

The core team decided to not accept that PR, and in general they do not want to allow ansible-doc to not list all plugins by default (see the statement here: https://github.com/ansible/ansible/pull/79370#issuecomment-1404251346).

felixfontein commented 1 year ago

Ok. We cannot get support for this in ansible-doc, but we can get support for this in other projects, like antsibull-docs and antsibull-changelog. So IMO we should define a standard and try to get it implemented in multiple projects (like ansible-lint could warn if you use a private plugin from a collection someplace else).

I want to start two votes on this:

  1. Whether we want to actually standardize this.
  2. What the standard should be.

What do you think about this? I'd especially like to hear about this from @ansible-community/steering-committee.

To make 2 feasible, we need to limit the ways to do this down to a small set of choices (preferably a small number like 2 or 3). Right now we have:

  1. Use a leading underscore (_) in the plugin's name.
  2. Use private: true in meta/runtime.yml's plugin information (next to deprecation, tombstone, and redirects).

Are there any other ideas? Like other names than private?

felixfontein commented 1 year ago

Also please take a look at the internal filter showing up in the community.sops collection: https://docs.ansible.com/ansible/latest/collections/community/sops/#filter-plugins - the idea of this issue is to prevent things like that showing up in the public documentation.

briantist commented 1 year ago

I'll reiterate that I'm against the underscore _ prefix (previous comment), and I like option 2, the metadata approach.

rochacbruno commented 1 year ago

I like the option 2 private: true as it is more explicit and would not conflict with _ already being in use.

felixfontein commented 1 year ago

There is an active vote on this topic - namely which way to use - in #204

CC @ansible-community/steering-committee

felixfontein commented 1 year ago

We did talk more about this (mostly on 'whether we should', and less about 'how to do it') at today's community meeting (https://meetbot.fedoraproject.org/ansible-community/2023-03-01/ansible_community_meeting.2023-03-01-19.01.html).

It was mentioned during the meeting that someone said that this "breaks the premise of auditability". Unfortunately the context of that statement isn't clear to us, but I would counter this with that ansible-doc should have an option which forces listing all plugins/modules, i.e. that the private ones are only not shown by default.

It was also pointed out that other tools, such as galaxy/galaxy_ng, will likely stick to what ansible-doc shows, so their plugin listings will still list these plugins/modules even when the docsite does not. This was countered with the argument that it's better to at least have some tools/docsite which do not list the private plugins, instead of having them listed everywhere.

When discussing how to do this, there were both proponents of filename marking (leading _) and metadata marking (private: true in meta/runtime.yml). Some folks also asked about having it in the plugin/module's DOCUMENTATION (similar to short_description and version_added), but that has potential performance implications (it makes listing of all plugins/modules less efficient). (I added a bit more detail on this in the vote description itself.)

(Please look up the meeting log for more details on what exactly was discussed, if you want to know more.)

maxamillion commented 1 year ago

The more I think about this, I'm +1 on the private: true in meta/runtimes.yml

ssbarnea commented 1 year ago

I am very happy that this was raised as I was aware of the issue since collections were introduced.

Added ticket at https://issues.redhat.com/browse/AAP-9875 as decision on this would have a broader ecosystem impact, as devtools will need to recognize private modules and probably we would also need some minor changes in core, ansible-doc in particular.

acozine commented 1 year ago

I think using the Python convention of marking modules private by starting the filename _ could be a) confusing in Ansible because we've used _ to mean deprecated elsewhere and b) not very welcoming to new users and developers who don't already know the Python convention. I'd definitely prefer marking modules private with private: true in the runtime.yml metadata - it's unambiguous.

felixfontein commented 1 year ago

I'm copying @bcoca's comment (https://github.com/ansible-community/community-topics/discussions/204#discussioncomment-5257335) here since it is more an argument againts this feature (and not specific to the question how to implement it):

-1 to having hidden plugins. This goes against the principle of auditability as it obscures the actions taken in the playbook from the reader.

I'm fine if you label the plugins 'for internal use only' in the documentation, but I don't think this should affect the runtime code in any way, neither hiding them from listing nor disallowing the use outside of the original package.

I understand the desire of some developers to ship things that are 'good enough' for their uses but not for public consumption, but in the interest of auditability and simplicity I strongly oppose such a feature.

felixfontein commented 1 year ago

Regarding the vote #204 about how should private modules/plugins be marked, I'm counting:

a) +1: 2.5 SC votes (1/2 felixfontein, gotmax23, ssbarnea) -1: 1 SC vote (acozine), 1 community vote (bcoca)

b) +1: 8 SC votes (felixfontein briantist mariolenz acozine markuman Andersson007 ssbarnea cidrblock), 4 community votes (maxamillion tremble kristianheljas leogallego) -1: 2 community votes (bcoca s-hertel)

c) +1: no vote -1: 1 SC vote (felixfontein), 1 community vote (bcoca)

d) +1: one community vote (s-hertel: INTERNAL in first line of description) -1: one community vote (bcoca)

Can someone from the steering committee confirm these votes?

Andersson007 commented 1 year ago

i confirm the count by @felixfontein

s-hertel commented 1 year ago

Not that it really matters, but for posterity, d) was the what core ended up recommending instead of https://github.com/ansible/ansible/pull/79370, and as my comment came after bcoca's encompassing -1 he wasn't voting for or against it.

The core team suggested b) instead of the original solution community chose (https://github.com/ansible/ansible/pull/79218) only because we were considering potential runtime changes. Otherwise, the request would have been a new key or something in DOCUMENTATION (we wanted to avoid _). Hijacking the runtime metadata for external purposes... not great.

nitzmahone commented 1 year ago

There's kind of a broader theme that I'm noticing more and more: people want to augment core runtime-owned structures for non-core-runtime things. It's happening with the collections packaging mechanism as a whole, with collection metadata, and several other things. As the de-facto ~person to blame~ creator of both the collection structure and later the collection runtime metadata, I'll freely admit that non-core extensibility was not a design factor for either of those things. Collections themselves were about safely externalizing, packaging, and distributing modules and core plugins, and the runtime collection metadata was mostly about providing an escape hatch to keep things working or provide useful breadcrumbs when things inevitably needed to be moved around. The tools that have grown organically over time around these things also mostly reflect those original design goals.

We're getting more requests around reusing collections to package and distribute content that's "foreign" to ansible-core, as well as things like this, where there's a desire to embed non-runtime metadata in critical runtime structures. Both issues are a double-edged sword for the core team; we don't want to say "this is ours, go build your own", but we have enough trouble maintaining and validating things internally with the proliferation of organically-grown tooling that's reliant on these structures (especially when they're tied to a particular core version, but need to validate content that spans core versions).

I've been working on ways to retcon non-core extensibility into the collection directory structure in a future-friendly way. We don't want a constant stream of new "things to ignore" backports for core sanity tests, but ideally we'd also like to define some conventions to allow the collection loader and ansible-doc to enumerate and document arbitrary "things" in collections that they don't have explicit support for or knowledge of, but that adhere to some conventions TBD.

If b) is the way folks want to go, it sounds like we need something similar for collection runtime metadata- define a simple set of conventions/patterns where core runtime and sanity tooling can correctly distinguish between "this is core runtime data that is likely malformed" vs "this is a thing we can safely ignore". The working idea for the collections directory structure is an ext/ directory at certain levels of the collection directory hierarchy whose contents would be completely ignored by core sanity tooling and runtime (so basically it's total anarchy under there from core's perspective). A couple ideas off the top of my head:

1) meta/ext/non_runtime.yml :laughing: 2) anything under an ext: key at specific (or any?) levels of meta/runtime.yml is ignored by core schemas and sanity tests. I hate to say it, but this one would be a lot easier in XML (since XML namespaces basically give you an infinite number of parallel data planes to smuggle metadata around in hierarchical data)- sadly YAML has no analogous concept, and I'm pretty sure I'd be getting death threats (from my own team, no less) if I propose runtime.xml :wink:

Thoughts?

felixfontein commented 1 year ago

Hmm, having that extra data not in meta/runtime.yml is a good idea to avoid that file growing a lot and making ansible-core slower (which would speak against using meta/runtime.xml where everyone can put their own crap in namespaces). The meta/ext/ idea sounds good, though I'd like to have one standardized file for extra plugin/role/... metadata that has at least minimal linting (down to a certain level), like making sure that only supported plugin types are mentioned in it.

How about having a file meta/extra.yml (or meta/ext.yml, since ext can also stand for extended/extension/...) with a structure like:

plugin_information:
  module:
    foo: {...}
    bar: {...}
  lookup:
    baz: {...}

where every object can have namespaced keys (and that's the part I would like a linter to enforce), like every key having a prefix, followed by some defined symbol like :, and then some arbitrary name. That would allow users to further lint the parts they are interested in. For example community could define that currently only a key community:private is allowed (of type boolean), and a community linter could check that every key starting with community: follows this pattern. The networking team might use the network: prefix to define some things, and ACME Inc. would use the acme: prefix for something specific they want to store.

(Yeah, XML handles this more nicely, since : has another meaning in YAML, but I'm really happy we don't use XML as well :D )

felixfontein commented 1 year ago

@bcoca

-1 to having hidden plugins. This goes against the principle of auditability as it obscures the actions taken in the playbook from the reader.

Either I understand the word auditability differently, or I don't really understand what's the problem here. The documentation of these 'hidden' plugins are still available, it just would require a little bit of knowledge to find them (with the proposed extra ansible-doc switch to list all plugins), and directly calling ansible-doc or using the docsite URL for the plugin also works. So it is very easy to find information about these plugins.

I'm fine if you label the plugins 'for internal use only' in the documentation, but I don't think this should affect the runtime code in any way, neither hiding them from listing nor disallowing the use outside of the original package.

I disagree here, showing them in the default listing is IMO a really bad idea, but it should be easy to still list them (by an extra flag like --show-private-plugins). I don't see how this makes it harder to find out what really goes on.

I also diagree that preventing (or making it harder for) these plugins to be used outside the collection they are defined in is making it harder to audit a collection / some playbook. Especially this makes it easier, since you cannot use anything internal outside the collection. Or do you mean auditability of a collection itself? (In that case I don't understand how this is a problem either, since you are already looking at a collection's source and you can see that these plugins are there.)

bcoca commented 1 year ago

Clearly we have different dictionaries, by definition anything you do to hide part of the data to be audited goes against auditability. Imagine your tax auditor finds a few accounts in your ledger w/o visible description or information and you respond "don't worry, its in another set of books, we hide these accounts since they are internal and should not be readily seen".

We will continue to disagree at any point that this makes core treat the plugins any differently.

mariolenz commented 4 months ago

@felixfontein Please close this issue if done, or open a new forum topic and then close the issue with a pointer to the new discussion: Community-topics: Archiving the repo

felixfontein commented 4 months ago

I've created https://forum.ansible.com/t/how-to-mark-private-plugins-or-modules-or-roles-in-a-collection-how-to-store-extra-non-core-metadata-on-modules-plugins-roles-in-collections/5736 to continue discussing this.