ansible-community / community-topics

[Moved to Ansible Forum] Discussions for Ansible Community Meetings
https://docs.ansible.com/ansible/devel/community/steering/community_steering_committee.html#community-topics-triage

Collection requirements: expanding the section on best practices #33

Closed tadeboro closed 2 years ago

tadeboro commented 2 years ago

Summary

Right now, the development conventions section in the collection requirements uses general best practices as its base. While those general rules do offer a good baseline, parts of the Ansible ecosystem have developed their own sets of best practices for developing Ansible content.

For example, resource modules have their own set of conventions that contradict the general best practices in certain places.

Maybe we should expand the best practices section and add some specialized subsections for things like network collections?

tadeboro commented 2 years ago

Related discussion: https://github.com/ansible-collections/ansible-inclusion/discussions/27#discussioncomment-1037793

Andersson007 commented 2 years ago

Sounds good to me

tadeboro commented 2 years ago

Summary of the discussion (full log at https://meetbot.fedoraproject.org/ansible-community/2021-07-28/ansible_community_meeting.2021-07-28-18.00.log.html#l-152):

  1. Historically (when all modules lived in ansible/ansible), get, gather, etc. were seen as actions, not as a state, and were thus pushed into separate info and facts modules.
  2. Networking-related content was developed in relative isolation from the rest of the Ansible content and formed its own best practices.
  3. Resource modules combine the action execution (gathering existing state, parsing native representation) and state enforcement (merged, replaced, ...) into one module in order to keep all functionality related to some networking concept in one place.

Open questions:

  1. Are we OK with existing network-related modules continuing to use their own best practices?
  2. Should we encourage new content to adopt concepts behind resource modules or should we advise developers to follow standard state enforcement + info module split?

cidrblock commented 2 years ago

Proposal follows: document the resource module pattern as an alternative with well-defined behavior.

I would also like a number of other people to review this for accuracy and comment before it is finalized.

Overview

Note: The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

Note: The term "module" below refers to either a module or an action plugin.

The resource module pattern is an alternative approach to developing Ansible modules. As of 8/11/2021, this pattern is most commonly used with network and security modules.

What is a resource module

  1. Resource modules are limited to configuration management
  2. A resource module manages the configuration of one or more resources of the same type on the target
  3. Resource modules guarantee idempotent operations
  4. Resource modules guarantee the request and the response will have a common and consistent structure
  5. Resource modules MUST make the minimum number of configuration changes necessary to achieve the desired configuration state

Resource module pros

  1. Ansible users can have confidence the data returned at the completion of the module will always be valid for subsequent invocations of the same module.
  2. Ansible users have the ability to use a single module to retrieve the current configuration of the target as well as configure the target
  3. A module identified as a resource module will have a consistent set of state values (see below)
  4. The behavior of all resource modules will be consistent with regard to state
  5. Ansible resource modules can be used to achieve a desired configuration state
  6. Resource modules encourage the creation and maintenance of off-box sources of truth for configuration information
  7. Resource modules allow for configuration rollback and restoration within the same play or workflow using the before payload of the return value

Resource module cons

  1. Resource modules do not provide operational state information
  2. Resource modules can be more difficult to develop
  3. An underlying REST API on the target may need to be complemented with additional logic within the resource module's code to achieve the resource module pattern
  4. Resource modules do not provide granular deletion of a resource entry's attributes. This would instead be accomplished using replaced
  5. All required functionality and behaviors below MUST be implemented to follow the resource module pattern.
  6. Developers MUST NOT implement only a subset of the required functionality or behavior of a resource module
  7. The data provided to and returned from resource modules can be complex and nested

Resource module states

  1. A resource module MUST implement the merged state, which merges provided configuration in the task with the target system configuration. No target configuration should be removed.

  2. A resource module MUST implement the replaced state, which replaces the configuration of one or more resource entries on the target with the resource configurations provided. This requires a unique identifier for the resource type being configured.

  3. A resource module MUST implement the deleted state, which removes the configuration of all resource entries on the target. Per resource entry, granular attribute deletion is not provided by resource modules.

  4. A resource module MUST implement the gathered state, which returns the current configuration to the user in the gathered key of the response.

  5. A resource module SHOULD implement the overridden state, which both replaces the configuration of resource entries provided in the task and removes the configuration of resource entries not provided in the task. The overridden state guarantees the desired configuration state as well as a mechanism to detect target system configuration drift.

  6. A resource module MAY implement the purged state if the configuration of a resource is spread across multiple resource modules. The purged state would typically only be implemented by the parent module when the multiple modules have a parent and child relationship.

  7. Resource modules MAY implement the rendered and parsed states which allow for off-box, bidirectional conversion between the target's native configuration and the structured data of the resource module. This is most commonly found for targets which provide both a command line interface and file based configuration.
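For illustration, the difference between the configuration-changing states above can be sketched in Python. This is only a minimal sketch, assuming the target configuration is modeled as a dict of resource entries keyed by a unique name. The function names and the in-memory data model are hypothetical and are not part of any Ansible API.

```python
from copy import deepcopy


def merged(have, want):
    """Merge the provided entries into the existing ones; nothing is removed."""
    result = deepcopy(have)
    for name, attrs in want.items():
        result.setdefault(name, {}).update(deepcopy(attrs))
    return result


def replaced(have, want):
    """Replace only the entries named in the task; other entries are untouched."""
    result = deepcopy(have)
    for name, attrs in want.items():
        result[name] = deepcopy(attrs)
    return result


def overridden(have, want):
    """Replace the entries in the task and remove every entry not in the task."""
    return deepcopy(want)


def deleted(have):
    """Remove the configuration of all resource entries."""
    return {}
```

With an existing configuration of two interfaces and a task that only sets `mtu` on the first one, `merged` keeps the other attributes of that interface, `replaced` drops them, and `overridden` additionally removes the second interface.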

Resource module check mode

  1. All resource modules MUST implement check mode, during which the target system's configuration can be queried but MUST NOT be changed.

Resource module return values

  1. Resource modules MUST provide the before and after configuration as structured data in the before and after keys of the return value.

  2. A resource module SHOULD provide the commands issued or proposed to achieve the desired configuration state if the target system is being configured with a command line interface. The commands will be provided as a list in the return value's commands key.
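As a rough illustration of the check mode and return value rules, the sketch below applies a merged-style change to a hypothetical in-memory target. `FakeTarget`, `run_merged`, and the tuple form of the commands are assumptions made for this example. A real module would talk to a device and return platform-specific commands. How `after` is reported in check mode is also an assumption here: the sketch re-gathers the configuration, so `after` equals `before` when nothing was applied.

```python
from copy import deepcopy


class FakeTarget:
    """Hypothetical in-memory stand-in for a device; not a real Ansible API."""

    def __init__(self, config):
        self.config = deepcopy(config)

    def gather(self):
        return deepcopy(self.config)

    def apply(self, commands):
        for name, key, value in commands:
            self.config.setdefault(name, {})[key] = value


def run_merged(target, want, check_mode=False):
    """Apply a merged-style change and build a before/after/commands result."""
    before = target.gather()
    # Only emit a command for attributes that differ: the minimum change set.
    commands = [
        (name, key, value)
        for name, attrs in sorted(want.items())
        for key, value in sorted(attrs.items())
        if before.get(name, {}).get(key) != value
    ]
    if commands and not check_mode:
        target.apply(commands)
    return {
        "changed": bool(commands),
        "before": before,
        # Assumption: re-gathered, so unchanged when running in check mode.
        "after": target.gather(),
        "commands": commands,
    }
```

Running the same call a second time with the returned `after` data as input yields no commands, which is the idempotency the pattern asks for.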

Ansible facts subsystem integration

  1. Resource modules can OPTIONALLY integrate with the Ansible facts and setup subsystem as an alternative method for Ansible users to retrieve the current configuration of the target

Resource modules within a collection

  1. If a platform- or system-specific Ansible collection includes multiple modules for the purpose of configuration management, all or none of the modules SHOULD follow the resource module pattern

Resource module scope

  1. A resource module MUST have a well defined and documented configuration scope
  2. A resource module MUST NOT make configuration changes outside its defined scope

Resource module tests

  1. The integration tests for a resource module MUST include tests for each state
  2. The integration tests for a resource module MUST demonstrate the ability to go round trip (configure -> gather -> configure) in an idempotent manner
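The round trip requirement can be sketched as a small check. This is a minimal illustration against an in-memory dict, with hypothetical helper names. Real integration tests would run the module against a target or a test fixture.

```python
from copy import deepcopy


def apply_replaced(config, want):
    """Hypothetical 'replaced' operation on an in-memory config dict.

    Returns (new_config, changed).
    """
    new_config = deepcopy(config)
    for name, attrs in want.items():
        new_config[name] = deepcopy(attrs)
    return new_config, new_config != config


def gather(config):
    """Return the current configuration in the same shape the task accepts."""
    return deepcopy(config)


def round_trip_is_idempotent(initial, want):
    """configure -> gather -> configure; the second configure must be a no-op."""
    configured, _ = apply_replaced(initial, want)
    gathered = gather(configured)
    reconfigured, second_changed = apply_replaced(configured, gathered)
    return not second_changed and reconfigured == configured
```

The key property is that the gathered data can be fed back unmodified as task input and produces no change.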

Resource module development

  1. Resource modules MUST pass the return value (both before and after) through argument specification validation before exiting. This ensures the return value shape is complete and compatible with the module's argument specification.
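The idea of validating the return value against the argument specification can be sketched as follows. The simplified `ARGSPEC` format and both helper functions are assumptions made for this example. They are a toy stand-in and do not reproduce Ansible's actual argument spec validation.

```python
# Hypothetical, simplified spec: option name -> expected type and required flag.
ARGSPEC = {
    "name": {"type": str, "required": True},
    "mtu": {"type": int, "required": False},
    "description": {"type": str, "required": False},
}


def validate_entry(entry, spec=ARGSPEC):
    """Return a list of error strings; an empty list means the entry is valid."""
    errors = []
    for key in entry:
        if key not in spec:
            errors.append(f"unsupported key: {key}")
    for key, rule in spec.items():
        if key not in entry:
            if rule["required"]:
                errors.append(f"missing required key: {key}")
        elif not isinstance(entry[key], rule["type"]):
            errors.append(f"wrong type for {key}")
    return errors


def validate_result(result, spec=ARGSPEC):
    """Check every entry under the before and after keys against the spec."""
    errors = []
    for section in ("before", "after"):
        for entry in result.get(section, []):
            errors.extend(f"{section}: {e}" for e in validate_entry(entry, spec))
    return errors
```

A result that passes this kind of check can be stored and later passed back to the same module without any transformation.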

justjais commented 2 years ago

I agree with each of the points @cidrblock made about the states that resource modules support, their ease of use, and their consistent functionality when it comes to network and security platform configuration. The pattern is also viable for any other content that supports or exposes REST APIs for configuration.

rohitthakur2590 commented 2 years ago

@cidrblock This explains everything about resource modules very well. Just a minor point about gathered output mentioned above

A resource module MUST implement the gathered state, which return the current configuration to the user in the before key of the response.

so the current configuration is returned in the gathered key of the response instead of before

ganeshrn commented 2 years ago

Should we encourage new content to adopt concepts behind resource modules or should we advise developers to follow standard state enforcement + info module split?

Not sure what makes the state enforcement + info module split a "standard" pattern?

In the case of state enforcement modules, the current state of configuration for a resource is fetched (for idempotent check) but it is not returned in the response. IIUC the info modules return both configuration and operational data for the given resource.

The response returned from the resource module is only the configuration data for that resource and is structured in module argspec format. The advantage of doing this is that users can run a single module to get the current state of configuration and store it in structured format as inventory variables. This in turn allows storing the config state of the entire inventory in structured key-value pairs in YAML/JSON format, which can then be easily compared against a desired state/golden config/source of truth (SOT) without requiring any data transformation, and feeding the structured data with different values (if required) to the same resource module.

To get the same functionality with the info module approach, it will be required to 1) run the info module, 2) retrieve the config parameters from the response and map them to the options of the corresponding config module, 3) compare it with the SOT, and 4) invoke the config module if required.

The data transformation part in the playbook is not that easy, and this design pattern adds more complexity from the playbook author's perspective IMO.

tadeboro commented 2 years ago

Not sure what makes the state enforcement + info module split a "standard" pattern?

https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/dev_guide/developing_modules_best_practices.rst#scoping-your-modules

The bullet point about not adding list and info states to existing modules has been there for almost three years now, and most of the non-networking modules follow that advice.

In the case of state enforcement modules, the current state of configuration for a resource is fetched (for idempotent check) but it is not returned in the response. IIUC the info modules return both configuration and operational data for the given resource.

This is left to the implementation. But a lot of "standard" modules that interact with an API to configure resources also return back the state in the result. So I would say that the common practice is actually:

  1. Regular modules download the current state, perform change detection, perform configuration updates if required, and then return the final state of the resource.
  2. Info modules only download the current state of the resource (or multiple resources if the API is structured that way).

Samples of such module pairs can be found in all collections I helped develop (Sensu Go, ServiceNow, NGINX Unit, AWS), but point 1 holds for an even greater number of modules (there are very few modules that do not return anything). And yes, the output of the regular and info modules in such cases is usually compatible: the structure of resources is the same, but the regular module returns back a single resource while the info module usually returns back a list of resources.

The response returned from the resource module is only the configuration data for that resource and is structured in module argspec format.

There is nothing preventing you from implementing this in a standard module + info pair.

The advantage of doing this is users can run a single module to get the current state of configuration, store it in structured format as inventory variables. This in turn allows storing the config state of the entire inventory in structured key-value pairs in YAML/JSON format which can then be easily compared against a desired state/golden config/source of truth (SOT) without requiring any data transformation and feed the structured data with different values (if required) to the same resource module.

To get the same functionality with the info module approach it will be required

1. To run the info module
2. Retrieve the config parameters from the response and map them to the options of the corresponding config module
3. Compare it with SOT
4. Invoke the config module if required.

I may be misunderstanding this, but as far as I can tell, there is no difference between a resource module and a similarly designed standard module pair. Steps 1, 3, and 4 are required even when we use resource modules. Step 1 uses state=gathered, step 3 is simple because the result and SOT have the same structure, and in step 4 we run the module again with state=replaced.

If the regular modules are designed in such a way that they return data in a form that is compatible with the SOT, we need exactly the same three steps: in step one we would run an info module, compare the data in step 3, and update things in step 4 with the state-enforcing module.

But I would argue that doing things this way is very non-ansible-like. If we want to enforce the state, there is no need to compare the existing state first since state-enforcing modules should do this for us. Making sure things do not deviate from the SOT should be done using exactly the same task, but in check mode (and possibly with the --diff modifier so that we can see what drifted). This is actually what we advise to our customers: to make sure they use high-quality Ansible content so that they can use the same playbook to enforce the desired state and detect configuration drift.
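The kind of drift report that a check mode run with --diff surfaces can be sketched in Python. This is only an illustration on in-memory dicts, with a hypothetical `drift_report` helper. It is not how Ansible computes diffs internally.

```python
def drift_report(current, desired):
    """Map each drifted resource to {attribute: (current_value, desired_value)}."""
    report = {}
    for name, want_attrs in desired.items():
        have_attrs = current.get(name, {})
        changes = {
            key: (have_attrs.get(key), value)
            for key, value in want_attrs.items()
            if have_attrs.get(key) != value
        }
        if changes:
            report[name] = changes
    return report
```

An empty report means the target matches the desired state, and a non-empty report lists exactly what an enforcing run would change.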

The data transformation part in the playbook is not that easy and this design pattern adds to more complexity from the playbook author's perspective IMO.

Again, this has nothing to do with the fact that resource modules combine the state and action parts in the same module. As demonstrated, any reasonably designed module can do that.

I would argue that the resource module should actually be a "toolbox" of things that can help users manage certain resources. This toolbox would include:

  1. A state-enforcing module that can have its state parameter set to merged, overridden, etc.
  2. An info module for fetching the current state.
  3. A pair of filters for transforming data between its structured and native format.

So in my opinion, the idea behind the resource modules is sound, but the implementation is not something I would encourage people to adopt outside the networking ecosystem.

ganeshrn commented 2 years ago

https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/dev_guide/developing_modules_best_practices.rst#scoping-your-modules

The bullet point about not adding list and info states to existing modules has been there for almost three years now, and most of the non-networking modules follow that advice.
There is nothing preventing you from implementing this in a standard module + info pair.

Ansible content modules including built-in, networking, security, windows and so on don't follow module + info pair so I still wouldn't call it a standard pattern. The doc you referred to doesn't clearly define the module pattern for all the major endpoints that Ansible can manage and it requires updates IMO.

I would argue that the resource module should actually be a "toolbox" of things that can help users manage certain resources. This toolbox would include:

  1. A state-enforcing module that can have its state parameter set to merged, overridden, etc.
  2. An info module for fetching the current state.
  3. A pair of filters for transforming data between its structured and native format.

A state can consist of the operational and configuration data of the managed host. Resource modules only fetch the configuration state of the host that they need to take action on, not the operational state. The info module returns both operational and configuration states. So in cases where users only have to manage the config state, it can be done with a single resource module, whereas the module + info split requires using two modules.

So in my opinion, the idea behind the resource modules is sound, but the implementation is not something I would encourage people to adopt outside the networking ecosystem.

There is a bunch of security content that is written with the resource module pattern and also there is scope for cloud modules (especially networking modules) that can adopt resource module patterns so IMO it is not limited to networking only.

Finally, I would like to add that resource modules have been around for 2+ years now and are well received by Ansible users, partners, and customers.

Here's recent feedback that we received from a community user

wayt
team - just want to say the resource modules are very well received by everyone i've shown them and used them with
good to see some doc's around creating them as well. i'll need to review those

Since there is a bunch of Ansible content that adheres to the resource module pattern and a bunch that adheres to the module + info split pattern, IMO both patterns can be added as recommended ones, letting the collection developer use their best judgement to pick one as required.

justjais commented 2 years ago

I'm in agreement with @ganeshrn's comment. As mentioned in my earlier comment as well, resource modules and the implementation of their states are not confined to networking content. They are helpful for all content where configuration can be made via REST APIs, which includes SECURITY, CLOUD, and VMware content, as the resource module states (i.e. MERGED, REPLACED, OVERRIDDEN, DELETED) closely mirror the operations that a REST API exposes (i.e. GET, POST, PUT, PATCH, DELETE).

Also, keeping two modules, one managing configuration and the other (i.e. the info module) getting information, only increases the number of modules (2X) and also burdens the user with specifically calling the info modules to get the operational state of the configured device.

This can be observed especially in the security and cloud content space, where there are ~100-500+ APIs to configure the box. If the content has module + info module pairs, it will ultimately have 2x the number of modules, which is cumbersome for the module creator and maintainer as well as for the end user.

ganeshrn commented 2 years ago

Networking-related content was developed in relative isolation from the rest of the Ansible content and formed its own best practices.

All the proposals related to the resource module are in the public domain and the content was never intended to be developed in isolation. There might be a feeling of isolation because of the time-zone centric process currently followed. The meeting where the community decisions are made is scheduled at 12.30 AM IST which is not suitable for all the contributors. BTW there is a good number of active Ansible content contributors (and growing) currently working in IST. So I think it will help going forward if the process is made more time-zone agnostic and asynchronous.

felixfontein commented 2 years ago

https://github.com/ansible/ansible/blob/devel/docs/docsite/rst/dev_guide/developing_modules_best_practices.rst#scoping-your-modules

The bullet point about not adding list and info states to existing modules has been there for almost three years now, and most of the non-networking modules follow that advice.
There is nothing preventing you from implementing this in a standard module + info pair.

Ansible content modules including built-in, networking, security, windows and so on don't follow module + info pair so I still wouldn't call it a standard pattern.

That querying information is split from updating state into _info or _facts modules (depending on what this information is about) is a standard pattern. It has been followed by all non-network modules in ansible/ansible (I'm not familiar with most of the network modules, so I cannot comment on them), and is still being followed by all builtin modules and many other modules (for example, everything in community.general, amazon.aws, community.aws, and many of the smaller community collections that were split off from ansible/ansible or community.general).

I would argue that the resource module should actually be a "toolbox" of things that can help users manage certain resources. This toolbox would include:

  1. A state-enforcing module that can have its state parameter set to merged, overridden, etc.
  2. An info module for fetching the current state.
  3. A pair of filters for transforming data between its structured and native format.

A state can consist of the operational and configuration data of the managed host. Resource modules only fetch the configuration state of the host that it needs to take action on and not the operational state. The info module returns both operational and configuration states.

There is no rule that _info modules must do that. That's often what _info modules do, but they could also just return configuration state and not operational state, or separate them clearly.

felixfontein commented 2 years ago

There might be a feeling of isolation because of the time-zone centric process currently followed. The meeting where the community decisions are made is scheduled at 12.30 AM IST which is not suitable for all the contributors. BTW there is a good number of active Ansible content contributors (and growing) currently working in IST. So I think it will help going forward if the process is made more time-zone agnostic and asynchronous.

While this is a very valid point in general, I don't think it really applies to why resource modules behave differently from "everything else". Both resource modules and the standard for Ansible modules - information/facts gathering must be part of _info/_facts modules and not of regular modules - have been around since long before the community meeting was created.

As I understood it, the core team always embraced - and everything contained in ansible/ansible, except possibly network stuff, always conformed to - that information gathering is done in _facts or _info modules, and not by regular modules. (There are/were some legacy exceptions, like state=get in the route53 module, but they are getting deprecated and removed eventually - if they haven't already.)

ganeshrn commented 2 years ago

As I understood it, the core team always embraced - and everything contained in ansible/ansible, except possibly network stuff, always conformed to - that information gathering is done in _facts or _info modules, and not by regular modules.

The current network modules gather information using the fact module itself. The information consists of both operational state and configuration state, with the added constraint that the configuration data returned should follow the same structure as that of the corresponding resource module. A single info/fact module that can fetch operational/configuration state based on input options is more optimised compared to having two separate modules for a single resource, one for pushing configuration and the other for fetching configuration, as one is a subset of the other.

Most config modules (even core) get the current (config) state on which they have to take action. The only additional part in the resource module is the return response, which has before and after keys whose values are in the resource argspec format. For resource modules, the states merged, replaced, overridden, purged and deleted are all core actionable states, which is in line with your comment. There are additional states like parsed and rendered which don't perform any action but are added for ease of use, and those are optional.

I don't think it really applies to why resource modules behave differently from "everything else"

I don't agree with your comment for the reasons I mentioned above. Resource modules are doing what regular modules do with some added functionality.

tadeboro commented 2 years ago

Since this issue has been "stuck" in limbo for two months and is blocking the inclusion/rejection of the trendmicro.deepsec Ansible Collection, I propose we answer the following question:

Do we want to accept resource modules (where resource modules are defined as in https://github.com/ansible-community/community-topics/issues/33#issuecomment-897029700) as one of the standard ways for structuring and writing Ansible modules that want to become part of the community package?

If the answer is yes, we can formally accept the definition of what the resource modules are and we are done.

If the answer is no, we should answer the following question:

Do we want to accept resource modules into the community package?

The answers to this question I came up with/stole from the meeting discussions are:

  1. Yes.
  2. Yes, but only for network modules where the pattern is already established.
  3. Yes, but only for network and security modules, because the pattern seems to fit the use cases well there.
  4. No.

At this point, we should know if resource modules can be part of the Ansible package or not. This should unblock the current inclusion decisions (and possibly open the discussion about what to do with existing resource modules that were grandfathered into the Ansible package).

@abadger @felixfontein @gundalow @Andersson007 @acozine @ssbarnea @jillr @cidrblock @jamescassell @thaumos Can you please comment on this at your earliest convenience? Thank you all in advance!

justjais commented 2 years ago

@tadeboro Thanks for creating the proposal, but I would like to extend the options as follows:

  1. Yes.
  2. Yes, but only for network modules where the pattern is already established.
  3. Yes, for network and security modules, or any content which supports REST API-based configuration because the pattern seems to fit the use cases well there.
  4. No.

My vote: Option 1

As pointed out in my earlier comment, the use of the resource module pattern is not, and should not be, limited to network/security content. It should be available for ALL content where REST API-based configuration is supported, which includes Cloud, Container, Kubernetes, and VMware as well. We should not enforce the resource module approach, but the community and module creators should have the option to go either the present/absent and info/facts module way or the resource module way (with newer states along with gathered), which IMO is the better way for REST API-based content.

cidrblock commented 2 years ago

My biggest concern is that the definition of how a resource module behaves and the features it offers is both well defined and followed for plugins that claim to be a resource module. Regardless of where the pattern is used, we need to ensure the contract between the developer and user that the resource module pattern guarantees is upheld.

I believe it is the developer's choice to follow and align with the resource module pattern where they believe it is a good fit, and the pattern should be considered an alternative to the current best practices.

My vote: Option 1. Allow developers to choose the pattern for their content and do not deny inclusion into the Ansible package because a developer believes the resource module pattern provides the best experience for the users of their content.

IPvSean commented 2 years ago

copy and pasting from another thread https://github.com/ansible-collections/ansible-inclusion/discussions/27#discussioncomment-1362284

Who counts as community and who doesn't? Was there a poll/vote or was it just people in a particular IRC chatroom at one given time? What time zones were available? I also don't think community users are the only stakeholder in conversations like this. Does my opinion not count because I was not in a particular IRC chatroom at a given time?

Resource modules are a natural evolution of Ansible to allow both the best parts of imperative and declarative models. They are insanely helpful for network vendors, and help onboard people with more traditional IT backgrounds that might not understand complex templating.

NilashishC commented 2 years ago

Echoing all the points that @cidrblock @justjais @ganeshrn have mentioned in this thread. I vote for Option 1. Let the content developer decide which pattern to choose and NOT exclude modules that follow the Resource Module approach.

tadeboro commented 2 years ago

@justjais ~@cidrblock~ @NilashishC Just to make sure: since you are voting for the first option of the second question, is it safe to assume that you DO NOT want to promote resource modules as a way of writing modules?

Edit: Sorry Brad, I pinged you by mistake here.

cidrblock commented 2 years ago

@tadeboro NP, I think everyone probably knows where I stand on this issue :)

ganeshrn commented 2 years ago

copy and pasting from another thread ansible-collections/ansible-inclusion#27 (comment)

Who counts as community and who doesn't? Was there a poll/vote or was it just people in a particular IRC chatroom at one given time? What time zones were available? I also don't think community users are the only stakeholder in conversations like this. Does my opinion not count because I was not in a particular IRC chatroom at a given time?

Resource modules are a natural evolution of Ansible to allow both the best parts of imperative and declarative models. They are insanely helpful for network vendors, and help onboard people with more traditional IT backgrounds that might not understand complex templating.

@IPvSean +100

ganeshrn commented 2 years ago

Since this issue has been "stuck" in limbo for two months and is blocking the inclusion/rejection of the trendmicro.deepsec Ansible Collection, I propose we answer the following question:

Do we want to accept resource modules (where resource modules are defined as in #33 (comment)) as one of the standard ways for structuring and writing Ansible modules that want to become part of the community package?

If the answer is yes, we can formally accept the definition of what the resource modules are and we are done.

If the answer is no, we should answer the following question:

Do we want to accept resource modules into the community package?

The answers to this question I came up with/stole from the meeting discussions are:

  1. Yes.
  2. Yes, but only for network modules where the pattern is already established.
  3. Yes, but only for network and security modules, because the pattern seems to fit the use cases well there.
  4. No.

At this point, we should know if resource modules can be part of the Ansible package or not. This should unblock the current inclusion decisions (and possibly open the discussion about what to do with existing resource modules that were grandfathered into the Ansible package).

@abadger @felixfontein @gundalow @Andersson007 @acozine @ssbarnea @jillr @cidrblock @jamescassell @thaumos Can you please comment on this at your earliest convenience? Thank you all in advance!

My vote is Option 1 :-)

Andersson007 commented 2 years ago

As the concept is already in use in many places and seems to be defined well here, my vote is Yes, we should accept it as a separate concept as a whole.

In other words, there will be 2 types of conventions: 1) the general dev conventions are in power by default 2) resource modules conventions for resource modules with their own set of rules that, when contradicting the general conventions, must not be accepted in other kinds of modules. The border, IMO, is well defined now and the use case seems to be, if i understand correctly, specific to network devices and other things supporting REST API-based configuration.

As a summary, i vote Yes because:

We should highlight in the requirements at least that:

Edited: Also it doesn't feel fair to restrict the areas because if we accept the concept for network and security areas, this, imo, should be accepted everywhere where it fits.

NilashishC commented 2 years ago

@justjais @cidrblock @NilashishC Just to make sure: since you are voting for the first option of the second question, is it safe to assume that you DO NOT want to promote resource modules as a way of writing modules?

@tadeboro sorry for the confusion. My vote for the first question is Yes. :)

jillr commented 2 years ago

Thanks very much @cidrblock for the proposal write up; that was immensely helpful to me.

I'm +1 to accepting the resource module pattern as written here and +1 to accepting collections for any content domain (network, security, or other) that adhere to that spec.

sivel commented 2 years ago

-1 for option 1; -1 for option 3 if it expands outside of network or security; +1 for option 2 or option 4.

matburt commented 2 years ago

For AWX, Tower, and other Ansible product initiatives, we'd very much like to see Option 1 where the resource module pattern can be applied more generally within the bounds of the resource module proposal.

Having content that can generally adhere to this pattern enables capabilities and systems that we'd like to design and build without being pigeonholed into specific classes of automation.

tadeboro commented 2 years ago

I think I expressed my preference a few times already, but just for the sake of completeness, here are my "votes":

  1. No to adopting resource modules in the current form as another way of writing Ansible modules (I think the pattern mixes regular and info modules + filters for no good reason).
  2. I would only allow network and security modules to use that pattern and still be accepted to the community package (option 3) or no such modules at all (option 4).

felixfontein commented 2 years ago

  1. No to adopting resource modules in the current form as a way to write Ansible modules in all areas. I also think it's better to stick to the existing convention of having _info for gathering, and filters / lookups for transformations.
  2. I'm OK with allowing resource modules in network and security collections, since they're already used widely there, but I really want to avoid them bleeding over into other areas. So option 3 for me.

I also don't think 3. should cover any other areas, even if they have REST API-based configuration. I don't think resource modules are a natural choice for REST API-based configuration. The classic regular module + _info module approach works very well for such APIs, too.

IPvSean commented 2 years ago

For the question:

Do we want to accept resource modules (where resource modules are defined as in #33 (comment)) as one of the standard ways for structuring and writing Ansible modules that want to become part of the community package?

with the four options->

  1. Yes.
  2. Yes, but only for network modules where the pattern is already established.
  3. Yes, for network and security modules, or any content which supports REST API-based configuration because the pattern seems to fit the use cases well there.
  4. No.

I vote option 1.

Reasoning: I talk to hundreds of customers, community members, potential customers, network automators, system administrators, etc. with different skill sets and from different domains (e.g. heavy Linux users, people who are NOT network engineers), and we have tons of folks asking us "can you do this for other use-cases besides network?" Literally the network use-case is becoming more mature than other use-cases because of resource modules. People want turn-key declarative models for configuring things because not everyone can spit out a 50 line jinja2 template from memory.

Between January 1st, 2021 and now, we have taught over 14,000 students (probably much more, I lost some data when we did a migration from DNS names for my collection tool). Our SA organization routinely meets with non-typical users (people who are learning ansible for the first time with an instructor, or first class they ever attend).

Comments like this->

regular and info modules + filters for no good reason

this seems like a developer opinion versus a user opinion. This exact use-case (the gathered parameter) did not exist originally with resource modules, but after meeting with countless customers, potential customers, community users, the network engineering team and @cidrblock added the gathered resource. It just made sense.

In fact I would argue that this particular feature that you brought up with that comment, was developed BECAUSE of the community, we would be literally ignoring our community if we argue against this paradigm.

The argument made here (splitting modules from info and config) would be advocating for multiple modules, which means multiple tasks, to do simple retrieval and configuration, even though they have identical data models.... it is not a great user experience and non-intuitive to a novice.

felixfontein commented 2 years ago

People want turn-key declarative models for configuring things because not everyone can spit out a 50 line jinja2 template from memory.

Sorry, but that is totally unrelated to whether resource modules should be allowed or not. Resource modules are one way to write modules, but one can easily write modules (pairs of modules, if you want) using very similar principles that do not require such jinja2 templating.

Between January 1st, 2021 and now, we have taught over 14,000 students (probably much more, I lost some data when we did a migration from DNS names for my collection tool). Our SA organization routinely meets with non-typical users (people who are learning ansible for the first time with an instructor, or first class they ever attend).

Comments like this->

regular and info modules + filters for no good reason

this seems like a developer opinion versus a user opinion. This exact use-case (the gathered parameter) did not exist originally with resource modules, but after meeting with countless customers, potential customers, community users, the network engineering team and @cidrblock added the gathered resource. It just made sense.

Sorry, but I still don't see how it makes sense. Can you give a concrete example where resource modules allow to do something more elegantly than regular module + _info module pairs?

In fact I would argue that this particular feature that you brought up with that comment, was developed BECAUSE of the community, we would be literally ignoring our community if we argue against this paradigm.

The argument made here (splitting modules from info and config) would be advocating for multiple modules, which means multiple tasks, to do simple retrieval and configuration, even though they have identical data models.... it is not a great user experience and non-intuitive to a novice.

You still need two tasks if you want to gather configuration in one task, and apply configuration changes in another task. There is no difference here between resource modules and module/info module pairs.
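
To make the two shapes being compared concrete, here is a minimal plain-Python sketch. It is not Ansible code and not how any real collection is implemented: `DEVICE`, `resource_module`, `interfaces_info` and `interfaces` are hypothetical names, and only the `gathered` and `merged` states are modelled. It assumes both shapes share one data model, which is the "similar principles" condition mentioned above.

```python
import copy

# Hypothetical in-memory "device"; it stands in for a real API or CLI.
DEVICE = {"Ethernet1": {"description": "uplink", "enabled": True}}


def _read():
    """Return the current configuration in the shared data model."""
    return copy.deepcopy(DEVICE)


def _merge(config):
    """Merge `config` into the device and report whether anything changed."""
    changed = False
    for name, attrs in config.items():
        current = DEVICE.get(name, {})
        merged = {**current, **attrs}
        if merged != current:
            DEVICE[name] = merged
            changed = True
    return changed


def resource_module(state, config=None):
    """Resource-module shape: one entry point, behaviour chosen by `state`."""
    if state == "gathered":
        return {"changed": False, "gathered": _read()}
    if state == "merged":
        return {"changed": _merge(config or {})}
    raise ValueError(f"unsupported state: {state}")


def interfaces_info():
    """Pair shape, read side: never modifies the device."""
    return {"changed": False, "interfaces": _read()}


def interfaces(config):
    """Pair shape, write side: accepts the data model the info side returns."""
    return {"changed": _merge(config)}
```

In this sketch, gathering and then applying takes two calls in either shape (two tasks in a playbook). What differs is whether the read path is selected by a `state` value or by a separate module name.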

abadger commented 2 years ago

I would add (5) Okay for current resource modules but new ones need to conform to a resource modules v2 spec where the test and filter overlap is in appropriate plugins [and possibly also the _info/_facts module split].

I'm also going to assume that (4) includes grandfathering in any modules which currently are in the ansible tarball and follow the resource module pattern.

I am -1 on (1),

Ranked choice of the others would be: (5) > (4) > (2) > (3).

If (3) is chosen, I would only vote +1 if there was a good definition of network and security modules. For instance, I would want to avoid defining a module that configures selinux as a security module.

IPvSean commented 2 years ago

Sorry, but that is totally unrelated to whether resource modules should be allowed or not. Resource modules are one way to write modules, but one can easily write modules (pairs of modules, if you want) using very similar principles that do not require such jinja2 templating.

Consider you are modifying an httpd.conf, apache configuration file, a nginx.conf, etc. These are identical to a Cisco IOS configuration file, or a Junos, or an Arista.... we keep pretending network is dissimilar from server community but it's not at all. How are you modifying any flat-file on any type of device?

Can you write playbooks without complicated Jinja2? Yes. Can you avoid it for a lot of the use cases mentioned? No... just browse Stack Overflow and see the blockinfile, lineinfile and Jinja2 configuration methods....

Sorry, but I still don't see how it makes sense. Can you give a concrete example where resource modules allow to do something more elegantly than regular module + _info module pairs?

So that is what we had... and we showed this to users, who were confused why you would create two separate entries for the same resource. Why would I use a separate module to gather info for X resource.... it was confusing from our user experience. Again this is based off user feedback.

You still need two tasks if you want to gather configuration in one task, and apply configuration changes in another task. There is no difference here between resource modules and module/info module pairs.

This is a developer mindset versus a user mindset. There are a lot of Ansible developers on this thread, but not a lot of users, trainers, etc., who are just as important to the community. The users are confused WHY you would use a separate module to gather the exact same info from a resource. Just because you CAN do something doesn't mean it's intuitive for novice users.

Folks are thinking we make these decisions in a vacuum and that is not the case, we meet with field organizations weekly across multiple geography units, multiple domains, etc. We are literally being asked over and over "why can't we do this for use-case X (which is not security or network automation)"

felixfontein commented 2 years ago

Sorry, but that is totally unrelated to whether resource modules should be allowed or not. Resource modules are one way to write modules, but one can easily write modules (pairs of modules, if you want) using very similar principles that do not require such jinja2 templating.

Consider you are modifying an httpd.conf, apache configuration file, a nginx.conf, etc. These are identical to a Cisco IOS configuration file, or a Junos, or an Arista.... we keep pretending network is dissimilar from server community but it's not at all. How are you modifying any flat-file on any type of device?

I'm not really sure what you are trying to get at. There are always different ways to configure things, no matter on what kind of device or for what kind of service. There is nothing stopping you from creating a module (pair) to configure httpd.conf, nginx.conf, or anything else in the same way as a network device using regular modules and _info modules. Just because web servers are commonly configured by templating config files doesn't mean you couldn't have a module that configures an Apache virtual server and lets you query its configuration in the exact same format that you can put into the configuration module.

Can you write playbooks without complicated Jinja2? Yes. Can you avoid it for a lot of the use cases mentioned? No... just browse Stack Overflow and see the blockinfile, lineinfile and Jinja2 configuration methods....

How is that related to resource modules vs. module + _info module pairs?

Sorry, but I still don't see how it makes sense. Can you give a concrete example where resource modules allow to do something more elegantly than regular module + _info module pairs?

So that is what we had... and we showed this to users, who were confused why you would create two separate entries for the same resource. Why would I use a separate module to gather info for X resource.... it was confusing from our user experience. Again this is based off user feedback.

I again ask you: can you give a concrete example? You need the exact same number of tasks for operations with resource modules as with regular modules (if these are written as module + _info module pairs using principles similar to those of resource modules) -- except in the one case where you template the state parameter (there you'd need a lot more templating to achieve the same thing with module + _info module pairs). But I don't think you are aiming at this very special case.

cidrblock commented 2 years ago

Sorry, but I still don't see how it makes sense. Can you give a concrete example where resource modules allow to do something more elegantly than regular module + _info module pairs?

community.general has a github_webhook_info module and a github_webhook module as well.

  1. There is no indication in the documentation that output of the info module can be passed to the configuration module.
  2. Since there is no guidance or requirement for info or fact modules with regard to operational or configuration data, and no documentation requirement to indicate that a compatible module exists for configuration, the burden is on the user to find and test the two together.
  3. Since they are separate plugins, there is no guarantee that even if compatible today, they will remain compatible in the future.
  4. The configuration module supports a state of absent, which would require use of the info module to retrieve the current webhooks, diff the current vs. desired and use the configuration module to remove superfluous webhooks. This may or may not involve data transformation in either a jinja template or custom filter plugin.

If the above had been developed following the resource module pattern, the ambiguity and need for data transformation would not exist.
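
The "diff the current vs. desired" step in point 4 can be sketched in plain Python. This is only an illustration under assumptions: the `url` and `events` fields and the example URLs are hypothetical and are not taken from the actual return values of `github_webhook_info`, and hooks are compared by URL only.

```python
def superfluous_webhooks(current, desired):
    """Return hooks that exist on the repository but are not in the desired list.

    Hooks are compared by URL only; that is an assumption of this sketch.
    """
    wanted = {hook["url"] for hook in desired}
    return [hook for hook in current if hook["url"] not in wanted]


# Hypothetical data: what an info-style task might return vs. what the user wants.
current = [
    {"url": "https://ci.example.com/hook", "events": ["push"]},
    {"url": "https://old.example.com/hook", "events": ["push"]},
]
desired = [{"url": "https://ci.example.com/hook", "events": ["push"]}]

# Each entry here would be passed to the configuration module with state: absent.
to_remove = superfluous_webhooks(current, desired)
```

With a module + _info pair, this comparison lives in the playbook (as a Jinja2 expression or a filter plugin). In the resource module pattern, the module performs the equivalent comparison internally.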

(editing for formatting)

abenokraitis commented 2 years ago

My vote: Option 1, allow developers to choose the pattern for their content and do not deny inclusion into the Ansible package because a developer believes the resource module pattern provides the best experience for the users of their content. (thanks @cidrblock )

This will futureproof any new methods and show inclusiveness as an open source project. Wasn't the reason we went to Collections to lower the barrier to entry for contributors? Sounds like the project is slowly reverting back to being a bottleneck for innovation...

felixfontein commented 2 years ago

Sorry, but I still don't see how it makes sense. Can you give a concrete example where resource modules allow to do something more elegantly than regular module + _info module pairs?

community.general has a github_webhook_info module and a github_webhook module as well.

You can always pick out specific module + _info pairs which do not follow principles similar to those of resource modules. That does not prove anything, and is the main reason I always add the additional qualification "with principles similar to those of resource modules".

1. There is no indication in the documentation that output of the info module can be passed to the configuration module.

For many existing module + _info module pairs this is not possible (and often was never intended to be possible). But as I said, "with principles similar to those of resource modules".

2. Since there is no guidance or requirement for info or fact modules with regard to operational or configuration data, and no documentation requirement to indicate that a compatible module exists for configuration, the burden is on the user to find and test the two together.

Exactly. But nobody stops you from creating module + _info module pairs which satisfy this.

3. Since they are separate plugins, there is no guarantee that even if compatible today, they will remain compatible in the future.

That is also not guaranteed for resource modules. You can have a promise, as with resource modules, but that promise you can also give for specific module + _info module pairs.

4. The configuration module supports a state of absent, which would require use of the info module to retrieve the current webhooks, diff the current vs. desired and use the configuration module to remove superfluous webhooks. This may or may not involve data transformation in either a jinja template or custom filter plugin.

For many existing module + _info module pairs this is not possible (and often was never intended to be possible). But as I said, "with principles similar to those of resource modules".

If the above had been developed following the resource module pattern, the ambiguity and need for data transformation would not exist.

Yes, but as I said (I'm really repeating myself): this has nothing to do with resource modules per se. You can define very similar guidelines for module + _info module pairs, and uphold that with the same rigor as you uphold these properties for resource modules.

felixfontein commented 2 years ago

This will futureproof any new methods and show inclusiveness as an open source project. Wasn't the reason we went to Collections to lower the barrier to entry for contributors? Sounds like the project is slowly reverting back to being a bottleneck for innovation...

Collections can do whatever they want. There are no restrictions. What's different is the Ansible community distribution. It is a collection of modules that should stick to a common set of guidelines. If a collection chooses to ignore these guidelines, it will have trouble getting included. But nobody stops the collection from being available on Ansible Galaxy (or other places on the web), and nobody stops users from installing it and using it.

ptoal commented 2 years ago

Collections can do whatever they want. There are no restrictions. What's different is the Ansible community distribution. It is a collection of modules that should stick to a common set of guidelines. If a collection chooses to ignore these guidelines, it will have trouble getting included. But nobody stops the collection from being available on Ansible Galaxy (or other places on the web), and nobody stops users from installing it and using it.

As a user who likes the networking resource modules, my current preference is for option 1, but before I commit to that, I have a question:

What is the drawback of including the Resource Module pattern?

From what I've read, it seems like it's mostly based on the opinion that it causes confusion, or that it is somehow not "the Ansible Way", but most of the arguments for including it seem to be centred around feedback from users that they prefer that pattern. I guess I'm just confused about what the downside of this is?

sivel commented 2 years ago

The resource module pattern violates the accepted principle that an Ansible module or plugin should follow the unix "philosophy of doing one thing well". See https://docs.ansible.com/ansible/latest/dev_guide/developing_modules.html

Based on that principle, there are accepted ways to implement a module or plugin.

  1. A module should manage a single resource, and follow accepted standards for states of a resource. An example of a state that does not match is state: gathered, as that does not define a state of the managed resource.
  2. A module that gathers info about a non-inventory resource should be implemented as an _info module.
  3. A module that gathers info about an inventory resource should be implemented as a _facts module.
  4. Plugins intended to manipulate data should be implemented as a filter plugin
  5. Plugins intended to verify or compare data should be implemented as a test plugin
  6. Plugins that need to fetch remote data, but are geared towards templating should be implemented as a lookup plugin
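
As an illustration of point 4, a filter plugin is a Python file that exposes a `FilterModule` class whose `filters()` method maps filter names to callables. The sketch below follows that shape. The `strip_keys` filter itself is hypothetical, written here only to show the kind of data manipulation that would live in a filter rather than in a module.

```python
def strip_keys(items, keys):
    """Return copies of `items` (a list of dicts) without the given keys.

    Hypothetical example filter: it could drop read-only fields from gathered
    data before that data is fed back into a configuration task.
    """
    drop = set(keys)
    return [{k: v for k, v in item.items() if k not in drop} for item in items]


class FilterModule(object):
    """Filter plugin entry point; would live under plugins/filter/ in a collection."""

    def filters(self):
        # Maps the name used in templates to the Python callable.
        return {"strip_keys": strip_keys}
```

In a template this would be used as `{{ hooks | strip_keys(['id']) }}`, assuming a variable named `hooks`. Test plugins (point 5) follow the same shape with a `TestModule` class and a `tests()` method.
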

jillr commented 2 years ago

I added my vote but I didn't elaborate on it, so in addendum.

When I look at this with my Ansible Community hat on, Brad and the network folks have a well documented and scoped, prescriptive architecture that is already in widespread adoption. That architecture is guided by a team of trusted engineers who are experts at Ansible development. While "most" modules - except that one exception, ok really two exceptions, and maybe a few over here - are developed in a generally consistent way to each other I'd disagree that there's only one obvious or "correct" way to write modules. I can point to probably half a dozen different ways to write modules in the AWS collections alone that all presumably seemed obvious to the original author(s).

We have widespread adoption of the resource modules pattern. It's enabled a bunch of people to create Ansible content, whether we as individual developers like the way that code smells or not. The resource module pattern and the content developed from it seem to be far more consistent than many modules or collections and that's a huge win for our users. I can't see any downsides other than "we didn't use to do it that way".

IMO the right thing for the health of the community is to +1 it

ptoal commented 2 years ago

@sivel I see a lot of "shoulds" in those definitions, not "musts". I have spent a lot of time working on network interop problems that often tracked down to interpretation of standards with the words "should" and "must", so I apologize for being pedantic about this.

Are we talking about removing all the existing resource modules from the distribution? Are we going down the path of another fork that includes Orthodox Ansible + Networking?

Can someone explain the negative impact on the Ansible project or community from adopting this module pattern? Preferably the ELI5 version, because I think I'm still missing something.

ganeshrn commented 2 years ago

@sivel

The resource module pattern violates the accepted principle that an Ansible module or plugin should follow the unix "philosophy of doing one thing well".

The philosophy of doing one thing well can be interpreted in many ways when applied to Ansible module development: 1) write a lightweight wrapper on top of an API that does one thing well, namely executing the API call and sending the response back to the user; 2) the interpretation that you described; 3) a resource module, wherein the resource is considered a single entity and all aspects of resource management, including data transformation, are taken care of by the module.

While there is consensus that the first pattern is not correct, saying that only your interpretation is correct and the resource module pattern interpretation is a violation doesn't go down well with the bulk of the Ansible content "community" (yes, I and many others like me who can't be part of IRC meetings for various reasons consider ourselves very much part of the Ansible community).

In the end, I believe that whichever philosophy and interpretation we adhere to, the end goal should be to make life easier for users/playbook writers and not for module developers :-)

felixfontein commented 2 years ago

What is the drawback of including the Resource Module pattern?

From what I've read, it seems like it's mostly based on the opinion that it causes confusion, or that it is somehow not "the Ansible Way", but most of the arguments for including it seem to be centred around feedback from users that they prefer that pattern. I guess I'm just confused about what the downside of this is?

@sivel already elaborated on this. Next to the link provided by him, https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_best_practices.html#scoping-your-module-s is pretty central as well.

Are we talking about removing all the existing resource modules from the distribution?

We are not. We also did not remove certain other modules (exceptions) that violated some of these principles after they had been added in the past. "gathered" is not a state

Can someone explain the negative impact on the Ansible project or community from adopting this module pattern?

For me, the main negative impact is that resource modules encourage two bad practices:

  1. Having things as "state" that are not state - "gathered" is simply not a state.
  2. It encourages users to ignore that there is a whole plugin infrastructure tailored to data processing and decision making. I am not arguing that users should write long jinja2 expressions for everything they want to transform or check. But I'm convinced that it is possible to provide enough plugins that almost all common cases can be covered by very short and easily understandable expressions.

Resource modules were introduced despite them being in violation of guidelines that were set up a long time ago. It would definitely have been better if the parts of resource modules which violate these principles - I think the main issue is state=gathered - had been discussed explicitly before resource modules were introduced and spread all over networking and security.

Preferably the ELI5 version, because I think I'm still missing something.

What's an "ELI5 version"? (I assume you do not mean https://eli5.readthedocs.io/en/latest/)

ganeshrn commented 2 years ago

Resource modules were introduced despite them being in violation of guidelines that were set up a long time ago

As you mentioned, the guidelines were set up a long time ago, and a long time ago Ansible was mainly used to manage server infrastructure. Over time, Ansible use cases have evolved into managing cloud, network, security, and edge verticals, and maybe it's time to revisit the guidelines.

BTW based on the guidelines that @sivel shared

A module that gathers info about a non-inventory resource should be implemented as an _info module.

I see a number of modules following the info pattern that violate the above guideline and return information about inventory resources as part of the info module response. Maybe that's because the API used to fetch the information doesn't know what an Ansible inventory resource or non-inventory resource is.

felixfontein commented 2 years ago

Resource modules were introduced despite them being in violation of guidelines that were set up a long time ago

As you mentioned, the guidelines were set up a long time ago, and a long time ago Ansible was mainly used to manage server infrastructure. Over time, Ansible use cases have evolved into managing cloud, network, security, and edge verticals, and maybe it's time to revisit the guidelines.

That is definitely true, but there's a big difference between discussing whether to change guidelines before starting to ignore the guidelines large-scale and discussing whether to change guidelines after starting to ignore them large-scale. Instead of being at a point where we can discuss whether it makes sense to adjust the resource module specification to better fit with the existing guidelines (which could also involve adjusting the guidelines), we are now in a situation where we basically have to say either yes to the complete resource module specification, or say no to it. This really sucks and wastes a lot of time for everyone, and is extremely frustrating.

justjais commented 2 years ago

we are now in a situation where we basically have to say either yes to the complete resource module specification, or say no to it

If a Yes lets the resource module design coexist with the existing present/absent and facts/info modules, giving the module creator the option to go with either of the two, and in turn gives users and the community the best possible way of configuring objects, be it in network, security, or any other domain where it suits the use case, I don't understand why there's a need to take a hard line on the specification and limit the design to just one of the two.

ptoal commented 2 years ago

Instead of being at a point where we can discuss whether it makes sense to adjust the resource module specification to better fit with the existing guidelines (which could also involve adjusting the guidelines), we are now in a situation where we basically have to say either yes to the complete resource module specification, or say no to it. This really sucks and wastes a lot of time for everyone, and is extremely frustrating.

First, apologies for the reddit term, ELI5. "Explain it like I'm five years old".

Based on the discussion, I think the ELI5 message is:

"Resource modules have implemented a pattern which is contrary to the guidance of the project. This has created two ways of gathering information from resources; the '_info module' pattern, and the 'state: gathered' pattern. The Resource Module pattern was accepted into the project years ago, and is now in widespread use by end-users (mostly in the networking discipline). Removing those modules now would negatively impact those users. Allowing more modules that implement the Resource Module pattern would create ambiguity that could cause confusion for other users."

Damned if you do, damned if you don't. I understand the frustration.

This discussion leads me to a question (which I'm going to ask even though it may make me unpopular): Is there a reason (other than a historical decision) for having separate _info modules? It feels strange to me that all operations except "get" operations are carried out by one module. Especially since the module that makes changes has to "get" state anyway.

Personally, I prefer the Resource Module pattern over the _info or _facts pattern, because I like the idea of having one entity managed by one module, and I find the creation of "read-only" modules creates a lot of noise when I'm looking through docs, trying to find the module I need (eg: the azure modules).

jillr commented 2 years ago

@ptoal

This discussion leads me to a question (which I'm going to ask even though it may make me unpopular): Is there a reason (other than a historical decision) for having separate _info modules? It feels strange to me that all operations except "get" operations are carried out by one module. Especially since the module that makes changes has to "get" state anyway.

I may be wrong but I've always looked at it as something like: _info modules perform Read operations and no others. A user should have complete confidence that running an _info module will never perform any Create, Update, Delete (CUD) operations. _info modules are for safely gathering information for the purpose of being used elsewhere (an assertion, another task, etc). A non-info module should be expected to make some CUD action unless run in check_mode, and needs extra safety (like using check_mode) when testing or validating playbooks.
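
A small plain-Python sketch of that contract, with hypothetical function names and a plain dict standing in for the managed resource (this is not the Ansible module API, and `check_mode` is modelled as an ordinary parameter):

```python
def run_info(current):
    """Hypothetical _info-style function: read-only by construction."""
    return {"changed": False, "info": dict(current)}


def run_module(desired, current, check_mode=False):
    """Hypothetical write-side function; returns (result, resulting_state)."""
    if desired == current:
        # Already in the desired state: nothing to create, update or delete.
        return {"changed": False}, current
    if check_mode:
        # Report that a change would happen, but leave the resource untouched.
        return {"changed": True}, current
    return {"changed": True}, desired
```

The read-only function has no code path that can modify anything, whereas the write-side function is only safe to run for validation when `check_mode` is set.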

I can appreciate that 5 years ago the community might not have agreed to the resource module pattern because it violates the principles discussed here. But that ship has sailed at this point. There were years for people to hash out concerns over the resource module pattern.

Y'all, we built something and people are using it. Some of them are using it in ways that maybe we didn't anticipate, and maybe even ways we don't always like. But they're using our software to do awesome stuff. It's not like this proposal is to throw away all semblance of caring about what modules or collections look like, it's a proposal to formalize acceptance of an already established and well articulated additional specification. Let's help our users and community keep doing awesome new things with Ansible.