Alfresco / alfresco-ansible-deployment

Ansible playbooks for deploying ACS
https://alfresco.github.io/alfresco-ansible-deployment/
Apache License 2.0
29 stars 33 forks source link

Limit the use of inventory_hostname #623

Closed morgan-patou closed 1 year ago

morgan-patou commented 1 year ago

Summary

inventory_hostname might not be the real hostname of the target host and might contain characters that aren't supported by hostnames or that can cause issues with path.

Details

As you know, inventory_hostname comes from the inventory and therefore, I believe it's a kind of free-text name, which can contain characters such as /. Recently, a lot of addition was done on the Java/PKI side and this variable is being used a lot more. In cases where such characters are being used in the inventory_hostname, then some of the commands added recently would fail.

For example:

        - name: Create private key for new certificate
          no_log: true
          become: true
          community.crypto.openssl_privatekey:
            path: /etc/pki/{{ inventory_hostname }}_{{ cert_key_type | default('') }}.key
            mode: 0600
            size: "{{ cert_key_size | default(omit) }}"
            type: "{{ cert_key_type | default(omit) }}"
            return_content: true
          register: srvkey

Because inventory_hostname is being used inside the value of a path, if this variable starts with infra/myhost, then this task would fail because the folder /etc/pki exists but not /etc/pki/infra/.

Hostname are following the regex [a-zA-Z0-9.-]* and therefore inventory_hostname might not be the real value.

For all the Search Replication role, you are also using inventory_hostname to define if the current host is a master or a slave with things such as {{ 'slave' if inventory_hostname == search_master else 'master' }} or {% if search_topology == 'replication' and inventory_hostname == search_master %}. I find these conditions to be quite strange, since that would mean that you are forcing the inventory_hostname to be exactly "search_master" no? Maybe I'm missing some things for the replication part.

Questions

Is it possible to change such occurrences of inventory_hostname so that it uses the real hostname which I believe would be more secure in terms of allowed characters? It could mean replacing most occurrences with ansible_hostname or ansible_facts['hostname'] for example (these two requires gather_facts: true). If the real hostname changes (not the inventory one), the SSL Certificate should be regenerated I assume, so using the inventory one might be a problem as well no?

Alternatively, what about using a simple hardcoded name for the PKI file instead of generating a name based on the host? Something like alf_server_cert for all the .p12, .keystore, etc... So that all hosts would have the same name for the PKI files. Maybe the customers could have their own files already present with the hostname, so the playbook could also have some issues with that. Using a dedicated name for this playbook could help with avoiding issues and making sure which files are playbook related and which ones aren't (created before/after by someone/something else).

--> If you agree to perform some changes related to this issue, I can help getting some work done & submit the associated PR.

gionn commented 1 year ago

Is it possible to change such occurrences of inventory_hostname so that it uses the real hostname which I believe would be more secure in terms of allowed characters? It could mean replacing most occurrences with ansible_hostname or ansible_facts['hostname'] for example (these two requires gather_facts: true). If the real hostname changes (not the inventory one), the SSL Certificate should be regenerated I assume, so using the inventory one might be a problem as well no?

I think indeed it's more appropriate to use ansible_hostname which is tied to the node fqdn and for pki I think it was intended to be referenced for this.

For all the Search Replication role, you are also using inventory_hostname to define if the current host is a master or a slave with things such as {{ 'slave' if inventory_hostname == search_master else 'master' }} or {% if search_topology == 'replication' and inventory_hostname == search_master %}

I don't have a firm grasp on this since is a quite new feature, I imagine the intention is to have the first node with search role to act as a master and all the other as slave, so no matter which is the attribute you use but needs to be unique/different for each node.

let's wait a bit for @alxgomz take on this, for me it's fine to try addressing this stuff in multiple PR (if possibile)

alxgomz commented 1 year ago

Hi @morgan-patou , thanks for the feedback.

Is it possible to change such occurrences of inventory_hostname...

I don't think we should address all occurrences of inventory_hostname blindly. First of all that would open a large field for regressions. Second, although the inventory_hostname variable is free form it's good practice to comply to valid variable names. It also common practice to use the host fqdn in the inventory (so you don't have to set all other properties. Also take a look at the fetch module. You'll see that the dest parameter actually leverage the inventory_hostname. I would also argue that using the "/" or similar separator (as I assume this is a separator) is here to create a hierarchy or something similar to tag. Both would be better achieved by using hosts vars or groups IMHO. Ultimately, we have some places where the inventory_hostname is used as a fallback in configuration in case the network setup of targets is not optimal (e.g. all fqdn names are not resolvable by all target nodes). For the specific case of the pki.yml playbook, we can easily deal with it by making sure the destination exists. Another would be to hash the inventory_hostname. Either way, that would indeed be a welcomed PR.

For all the Search Replication role, you are also using inventory_hostname

Regarding the search replication tasks, @gionn is right the idea is to avoid introducing yet another variable to define which is the solr master and instead rely on the ordering of the hosts within the search group.

Alternatively, what about using a simple hardcoded name for the PKI file instead of generating a name based on the host...

I don't think it's possible. We need to be able to map a p12 to its specific host. To conform to x509 security, and have a certificate that truly identifies a host, private key is generated on the target host and so is the signed cert request. So it is key that we can map the right p12 to the right host later on in the playbook.

morgan-patou commented 1 year ago

Is it possible to change such occurrences of inventory_hostname...

I don't think we should address all occurrences of inventory_hostname blindly.

Clearly, that's not the ask. There are places where the inventory_hostname appears to be necessary like inside the roles/repository/templates/*.j2 template files, but there are places where it's questionable.

I would also argue that using the "/" or similar separator (as I assume this is a separator) is here to create a hierarchy or something similar to tag. Both would be better achieved by using hosts vars or groups IMHO.

I believe sometimes you have to use a separator and you don't have any choice. If I'm not wrong, the inventory can only contain unique names, even if they are on different groups. So assuming you are trying to have 2 different hosts with the same name (because they are on 2 different infra / cloud provider / separated network), then you cannot have them both into your ansible inventory and therefore you need to resort to separators.

For all the Search Replication role, you are also using inventory_hostname

Regarding the search replication tasks, @gionn is right the idea is to avoid introducing yet another variable to define which is the solr master and instead rely on the ordering of the hosts within the search group.

It's fine to take the 1st host as master, that's not the problem. My concern was more with the conditions being used that seem to force the inventory_hostname to be exactly search_master. I don't see anywhere any definition/overwrite of the inventory_hostname, so I can only assume that you expect the inventory yml file to include a host with a name search_master (and here it would go against your recommendation to always have the inventory_hostname with the hostname/fqdn). It appears to me that the search group can contain several hosts and the 1st one would just be taken as master for the replication. Shouldn't all actions rely on the search_topology: if search group contains 1 host, then standalone, else replication (as it is done inside acs.yml). And in case it's replication, then 1st host master, else slave. I don't see the point/purpose of having conditions that rely on inventory_hostname being equal/different to search_master. I might be missing something, but to me it just means that this just doesn't work, unless you have a very fixed inventory file with this hostname exactly and that you cannot change.

Alternatively, what about using a simple hardcoded name for the PKI file instead of generating a name based on the host...

I don't think it's possible. We need to be able to map a p12 to its specific host. To conform to x509 security, and have a certificate that truly identifies a host, private key is generated on the target host and so is the signed cert request. So it is key that we can map the right p12 to the right host later on in the playbook.

Right, since you are generating the certificates on localhost and then pushing them to the different hosts, it would be difficult to know which file goes where.

alxgomz commented 1 year ago

I believe sometimes you have to use a separator and you don't have any choice. If I'm not wrong, the inventory can only contain unique names, even if they are on different groups. So assuming you are trying to have 2 different hosts with the same name (because they are on 2 different infra / cloud provider / separated network), then you cannot have them both into your ansible inventory and therefore you need to resort to separators.

That's a perfect example of why it's a common practice to use the fqdn) as the inventory name. You can't have 2 hosts wit hthe same hostname AND domain name.

Regarding the solr replication, I think there is a miss-understanding. We're not forcing anything name wise. The search_master is a role argument which can be given when calling the role, or otherwise defaults to the first host in the role is applied to. See: https://github.com/Alfresco/alfresco-ansible-deployment/blob/c74ac5af110c46bb2c6f44dbf146d27585fc9207/roles/search/meta/argument_specs.yml#L69

You'll see also that it's even posible to specify the master in case you can't follow the expected ordering

Regarding search_topology the playbook already behaves as you stated: if the groups has more than 1 hosts then topology set to replication otherwise standalone. This is done here: https://github.com/Alfresco/alfresco-ansible-deployment/blob/c74ac5af110c46bb2c6f44dbf146d27585fc9207/playbooks/acs.yml#L121

It's done outside of the sear role as I want to aim in a direction where the roles are inventory agnostic. We're still far from that - and it's going to be a long journey - but that's the direction.

morgan-patou commented 1 year ago

Regarding the solr replication, I think there is a miss-understanding. We're not forcing anything name wise. The search_master is a role argument which can be given when calling the role, or otherwise defaults to the first host in the role is applied to. See:

https://github.com/Alfresco/alfresco-ansible-deployment/blob/c74ac5af110c46bb2c6f44dbf146d27585fc9207/roles/search/meta/argument_specs.yml#L69

You'll see also that it's even posible to specify the master in case you can't follow the expected ordering

Right, I don't know why, I thought search_master was a value while it's clearly a variable/argument as you said, my bad.

Regarding search_topology the playbook already behaves as you stated: if the groups has more than 1 hosts then topology set to replication otherwise standalone. This is done here:

https://github.com/Alfresco/alfresco-ansible-deployment/blob/c74ac5af110c46bb2c6f44dbf146d27585fc9207/playbooks/acs.yml#L121

It's done outside of the sear role as I want to aim in a direction where the roles are inventory agnostic. We're still far from that - and it's going to be a long journey - but that's the direction.

Yep that's clear, I saw that definition in the acs.yml.

morgan-patou commented 1 year ago

Since I created the PR #635, I believe that this issue can be closed since there is nothing more to discuss normally.