linux-system-roles / vpn

Role for managing VPN/IPSec
https://linux-system-roles.github.io/vpn/
MIT License

Must use hostname: host under each host in hosts or strange errors #14

Closed: richm closed this issue 3 years ago

richm commented 3 years ago

An inventory like this will cause errors:

all:
  hosts:
    host1.example.com:
      ansible_host: 10.0.76.221
    host2.example.com:
      ansible_host: 10.0.76.107
    host3.example.com:
      ansible_host: 10.0.76.242

  vars:
    vpn_connections:
      - hosts:
          host1.example.com:
          host2.example.com:
          host3.example.com:

This will generate ipsec conf files with a mix of hostnames and IP addresses, e.g.

conn host2.example.com-to-host1.example.com
  left=host2.example.com
  leftid=@host2.example.com
  right=10.0.76.221
  rightid=@10.0.76.221
  ikev2=insist
  auto=start
  rekey=yes
  authby=secret

and a secrets file like this:

@host2.example.com @host1.example.com : PSK "some value"

This causes some really strange, inscrutable errors in systemctl status ipsec. You will see something in the output of ipsec whack --status but absolutely nothing in the output of ipsec whack --trafficstatus. I think it is because right/rightid is an IP address in the .conf file, while the .secrets file has the FQDN.
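
The suspected mismatch can be checked mechanically. The following is a minimal Python sketch, not part of the role: it pulls the leftid/rightid values out of a conn stanza and reports any that are not listed on the secrets line. The helper names conn_ids and secret_ids are hypothetical, and the comparison is a deliberately simplified string match rather than libreswan's actual secret lookup.

```python
def conn_ids(conf_text):
    """Return the leftid/rightid values found in an ipsec conn stanza."""
    ids = {}
    for line in conf_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key in ("leftid", "rightid"):
            ids[key] = value
    return ids


def secret_ids(secrets_line):
    """Return the set of IDs listed before the ' : ' on a secrets line."""
    id_part = secrets_line.split(" : ", 1)[0]
    return set(id_part.split())


# Sample data copied from the generated files quoted in this issue.
CONF = """\
conn host2.example.com-to-host1.example.com
  left=host2.example.com
  leftid=@host2.example.com
  right=10.0.76.221
  rightid=@10.0.76.221
"""
SECRETS = '@host2.example.com @host1.example.com : PSK "some value"'

# IDs used by the connection that have no entry on the secrets line.
missing = sorted(set(conn_ids(CONF).values()) - secret_ids(SECRETS))
print(missing)
```

With the files from this report, the only ID left over is the IP-based rightid, which is consistent with the explanation above.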

You have to force it to use the FQDN everywhere by redundantly specifying hostname: host, like this:

  vars:
    vpn_connections:
      - hosts:
          host1.example.com:
            hostname: host1.example.com
          host2.example.com:
            hostname: host2.example.com
          host3.example.com:
            hostname: host3.example.com

I think using the hostname here shouldn't be necessary. But perhaps the problem is that there is no way to know in

      - hosts:
          host1.example.com:

if host1.example.com here is usable as a right/rightid? If that is the problem, then maybe it could be fixed by always using the hosts key for left/leftid, and the IP address for right/rightid, in both the .conf file and the .secrets file?

letoams commented 3 years ago

If you have working hostnames, then I would say to prefer those and put them in. But if the sysadmin makes up the hostnames here and the hosts cannot resolve them, then you will have a problem. Perhaps the easiest workaround is to add the secrets entry with both hostname and IP? E.g., it could have generated this:

10.0.76.107 @host2.example.com 10.0.76.221 @host1.example.com : PSK "some value"

This line will be matched if rightid=@host2.example.com or if rightid=10.0.76.107
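
A rough Python sketch of how such a line could be assembled is shown here. The function secrets_line and its parameters are hypothetical and are not taken from the role; the output format simply copies the example line above.

```python
def secrets_line(peers, psk):
    """Build one PSK entry listing each peer's IP address and @hostname.

    peers is a list of (ip_address, hostname) pairs.
    Note: the PSK is inserted as-is; quoting/escaping is not handled here.
    """
    ids = []
    for ip_address, hostname in peers:
        ids.append(ip_address)
        ids.append("@" + hostname)
    return '{} : PSK "{}"'.format(" ".join(ids), psk)


line = secrets_line(
    [("10.0.76.107", "host2.example.com"), ("10.0.76.221", "host1.example.com")],
    "some value",
)
print(line)
```

Listing both forms for each peer means the entry still matches whichever value ends up in leftid/rightid.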

mprovenc commented 3 years ago

@richm to your point, it seems to me that in this example another potential issue is that the role shouldn't be using host2.example.com in the leftid field since it is not necessarily a resolvable FQDN on the target (it's just an ansible alias). This problem goes away if we want to make the assumption that these ansible aliases given in the inventory are resolvable FQDNs. Is this a viable assumption? If that is the case, then I feel that the fix for all of the above issues could be as easy as changing lines 19-23 in the .conf template file. Currently these lines prefer the ansible_host value over the host alias:

{%     else %}
  right={{ hostvars[item].ansible_host | d(item) }}
{%       if tunnel.auth_method == 'psk' %}
  rightid=@{{ hostvars[item].ansible_host | d(item) }}
{%       endif %}
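
For readers less familiar with Jinja2: d() is the short alias of the default filter, which substitutes its argument when the value on the left is undefined. A rough Python equivalent of the right= expression follows, as a sketch only. The helper name current_right is hypothetical, hostvars is modelled as a plain dict, and host4.example.com is an invented host used to show the fallback.

```python
def current_right(hostvars, item):
    """Mimic `hostvars[item].ansible_host | d(item)` with plain dicts."""
    return hostvars[item].get("ansible_host", item)


hostvars = {
    "host1.example.com": {"ansible_host": "10.0.76.221"},
    "host4.example.com": {},  # hypothetical host without ansible_host
}
print(current_right(hostvars, "host1.example.com"))
print(current_right(hostvars, "host4.example.com"))
```

The first call returns the IP address because ansible_host is set, which is how right=10.0.76.221 ends up in the generated conf file.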

However, if we assume that the host alias is a resolvable FQDN on the target machines, we could just always take the alias. This would produce a conf file like this:

conn host2.example.com-to-host1.example.com
  left=host2.example.com
  leftid=@host2.example.com
  right=host1.example.com
  rightid=@host1.example.com
  ikev2=insist
  auto=start
  rekey=yes
  authby=secret

What Paul shared is definitely useful information as well, but I think once we know exactly what we are putting into the rightid field of the config file, it will become clearer what changes, if any, need to be applied to the secrets file.

richm commented 3 years ago

> This problem goes away if we want to make the assumption that these ansible aliases given in the inventory are resolvable FQDNs. Is this a viable assumption?

No. As described at https://docs.ansible.com/ansible/2.9/user_guide/intro_inventory.html, it is acceptable for the hosts in the hosts list to be aliases:

all:
  hosts:
    alias_for_host1:
      ansible_host: 10.0.76.221
    alias_for_host2:
      ansible_host: 10.0.76.107
    alias_for_host3:
      ansible_host: 10.0.76.242

I don't know how common it is to do something like this, but I have seen this sort of thing done in the old openshift-ansible inventories.

So we can't rely on the host key in the hosts list being a real FQDN. I guess we will need to document something along the lines of "If the host key in the hosts list of your inventory is not the FQDN you want to use, you must use the hostname field under each host in the vpn_connections hosts list to specify the actual FQDN or IP address you want the vpn role to use to set up the tunnel. If you do not specify hostname, then the role will use ansible_host if defined, or the host key in your hosts list if neither ansible_host nor hostname is defined."

We can assume ansible_host is a valid FQDN or IP address, but note that it might also differ from the actual FQDN or IP address you want the vpn role to use. The ansible_host is the FQDN or IP address to use for that node from the controller machine where you are running Ansible. If you are managing hosts in a cloud, Ansible might have to use an ephemeral external hostname or IP address in the ansible_host field to manage those hosts, but that hostname/IP address might not resolve correctly when the machines in the cloud talk to each other.

I guess what I'm saying is that it is OK to use ansible_host, and then the host key in the hosts list, as fallbacks if hostname is not defined, but this may give strange errors.
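
The fallback order described here can be sketched in a few lines of Python. This is only an illustration of the proposed behaviour, not the role's code: the function vpn_address and its parameter names are hypothetical, and the per-host entry under vpn_connections is assumed to arrive as a dict, or as None when the YAML key has no value.

```python
def vpn_address(host_key, conn_host_vars, inventory_vars):
    """Pick the name or address to use for one host in a tunnel.

    Order of preference: explicit hostname under vpn_connections,
    then ansible_host from the inventory, then the inventory host key.
    """
    # A bare "host1.example.com:" entry in YAML loads as None.
    hostname = (conn_host_vars or {}).get("hostname")
    if hostname:
        return hostname
    return inventory_vars.get("ansible_host", host_key)


# hostname given: it wins over ansible_host and the alias.
print(vpn_address("alias_for_host1",
                  {"hostname": "host1.example.com"},
                  {"ansible_host": "10.0.76.221"}))
# no hostname: fall back to ansible_host.
print(vpn_address("alias_for_host1", None, {"ansible_host": "10.0.76.221"}))
# neither defined: fall back to the host key itself.
print(vpn_address("host1.example.com", None, {}))
```

The last two cases are the ones that may give strange errors, since neither ansible_host nor an alias is guaranteed to resolve between the managed hosts.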

mprovenc commented 3 years ago

That solution sounds good to me. Do we want the connection name itself to reflect the same value that is in left/leftid and right/rightid (e.g. "conn 10.0.76.221-to-host2.example.com" if ansible_host is defined for the first host but neither ansible_host nor hostname is defined for the second host) or should the host key always be used? I think the implementation is a little inconsistent right now with how it's naming connections.

richm commented 3 years ago

> That solution sounds good to me. Do we want the connection name itself to reflect the same value that is in left/leftid and right/rightid (e.g. "conn 10.0.76.221-to-host2.example.com" if ansible_host is defined for the first host but neither ansible_host nor hostname is defined for the second host) or should the host key always be used? I think the implementation is a little inconsistent right now with how it's naming connections.

I think it should be "conn $leftid-$rightid" to be consistent - then the actual configuration will match the "conn" line which will also match what's in the .secrets file. But perhaps @letoams has a better suggestion.
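
As a sketch of what that consistency would look like, the Python below derives the conn name and the left/right fields from the same two values. The function conn_stanza is hypothetical. The "-to-" separator and the "@" prefix on the IDs are copied from the generated files quoted earlier in this issue, so treat both as assumptions about the final naming scheme.

```python
def conn_stanza(left, right):
    """Render a PSK conn stanza whose name is derived from left and right."""
    return "\n".join([
        "conn {}-to-{}".format(left, right),
        "  left={}".format(left),
        "  leftid=@{}".format(left),
        "  right={}".format(right),
        "  rightid=@{}".format(right),
    ])


# The mixed case from the question: an IP address on one side
# and a host key on the other.
stanza = conn_stanza("10.0.76.221", "host2.example.com")
print(stanza)
```

Because every field comes from the same inputs, the conn line can never disagree with leftid/rightid, and the same values could be reused when writing the .secrets entry.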

letoams commented 3 years ago

On Wed, 17 Feb 2021, Richard Megginson wrote:

> > That solution sounds good to me. Do we want the connection name itself to reflect the same value that is in left/leftid and right/rightid (e.g. "conn 10.0.76.221-to-host2.example.com" if ansible_host is defined for the first host but neither ansible_host nor hostname is defined for the second host) or should the host key always be used? I think the implementation is a little inconsistent right now with how it's naming connections.
>
> I think it should be "conn $leftid-$rightid" to be consistent - then the actual configuration will match the "conn" line which will also match what's in the .secrets file. But perhaps @letoams has a better suggestion.

This is really more up to the ansible common practises on these items. In general, hostnames are more meaningful to people than IPs, so using hostnames there is probably better.

Paul

mprovenc commented 3 years ago

> This is really more up to the ansible common practises on these items. In general, hostnames are more meaningful to people than IPs, so using hostnames there is probably better.

Okay, correct me if I'm wrong, but I think that doing it the way Rich suggested would satisfy that requirement since the order of preference for leftid/rightid would be (1) resolvable user-specified hostname (2) ansible_host (3) host key value.