michaelrigart / ansible-role-interfaces

An ansible role for configuring different network interfaces
GNU General Public License v3.0

Check active Ethernet interface state complaints for sub interfaces #35

Closed avadhanij closed 6 years ago

avadhanij commented 6 years ago

When adding sub-interfaces (eth1:1), most of the role works fine except for the last check task. I am new to Ansible in general, so I was wondering if I was missing something or if there is a way to recognize the sub interface.

itlinux commented 6 years ago

You may want to share your playbook if you need help.

avadhanij commented 6 years ago

Yep, here is a piece of the variable file for one of the hosts inside host_vars


device_name: unix1
interfaces_ether_interfaces:

When I run the playbook, I get this at the end -

RUNNING HANDLER [MichaelRigart.interfaces : Check active Ethernet interface state] ***
skipping: [unix1] => (item={'device': 'eth4', 'bootproto': 'static', 'address': '192.168.54.1', 'netmask': '255.255.255.0'}) => {"changed": false, "item": {"address": "192.168.54.1", "bootproto": "static", "device": "eth4", "netmask": "255.255.255.0"}, "skip_reason": "Conditional result was False"}

skipping: [unix1] => (item={'device': 'eth4.416', 'bootproto': 'static', 'address': '172.24.16.2', 'netmask': '255.255.255.0', 'route': [{'network': '172.23.16.1', 'netmask': '255.255.255.0', 'gateway': '172.24.16.1'}]}) => {"changed": false, "item": {"address": "172.24.16.2", "bootproto": "static", "device": "eth4.416", "netmask": "255.255.255.0", "route": [{"gateway": "172.24.16.1", "netmask": "255.255.255.0", "network": "172.23.16.1"}]}, "skip_reason": "Conditional result was False"}

failed: [unix1] (item={'device': 'eth4.416:1', 'bootproto': 'static', 'address': '172.24.16.11', 'netmask': '255.255.255.0'}) => {"changed": false, "item": {"address": "172.24.16.11", "bootproto": "static", "device": "eth4.416:1", "netmask": "255.255.255.0"}, "msg": "Interface eth4.416:1 does not exist"}

markgoddard commented 6 years ago

Hi @avadhanij, I've not tried subinterfaces with this role. Perhaps you could add the output of these commands:

ip link
ansible localhost -m setup

avadhanij commented 6 years ago

The setup info is huge, not sure I can/want to paste it all.... I am trying to use the role for automated setup and teardown of testbeds at my company.

Sub-interfaces aren't real interfaces, they are just a way of registering multiple IP addresses to one physical device.

Here is the ip link output:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:50:56:90:b0:a2 brd ff:ff:ff:ff:ff:ff
3: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:50:56:90:26:7a brd ff:ff:ff:ff:ff:ff
......
22: eth4.416@eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:50:56:90:26:7a brd ff:ff:ff:ff:ff:ff

Notice from the ifconfig output that the HWaddr is always the same.

eth4.416 Link encap:Ethernet HWaddr 00:50:56:90:26:7a
          inet addr:172.24.16.2 Bcast:172.24.16.255 Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe90:267a/64 Scope:Link
          inet6 addr: fd49:f9f5:ccb4:2acd::ac18:1002/120 Scope:Global
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:5273 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8142 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:267801 (267.8 KB) TX bytes:450638 (450.6 KB)

eth4.416:1 Link encap:Ethernet HWaddr 00:50:56:90:26:7a
          inet addr:172.24.16.11 Bcast:172.24.16.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth4.416:2 Link encap:Ethernet HWaddr 00:50:56:90:26:7a
          inet addr:172.24.16.12 Bcast:172.24.16.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth4.416:3 Link encap:Ethernet HWaddr 00:50:56:90:26:7a
          inet addr:172.24.16.13 Bcast:172.24.16.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

markgoddard commented 6 years ago

The information I was interested in from the Ansible facts is the name of the fact for the interface, as this is used to check the state of the interface in the failing task. Typically it's called ansible_<interface>, with dashes replaced with underscores. I'm not sure what it does with the colons.

Are you planning to fix this issue?

avadhanij commented 6 years ago

This is how the interfaces are showing up -

            "eth4.416",
            "eth0",
            "eth4.406",
            "eth4.416_3",
            "eth4.416_2",
            "eth4.416_1",
            "eth4.416_7",
            "eth4.416_6",
            "eth4.416_5",
            "eth4.416_4",
"ansible_eth4.416_1": {
            "ipv4_secondaries": [
                {
                    "address": "172.24.16.11",
                    "broadcast": "172.24.16.255",
                    "netmask": "255.255.255.0",
                    "network": "172.24.16.0"
                }
            ]
        },

"ansible_eth4.416_2": {
            "ipv4_secondaries": [
                {
                    "address": "172.24.16.12",
                    "broadcast": "172.24.16.255",
                    "netmask": "255.255.255.0",
                    "network": "172.24.16.0"
                }
            ]
        },

Are you planning to fix this issue?

That's a good question. I don't know yet. I would certainly like to help, but I am still learning, so, it might take time. But since I have the use case setup, it's easy for me to test. Is there an expected time frame?

markgoddard commented 6 years ago

Thanks for following up. I see that ansible has replaced the colon in the interface name with an underscore. Could you try this patch: https://github.com/stackhpc/ansible-role-interfaces/tree/issues/35.
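
For illustration, the naming seen in the facts above can be sketched in a few lines of Python. This is only a sketch of the observed behaviour, not Ansible's actual implementation: the function name fact_name is hypothetical, and it assumes that only dashes and colons are replaced while dots are kept (as in "ansible_eth4.416_1").

```python
def fact_name(device: str) -> str:
    """Return the assumed Ansible fact key for a network device name."""
    # Assumption: only '-' and ':' become '_'; '.' is left alone,
    # matching the "ansible_eth4.416_1" key shown in the setup output.
    return "ansible_" + device.replace("-", "_").replace(":", "_")


print(fact_name("eth4.416:1"))  # ansible_eth4.416_1
```

A check that looks the fact up under the raw device name (eth4.416:1) would not find it, which is consistent with the "does not exist" failure reported earlier.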

avadhanij commented 6 years ago

Sorry about the delay....and the previous comment. There seem to be some setup issues. After running your updated role, I am still getting a failure -

RUNNING HANDLER [MichaelRigart.interfaces : Check active Ethernet interface state] ***
skipping: [unix1] => (item={'device': 'eth4', 'bootproto': 'static', 'address': '192.168.54.1', 'netmask': '255.255.255.0'}) => {"changed": false, "item": {"address": "192.168.54.1", "bootproto": "static", "device": "eth4", "netmask": "255.255.255.0"}, "skip_reason": "Conditional result was False"}

skipping: [unix1] => (item={'device': 'eth4.416', 'bootproto': 'static', 'address': '172.24.16.2', 'netmask': '255.255.255.0', 'route': [{'network': '172.23.16.1', 'netmask': '255.255.255.0', 'gateway': '172.24.16.1'}]}) => {"changed": false, "item": {"address": "172.24.16.2", "bootproto": "static", "device": "eth4.416", "netmask": "255.255.255.0", "route": [{"gateway": "172.24.16.1", "netmask": "255.255.255.0", "network": "172.23.16.1"}]}, "skip_reason": "Conditional result was False"}

fatal: [unix1]: FAILED! => {"msg": "The conditional check 'ether_check.diff' failed. The error was: 'KeyError' object has no attribute 'message'\n\nThe error appears to have been in '/Users/avadhanij/Ansible/testbed/roles/MichaelRigart.interfaces/handlers/main.yml': line 159, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Check active Ethernet interface state\n ^ here\n"}

markgoddard commented 6 years ago

@avadhanij please try the above branch again - it should now work.

markgoddard commented 6 years ago

@avadhanij I've created a PR, please comment: https://github.com/michaelrigart/ansible-role-interfaces/pull/36.

avadhanij commented 6 years ago

@markgoddard ....it still seems to fail in that task. I did a bit of digging, and it seems like the fact gathering is happening in an odd way. I am still learning about Ansible, so I hope this is helpful.

I ran ansible -m setup on one of the devices and saved the output. There I see that -

A traffic interface such as "ansible_eth4.416" is supposed to have an attribute called "ipv4", right? But the main address 172.24.16.2 is only listed inside "ipv4_secondaries", like this -

 "ipv4_secondaries": [
            {
                "address": "172.24.16.14",
                "broadcast": "172.24.16.255",
                "netmask": "255.255.255.0",
                "network": "172.24.16.0"
            },
            {
                "address": "172.24.16.19",
                "broadcast": "172.24.16.255",
                "netmask": "255.255.255.0",
                "network": "172.24.16.0"
            },
            {
                "address": "172.24.16.2",
                "broadcast": "172.24.16.255",
                "netmask": "255.255.255.0",
                "network": "172.24.16.0"
            },

This is the "main" address because that's what I listed in the host_vars/unix1 file.

But for the management interface, eth0, ipv4 does exist and lists the correct address. Perhaps this is happening because there are secondary IPs for the traffic interface, and it's listing them all together?

But then the following makes no sense

    "ansible_eth4.416_2": {
        "ipv4": {
            "address": "172.24.16.12",
            "broadcast": "172.24.16.255",
            "netmask": "255.255.255.0",
            "network": "172.24.16.0"
        }
    },
    "ansible_eth4.416_3": {
        "ipv4_secondaries": [
            {
                "address": "172.24.16.13",
                "broadcast": "172.24.16.255",
                "netmask": "255.255.255.0",
                "network": "172.24.16.0"
            }
        ]
    },
    "ansible_eth4.416_4": {
        "ipv4_secondaries": [
            {
                "address": "172.24.16.14",
                "broadcast": "172.24.16.255",
                "netmask": "255.255.255.0",
                "network": "172.24.16.0"
            }
        ]
    }

Why is it gathering all the addresses as ipv4_secondaries, but only one as ipv4?

From what I can see in the filter _interface_check, I know that it's depending on the attribute called ipv4.address, and that's showing up as None because of this confusion.
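
A quick way to spot the affected interfaces in a saved setup dump is sketched below. The helper name facts_without_primary_ipv4 is hypothetical, and the sample dictionary is a trimmed copy of the facts shown above; it assumes the dump has already been loaded into a Python dict.

```python
def facts_without_primary_ipv4(facts: dict) -> list:
    """List interface facts that have secondary addresses but no 'ipv4' entry."""
    missing = []
    for key, value in facts.items():
        # Only look at per-interface facts, which are dictionaries.
        if not key.startswith("ansible_") or not isinstance(value, dict):
            continue
        if "ipv4_secondaries" in value and "ipv4" not in value:
            missing.append(key)
    return sorted(missing)


# Trimmed sample based on the setup output above.
sample = {
    "ansible_eth4.416_2": {"ipv4": {"address": "172.24.16.12"}},
    "ansible_eth4.416_3": {"ipv4_secondaries": [{"address": "172.24.16.13"}]},
    "ansible_eth4.416_4": {"ipv4_secondaries": [{"address": "172.24.16.14"}]},
}
print(facts_without_primary_ipv4(sample))
# ['ansible_eth4.416_3', 'ansible_eth4.416_4']
```

Any interface it reports would have no ipv4.address for a check to read.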

Edit:

The reason for failure is -

failed: [unix1] (item={'device': 'eth4.413:5', 'bootproto': 'static', 'address': '172.24.13.15', 'netmask': '255.255.255.0'}) => {"changed": false, "item": {"address": "172.24.13.15", "bootproto": "static", "device": "eth4.413:5", "netmask": "255.255.255.0"}, "msg": "Interface eth4.413:5 has no IPv4 address"}

It's important to note that none of this seems to impede the actual work being done, i.e., the interfaces come up exactly the way I want. It's only the checks that fail.

markgoddard commented 6 years ago

@avadhanij I tested my change on CentOS 7.4. I don't remember which version of ansible I used. Which OS and ansible version are you using?

Is this something you can finish off on your own?

avadhanij commented 6 years ago

I tested using Ansible 2.4.3, and then upgraded to 2.5, but I am still seeing the same. The OS on the target device is Ubuntu 14.04.

I guess I can finish it, but I wonder if this is a workaround for an actual bug in Ansible's setup module.

avadhanij commented 6 years ago

Quick update. I tested the setup module on an Ubuntu 16.04 machine, and it produces the output as expected.

"ipv4":{
            "address": "172.24.13.2",
            "broadcast": "172.24.13.255",
            "netmask": "255.255.255.0",
            "network": "172.24.13.0"
        },
        "ipv4_secondaries": [
            {
                "address": "172.24.13.17",
                "broadcast": "172.24.13.255",
                "netmask": "255.255.255.0",
                "network": "172.24.13.0"
            },
            {
                "address": "172.24.13.18",
                "broadcast": "172.24.13.255",
                "netmask": "255.255.255.0",
                "network": "172.24.13.0"
            },
            {
                "address": "172.24.13.19",
                "broadcast": "172.24.13.255",
                "netmask": "255.255.255.0",
                "network": "172.24.13.0"
 },

Which would mean that your code should work as expected. However, a tweak needed for the sub-interfaces is that the IP address needs to be pulled from "ipv4_secondaries" rather than "ipv4":

"ansible_eth4.413_18": {
        "ipv4_secondaries": [
            {
                "address": "172.24.13.18",
                "broadcast": "172.24.13.255",
                "netmask": "255.255.255.0",
                "network": "172.24.13.0"
            }
        ]
    }

I can't say for sure if it's the same behavior on CentOS. I also wonder why 14.04 is not reporting correctly.

markgoddard commented 6 years ago

Ok, I see something similar on Ubuntu 16.04:

sudo ip a add 10.0.0.42/24 dev wlp2s0
ansible localhost -m setup
"ansible_wlp2s0": {
            "ipv4": {
                "address": "192.168.1.123", 
                "broadcast": "192.168.1.255", 
                "netmask": "255.255.255.0", 
                "network": "192.168.1.0"
            }, 
            "ipv4_secondaries": [
                {
                    "address": "10.0.0.42", 
                    "broadcast": "global", 
                    "netmask": "255.255.255.0", 
                    "network": "10.0.0.0"
                }
            ],
...
}

ansible --version
ansible 2.3.0.0
  config file = /home/mark/src/kolla-ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.12 (default, Dec  4 2017, 14:50:18) [GCC 5.4.0 20160609]

This behaviour seems consistent on Ubuntu 16.04, tested back to ansible 2.0.

My PR was tested on CentOS 7 and expects there to be a fact for the subinterface, with the IP listed under ipv4 (not ipv4_secondaries as you see on Ubuntu 14.04).

Going back to CentOS 7, if I just do ip a add <IP> dev <dev> then I get the same facts as for Ubuntu 16.04:

        "ansible_eno3.45": {
            "ipv4": {
                "address": "10.45.253.102", 
                "broadcast": "10.45.255.255", 
                "netmask": "255.255.0.0", 
                "network": "10.45.0.0"
            }, 
            "ipv4_secondaries": [
                {
                    "address": "10.42.0.42", 
                    "broadcast": "global", 
                    "netmask": "255.255.255.0", 
                    "network": "10.42.0.0"
                }
            ],

If I add an ifcfg file for the interface (as this role does), then it gets its own fact:

        "ansible_eno3.45_1": {
            "ipv4": {
                "address": "10.40.0.42", 
                "broadcast": "10.40.0.255", 
                "netmask": "255.255.255.0", 
                "network": "10.40.0.0"
            }
        },

avadhanij commented 6 years ago

So...what is the way forward? Should OS based if conditions be introduced in the filter?

markgoddard commented 6 years ago

I think we can avoid OS conditions in the filter, but be more permissive about the format of the interface fact for subinterfaces. I think we've found 3 cases (so far):

  1. Subinterface has its own fact, with IP under "ipv4"
  2. Subinterface has its own fact, with IP under "ipv4_secondaries"
  3. Subinterface does not have its own fact, IP under "ipv4_secondaries" of the parent interface's fact

My proposed patch works for 1. Perhaps you could add support for 2 and 3?
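
The three cases above can be sketched as a single permissive lookup. This is a minimal sketch under stated assumptions, not the code from the PR or from the role's _interface_check filter: the function name find_subinterface_address is hypothetical, and it assumes that a sub-interface is written as <parent>:<n> and that dashes and colons in the device name become underscores in the fact key.

```python
def find_subinterface_address(facts, device, address):
    """Return 1, 2 or 3 for the case that matched, or None if no case did."""

    def name(dev):
        # Assumed fact naming: '-' and ':' are replaced with '_'.
        return "ansible_" + dev.replace("-", "_").replace(":", "_")

    def in_secondaries(fact):
        return any(
            entry.get("address") == address
            for entry in fact.get("ipv4_secondaries", [])
        )

    own = facts.get(name(device))
    if own is not None:
        # Case 1: own fact, IP under "ipv4".
        if own.get("ipv4", {}).get("address") == address:
            return 1
        # Case 2: own fact, IP under "ipv4_secondaries".
        if in_secondaries(own):
            return 2

    # Case 3: no matching own fact, IP under the parent's "ipv4_secondaries".
    parent = facts.get(name(device.split(":", 1)[0]))
    if parent is not None and in_secondaries(parent):
        return 3
    return None


facts = {"ansible_eth4.413_18": {"ipv4_secondaries": [{"address": "172.24.13.18"}]}}
print(find_subinterface_address(facts, "eth4.413:18", "172.24.13.18"))  # 2
```

Returning the case number is only for illustration; a real check would more likely return a boolean or raise an error with a message.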

avadhanij commented 6 years ago

I apologize for the delay, but I am currently waiting for approval from company legal to work on this and test on lab machines. That would give me access to more gear for testing than I would have on my own.

avadhanij commented 6 years ago

I was able to modify the filter to work with all three cases now. How should I push my code? I was working off the branch issues/35. Should I just commit to this branch and push?

markgoddard commented 6 years ago

Ok, great. If you can push to the branch then do that, otherwise fork and create a new PR.

avadhanij commented 6 years ago

I wasn't able to, so I forked your repo and submitted a PR. I hope I did it right.