pytest-dev / pytest-testinfra

Testinfra test your infrastructures
https://testinfra.readthedocs.io
Apache License 2.0

ansible backend vs default backend (ssh?) -- ansible too slow? #56

Closed emayssat-ms closed 8 years ago

emayssat-ms commented 8 years ago

Here is the performance I see for 2 simple tests on different backends. It seems that ansible is much, much, much slower! Admittedly ansible needs more time to update its inventory, but the tasks themselves are slow, as if it were connecting to the machine for every single test.

Here is some sample output: 1/ a run with the IPs provided, 2/ three runs with the ansible backend.

<!> Also, the output is not the same, for some unknown reason.

$ testinfra --hosts=ubuntu@52.8.21.124,ubuntu@54.67.33.158,ubuntu@52.9.29.73,ubuntu@54.67.6.175,ubuntu@54.67.81.102 main.py
================================================= test session starts =================================================
platform linux2 -- Python 2.7.6, pytest-2.8.5, py-1.4.31, pluggy-0.3.1
rootdir: /home/emayssat/workspaces/emayssat/devtools-vpc/devops/aws/release/tests, inifile:
plugins: testinfra-1.0.0.0a21
collected 10 items

main.py ..........

============================================== 10 passed in 7.62 seconds ==============================================
(aws)~/workspaces/emayssat/devtools-vpc/devops/aws/release/tests
emayssat@xps13 $ make execute_tests AWS_PROFILE=7009
testinfra -v --connection=ansible --ansible-inventory=inventories/ec2_7009 --hosts=key_Release001 main.py
================================================= test session starts =================================================
platform linux2 -- Python 2.7.6, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- /home/emayssat/.virtualenvs/aws/bin/python
cachedir: .cache
rootdir: /home/emayssat/workspaces/emayssat/devtools-vpc/devops/aws/release/tests, inifile:
plugins: testinfra-1.0.0.0a21
collected 10 items

main.py::test_passwd_file[ansible://52.8.21.124] PASSED
main.py::test_surrogate_file[ansible://52.8.21.124] PASSED
main.py::test_passwd_file[ansible://54.67.33.158] PASSED
main.py::test_surrogate_file[ansible://54.67.33.158] PASSED
main.py::test_passwd_file[ansible://52.9.29.73] PASSED
main.py::test_surrogate_file[ansible://52.9.29.73] PASSED
main.py::test_passwd_file[ansible://54.67.6.175] PASSED
main.py::test_surrogate_file[ansible://54.67.6.175] PASSED
main.py::test_passwd_file[ansible://54.67.81.102] PASSED
main.py::test_surrogate_file[ansible://54.67.81.102] PASSED

============================================= 10 passed in 82.07 seconds ==============================================
(aws)~/workspaces/emayssat/devtools-vpc/devops/aws/release/tests
emayssat@xps13 $ make execute_tests AWS_PROFILE=7009
testinfra -v --connection=ansible --ansible-inventory=inventories/ec2_7009 --hosts=key_Release001 main.py
================================================= test session starts =================================================
platform linux2 -- Python 2.7.6, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- /home/emayssat/.virtualenvs/aws/bin/python
cachedir: .cache
rootdir: /home/emayssat/workspaces/emayssat/devtools-vpc/devops/aws/release/tests, inifile:
plugins: testinfra-1.0.0.0a21
collected 10 items

main.py::test_passwd_file[ansible://52.8.21.124] PASSED
main.py::test_surrogate_file[ansible://52.8.21.124] PASSED
main.py::test_passwd_file[ansible://54.67.33.158] PASSED
main.py::test_surrogate_file[ansible://54.67.33.158] PASSED
main.py::test_passwd_file[ansible://52.9.29.73] PASSED
main.py::test_surrogate_file[ansible://52.9.29.73] PASSED
main.py::test_passwd_file[ansible://54.67.6.175] PASSED
main.py::test_surrogate_file[ansible://54.67.6.175] PASSED
main.py::test_passwd_file[ansible://54.67.81.102] PASSED
main.py::test_surrogate_file[ansible://54.67.81.102] PASSED

============================================= 10 passed in 47.62 seconds ==============================================
(aws)~/workspaces/emayssat/devtools-vpc/devops/aws/release/tests
emayssat@xps13 $ make execute_tests AWS_PROFILE=7009
testinfra -v --connection=ansible --ansible-inventory=inventories/ec2_7009 --hosts=key_Release001 main.py
================================================= test session starts =================================================
platform linux2 -- Python 2.7.6, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- /home/emayssat/.virtualenvs/aws/bin/python
cachedir: .cache
rootdir: /home/emayssat/workspaces/emayssat/devtools-vpc/devops/aws/release/tests, inifile:
plugins: testinfra-1.0.0.0a21
collected 10 items

main.py::test_passwd_file[ansible://52.8.21.124] PASSED
main.py::test_surrogate_file[ansible://52.8.21.124] PASSED
main.py::test_passwd_file[ansible://54.67.33.158] PASSED
main.py::test_surrogate_file[ansible://54.67.33.158] PASSED
main.py::test_passwd_file[ansible://52.9.29.73] PASSED
main.py::test_surrogate_file[ansible://52.9.29.73] PASSED
main.py::test_passwd_file[ansible://54.67.6.175] PASSED
main.py::test_surrogate_file[ansible://54.67.6.175] PASSED
main.py::test_passwd_file[ansible://54.67.81.102] PASSED
main.py::test_surrogate_file[ansible://54.67.81.102] PASSED

============================================= 10 passed in 54.28 seconds ==============================================
(aws)~/workspaces/emayssat/devtools-vpc/devops/aws/release/tests

philpep commented 8 years ago

Hi, indeed the ansible backend is the slowest because ansible itself is slow (it needs 2 or 3 ssh/scp calls to run a single command).

A common way to speed up ansible is to use a persistent ssh connection. For example, here is the relevant part of my ansible.cfg:

[defaults]
transport=ssh

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ControlPath=~/.ssh/ansible-ssh-%h-%p-%r
pipelining=True

Another way to run tests faster is parallel execution with pytest-xdist: https://testinfra.readthedocs.org/en/latest/invocation.html#parallel-execution

muffl0n commented 8 years ago

Afaik it should be enough to just set

pipelining=True

I use it in all of my playbooks and it works pretty well. Ansible sets the ssh args automatically. No need to specify them.

doertedev commented 8 years ago

Somehow Codesearch doesn't really show me where to define the ansible config location. The Testinfra CLI params define an inventory, the ssh config, but the ansible config?

philpep commented 8 years ago

From the ansible documentation https://docs.ansible.com/ansible/intro_configuration.html

Changes can be made and used in a configuration file which will be processed in the following order:

* ANSIBLE_CONFIG (an environment variable)
* ansible.cfg (in the current directory)
* .ansible.cfg (in the home directory)
* /etc/ansible/ansible.cfg
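
The lookup order quoted above can be sketched in a few lines of Python. This is only an illustration of the documented order (first file found wins), not Ansible's own lookup code; the function name `find_ansible_config` and its parameters are made up for the example.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_ansible_config(
    env: Mapping[str, str], cwd: Path, home: Path
) -> Optional[Path]:
    """Return the first existing config file, in the documented order.

    Hypothetical helper for illustration; it mimics the order from the
    Ansible documentation and is not part of Ansible or testinfra.
    """
    candidates = []
    # 1. ANSIBLE_CONFIG environment variable, if set
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(Path(env["ANSIBLE_CONFIG"]))
    # 2. ansible.cfg in the current directory
    candidates.append(cwd / "ansible.cfg")
    # 3. .ansible.cfg in the home directory
    candidates.append(home / ".ansible.cfg")
    # 4. system-wide file
    candidates.append(Path("/etc/ansible/ansible.cfg"))

    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # Show which file would be picked up from the current shell.
    print(find_ansible_config(os.environ, Path.cwd(), Path.home()))
```

So, assuming the ansible backend follows Ansible's normal lookup, exporting ANSIBLE_CONFIG before calling testinfra, or running testinfra from the directory that holds ansible.cfg, should be enough to select the config.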

doertedev commented 8 years ago

:+1: thanks a bunch!

retr0h commented 8 years ago

I also recommend against the following pattern when using the ansible backend. It creates a connection and executes crm_resource list for every entry of the parametrized list. In this case, it's faster to run crm_resource list once and assert that the result is what I expect.

I was unaware of this until recently, and thought I would share. This small change sped up our tests by 2/3rds.

import re

import pytest


# Anti-pattern: every parametrized case runs `crm_resource list` again,
# so the ansible backend connects once per entry.
@pytest.mark.parametrize('resource_group, resource', [
    ('admin-gateway-group', 'admin_gateway_ip'),
    ('admin-gateway-group', 'admin_snat_ip'),
    ('repository-vip-group', 'repository_vip'),
    ('haproxy-vip-group', 'haproxy_service_network_vip'),
    ('haproxy-vip-group', 'haproxy_admin_network_vip'),
    ('rsyslog-vip-group', 'rsyslog_vip'),
    ('percona-vip-group', 'percona_vip'),
    ('memcache-vip-group', 'memcache_vip'),
    ('neutron-dhcp-agent-group', 'neutron-dhcp-agent'),
    ('zabbix-proxy-vip-group', 'zabbix_proxy_vip')
])
def test_crm_resource_groups(Command, resource_group, resource):
    cmd = 'crm_resource list'
    out = Command.check_output(cmd)

    assert re.search(
        r'Resource Group: {}.*{}.*Started'.format(resource_group, resource),
        out, re.DOTALL)

philpep commented 8 years ago

@retr0h in this case you could write a module-scoped fixture that returns the output of crm_resource list, and keep the parametrized test:

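import re

import pytest


# Module-scoped: `crm_resource list` runs once per test module instead of
# once per parametrized case, and every case reuses the output.
@pytest.fixture(scope="module")
def crm_resource_list(Command):
    return Command.check_output("crm_resource list")


# Same (resource_group, resource) pairs as in the test above.
@pytest.mark.parametrize('resource_group, resource', [
...
])
def test_crm_resource_groups(crm_resource_list, resource_group, resource):
    assert re.search(
        r'Resource Group: {}.*{}.*Started'.format(resource_group, resource),
        crm_resource_list, re.DOTALL)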

philpep commented 8 years ago

I'm keeping this issue open because the ansible backend is still too slow, even with ansible.cfg tuning. I need to dig into the ansible code to see what happens and what we can do.

retr0h commented 8 years ago

@philpep please keep us posted. My tests are dreadfully slow with testinfra when using the ansible connection. The test below runs across 3 targets and took 30+ seconds to complete. I have many test files using the Ansible fixture, which only seems to make it worse.

def test_variables(Ansible):
    az = Ansible('debug', "msg={{ availability_zone }}")['msg']
    local = Ansible('debug', "msg={{ cinder_local }}")['msg']
    variables = Ansible.get_variables()

Executed with:

[jodewey:~/git/ansible-systems] [venv](+113/-96)+ 2 ± time testinfra -n 3 -v --connection=ansible --ansible-inventory $ANSIBLE_INVENTORY roles/integration_tests/tests/integration/test_keystone.py
==================================================================== test session starts =====================================================================
platform darwin -- Python 2.7.11, pytest-2.9.1, py-1.4.31, pluggy-0.3.1 -- /Users/jodewey/git/ansible-systems/venv/bin/python
cachedir: .cache
rootdir: /Users/jodewey/git/ansible-systems, inifile:
plugins: xdist-1.14, testinfra-1.2.0
[gw0] darwin Python 2.7.11 cwd: /Users/jodewey/git/ansible-systems
[gw1] darwin Python 2.7.11 cwd: /Users/jodewey/git/ansible-systems
[gw2] darwin Python 2.7.11 cwd: /Users/jodewey/git/ansible-systems
[gw0] Python 2.7.11 (default, Apr 17 2016, 16:22:05)  -- [GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)]
[gw1] Python 2.7.11 (default, Apr 17 2016, 16:22:05)  -- [GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)]
[gw2] Python 2.7.11 (default, Apr 17 2016, 16:22:05)  -- [GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)]
gw0 [1] / gw1 [1] / gw2 [1]
scheduling tests via LoadScheduling

test_keystone.py::test_variables[ansible://mcp1.paslab016005.mc.metacloud.in]
[gw0] PASSED test_keystone.py::test_variables[ansible://mcp1.paslab016005.mc.metacloud.in]

================================================================= 1 passed in 29.52 seconds ==================================================================
testinfra -n 3 -v --connection=ansible --ansible-inventory $ANSIBLE_INVENTORY  38.53s user 1.15s system 132% cpu 29.921 total

retr0h commented 8 years ago

So we use a custom vars plugin similar to ursula's.

That accounts for some of the slowness, but I think we have a larger issue. When running the test from the report above, the inventory is loaded for every host, for every test. I updated the plugin to print the name of the host the facts are being gathered for. You will see that testinfra gathers facts again for each test.

$ time testinfra -vs --connection=ansible --ansible-inventory $ANSIBLE_INVENTORY roles/integration_tests/tests/integration/test_keystone.py

platform darwin -- Python 2.7.11, pytest-2.9.1, py-1.4.31, pluggy-0.3.1 -- /Users/jodewey/git/ansible-systems/venv/bin/python
cachedir: .cache
rootdir: /Users/jodewey/git/ansible-systems, inifile:
plugins: xdist-1.14, testinfra-1.2.0
collecting 0 items
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3
collected 1 items

roles/integration_tests/tests/integration/test_keystone.py::test_variables[ansible://mcp1]
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3

Enabled 2 more hosts to run the test against.

roles/cisco.metapod_integration_tests/tests/integration/test_keystone.py::test_variables[ansible://mcp1]
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3

roles/integration_tests/tests/integration/test_keystone.py::test_variables[ansible://mcp2]
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3

roles/integration_tests/tests/integration/test_keystone.py::test_variables[ansible://mcp3] 
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3
mcp1
mcp2
mcp3
mhv1
mhv2
mhv3

So the more tests we write, and the more hosts we target, the slower it will get. Seems like we should gather once.
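
The "gather once" idea can be illustrated with a cached loader. This is a generic memoization sketch, not how testinfra or Ansible is implemented; `load_inventory`, `hosts_for_test`, the `CALLS` counter, and the path `inventories/example` are hypothetical names for the example.

```python
import functools

# Counts how often the expensive load really runs (for the demonstration).
CALLS = {"count": 0}


@functools.lru_cache(maxsize=None)
def load_inventory(inventory_path):
    """Stand-in for an expensive inventory and vars-plugin load."""
    CALLS["count"] += 1
    # Pretend the inventory was parsed and vars were gathered for every host.
    return ("mcp1", "mcp2", "mcp3", "mhv1", "mhv2", "mhv3")


def hosts_for_test(inventory_path):
    # Every test asks for the inventory; only the first call pays for it.
    return load_inventory(inventory_path)


if __name__ == "__main__":
    for _ in range(4):
        hosts_for_test("inventories/example")
    print("expensive loads:", CALLS["count"])
```

Note that such a cache lives in one process; with pytest-xdist each worker is a separate process and would still load the inventory once on its own.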

philpep commented 8 years ago

@retr0h interesting! I didn't catch the facts gathering... I saw multiple other bugs in the ansible connection, especially with ansible 2 (like ignoring ansible.cfg, pipelining and so on). I tried to get them fixed in #112; there is still some work to do, and I'll continue as soon as I can. Btw, I'll ask for your review on it before merging because there may be behavior changes.

retr0h commented 8 years ago

FYI - with the patch from #112, my test suite has improved significantly.

It went from 29 passed in 598.15 seconds to 29 passed in 106.27 seconds.
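
To put those two timings in perspective, a quick calculation (variable names are only for the example):

```python
before_s = 598.15  # 29 tests, before the patch
after_s = 106.27   # 29 tests, with the patch from #112
tests = 29

speedup = before_s / after_s
reduction = 1 - after_s / before_s

print(f"speedup: {speedup:.1f}x")
print(f"runtime reduction: {reduction:.0%}")
print(f"per test: {before_s / tests:.1f}s -> {after_s / tests:.1f}s")
```

That works out to roughly a 5.6x speedup, or about 82% less wall-clock time for the same 29 tests.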

I would be in favor of cutting a release sooner rather than later, simply for speed's sake 😄 👍 🍰

philpep commented 8 years ago

Now that #112 is released with a major performance improvement, I'm closing this one. I opened #116 about fact gathering.