napalm-automation / napalm-base

Apache License 2.0
33 stars 48 forks source link

Added recorder and an example (to be removed) #296

Closed dbarrosop closed 6 years ago

dbarrosop commented 6 years ago

This PR introduces a class that can sit between the driver and the underlying library (pyeapi, netmiko, pyez, etc. see napalm-automation/napalm-eos/pull/173).

The "middleware" has three modes:

  1. pass. Does nothing. The Recorder will just proxy the calls to the underlying library.
  2. record. The interactions between the driver and the underlying library are proxied and recorded.
  3. replay. The interactions between the driver and the underlying library are interrupted. Instead, the Recorder will pickle a previous recording and replay it.

Example, in the following snippet we are running the same code three times testing all modes. Note that in the "replay" mode at the end I set wrong data to prove that there is no connection with the real device.

(napalm) ➜  napalm-base git:(recorder) ✗ python test.py
{u'fqdn': u'localhost',
 u'hostname': u'localhost',
 u'interface_list': [u'Ethernet1', u'Ethernet2', u'Management1'],
 u'model': u'vEOS',
 u'os_version': u'4.15.2.1F-2759627.41521F',
 u'serial_number': u'',
 u'uptime': 4147,
 u'vendor': u'Arista'}
{u'Ethernet1': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0919724,
                u'mac_address': u'08:00:27:34:90:5B',
                u'speed': 0},
 u'Ethernet2': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0921075,
                u'mac_address': u'08:00:27:07:C3:F0',
                u'speed': 0},
 u'Management1': {u'description': u'',
                  u'is_enabled': True,
                  u'is_up': True,
                  u'last_flapped': 1502141358.069374,
                  u'mac_address': u'08:00:27:7D:44:C1',
                  u'speed': 1000}}
napalm-base - DEBUG - Recording run_commands
napalm-base - DEBUG - Recording run_commands
{u'fqdn': u'localhost',
 u'hostname': u'localhost',
 u'interface_list': [u'Ethernet1', u'Ethernet2', u'Management1'],
 u'model': u'vEOS',
 u'os_version': u'4.15.2.1F-2759627.41521F',
 u'serial_number': u'',
 u'uptime': 4147,
 u'vendor': u'Arista'}
napalm-base - DEBUG - Recording run_commands
{u'Ethernet1': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0919635,
                u'mac_address': u'08:00:27:34:90:5B',
                u'speed': 0},
 u'Ethernet2': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0921032,
                u'mac_address': u'08:00:27:07:C3:F0',
                u'speed': 0},
 u'Management1': {u'description': u'',
                  u'is_enabled': True,
                  u'is_up': True,
                  u'last_flapped': 1502141358.0693803,
                  u'mac_address': u'08:00:27:7D:44:C1',
                  u'speed': 1000}}
napalm-base - DEBUG - Replaying run_commands
napalm-base - DEBUG - Replaying run_commands
{u'fqdn': u'localhost',
 u'hostname': u'localhost',
 u'interface_list': [u'Ethernet1', u'Ethernet2', u'Management1'],
 u'model': u'vEOS',
 u'os_version': u'4.15.2.1F-2759627.41521F',
 u'serial_number': u'',
 u'uptime': 4147,
 u'vendor': u'Arista'}
napalm-base - DEBUG - Replaying run_commands
{u'Ethernet1': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0919635,
                u'mac_address': u'08:00:27:34:90:5B',
                u'speed': 0},
 u'Ethernet2': {u'description': u'',
                u'is_enabled': True,
                u'is_up': True,
                u'last_flapped': 1502141344.0921032,
                u'mac_address': u'08:00:27:07:C3:F0',
                u'speed': 0},
 u'Management1': {u'description': u'',
                  u'is_enabled': True,
                  u'is_up': True,
                  u'last_flapped': 1502141358.0693803,
                  u'mac_address': u'08:00:27:7D:44:C1',
                  u'speed': 1000}}

TODO:

  1. Make sure we can capture exceptions as well
  2. Decide if we want to use pickle or serialize/deserialize objects. 99% of the interactions are going to be via dicts, json or xml so it should be doable. Exceptions might be trickier with doable as well.
dbarrosop commented 6 years ago

@ubaumann @mirceaulinic @ktbyers thoughts?

ubaumann commented 6 years ago

With this approach we could address the 'more verbose logging' issue as well: https://github.com/napalm-automation/napalm-base/issues/272 https://github.com/napalm-automation/napalm-base/issues/273

I will look deeper in the code asap

ubaumann commented 6 years ago

My findings so far:

With the files and the count number, the order of commands can't be changed. So, every workflow needs a separate record. I would like to have a more flexible way. For example creating a YAML file with the function name and the arguments, so the the recording can be reused. I know the file could get big.

send_command:
  - 'show version': 'asdfasdfasdfasdf'
  - 'show interface': 'asdfasdfasdfasdf'
enable: ''

Advantages from my point of view:

Recordings make it easy to test against different OS versions

mirceaulinic commented 6 years ago

With the files and the count number, the order of commands can't be changed. So, every workflow needs a separate record. I would like to have a more flexible way. For example creating a YAML file with the function name and the arguments, so the the recording can be reused. I know the file could get big.

Yes, I believe this is way, way, beyond the goals of NAPALM - as @dbarrosop likes to say everytime, "napalm is a library, it's not a tool.", so this implementation, from any perspective, I believe it doesn't really fit into this definition.

dbarrosop commented 6 years ago

We should not create the real object in 'replay' mode

I agree. That's certainly an easy bug to fix.

With the files and the count number, the order of commands can't be changed

That's the point. This is a recorder/replayer.

I kinda like the idea, but if I'd compare the benefits vs. complexity + risks and environment exposure, I find there's not much real value added.

Complexity, less than what we have today. Benefits of having this between each driver and the underlying class (netmiko, pyeapi, etc):

  1. Test cases are easily recorded. Just run the test against a real device in recording mode and you are done.
  2. With these 60 lines of code, including whitespaces and comments, you can say goodbye to all the mocking code we have today, which is a lot and vendor dependent; MockDriver, FakeEOS, FakeJunos, Fakeblah, all gone.
  3. Debugging, something breaks? Just enable recording mode and check what's happening, the scenario can even be shared on an issue/branch/PR for us to debug and fix.
  4. Compliance, I am sure compliance people would like it.

An alternative to the "pass" behavior would be to call the class directly without the wrapper so the default behavior wouldn't change at all. So on the drivers we would have:

                if self.recorder_mode == "pass":
                    self.device = pyeapi.client.Node(connection, enablepwd=self.enablepwd)
                else:
                      self.device = napalm_base.recorder.Recorder(pyeapi.client.Node,
                                                               recorder_options=self.recorder_options,
                                                               connection=connection,
                                                               enablepwd=self.enablepwd)

Or something like that

itdependsnetworks commented 6 years ago

My take:

IMHO unless this introduced an unnecessary amount of work and future burden I see it as clear net positive.

ktbyers commented 6 years ago

@itdependsnetworks Can you elaborate on this statement?

Can use the cli to backup commands not in data model. e.g. have a requirement to log: sh hardware internal errors module 1 \| diff \| i crc

I think we should use probably use JSON or YAML to save data instead of pickle.

I like the general idea.

itdependsnetworks commented 6 years ago

I'm simply highlighting he fact that that command will never be part of any standard getters. It would be nice to have a standard way to say the raw response no matter what you were running

dbarrosop commented 6 years ago

I think we should use probably use JSON or YAML to save data instead of pickle.

Problem with YAML/JSON is that it becomes a problem when the method returns pythons objects. For example, a method might return None or True, an Exception or even a custom object, and those don't translate well to structured data. But I will do some tests and see what happens.

ktbyers commented 6 years ago

If we use Pickle, then we won't be able to accept data from anyone we don't trust...as they could totally mess us over (like embed system calls inside the pickle file) and we wouldn't be able to easily inspect what was sent to us beforehand.

Booleans and None are not really a problem (objects including Exceptions are a problem).

Does it break how you were planning on using it? I was thinking of it as more of capturing data that could be added to unit tests (as opposed to replacing mocked devices).

>>> a = None
>>> import json
>>> json.dumps(a)
'null'
>>> b = True
>>> json.dumps(a)
'null'
>>> json.dumps(b)
'true'
>>> print json.loads(json.dumps(a))
None
>>> print json.loads(json.dumps(b))
True
dbarrosop commented 6 years ago

then we won't be able to accept data from anyone we don't trust

As discussed via other channels. I think you are right and I will look into that. The main issue is that not all calls might return structured but some "bare" object like a string or an Exception but that can be treated.

mirceaulinic raised the concern that the recorder might break things if a method changes how things are retrieved

That is very much correct so we will do two things:

  1. Make clear in the documentation that the recorder is mostly for internal purposes and that they are free to use it but without guarantees.
  2. We will add some metadata when recording so if we break something we know how and when it was recorded. For example, we could add the date and all the napalm libraries installed with their version at the time of the recording.
coveralls commented 6 years ago

Coverage Status

Changes Unknown when pulling 507a46ebe847a84c203fc98910b822c053fa34ce on recorder into on develop.

dbarrosop commented 6 years ago

Regarding the metadata, now it adds a metadata.yaml file with the following contents:

(napalm) ➜  napalm-base git:(recorder) ✗ cat test_recorder/metadata.yaml
date: 2017-08-16 19:57:11.194889
napalm_version:
- napalm-ansible==0.7.0
- napalm-base==0.24.3
- napalm-eos==0.6.0
- napalm-fortios==0.4.0
- napalm-ios==0.7.0
- napalm-iosxr==0.5.4
- napalm-junos==0.12.0
- napalm-nxos==0.6.0
- napalm-panos==0.4.0
- napalm-pluribus==0.5.1
- napalm-ros==0.2.2
- napalm-vyos==0.1.3
- napalm==1.2.0
coveralls commented 6 years ago

Coverage Status

Changes Unknown when pulling b8d1ccbed710504fd80f5bbf8efacf9778d14e94 on recorder into on develop.

dbarrosop commented 6 years ago

@ktbyers @mirceaulinic latest update replaces pickle with Camel, which builds on top of pyyaml to provide a safe mechanism to serialize/deserialize custom objects.

I tried pyyaml pickle-like behavior but it turns out is as unsafe as pickle itself and also not very reliable as it fails to understand how the constructor of the classes work.

I recommend reading this if you have the time/energy:

https://eev.ee/blog/2015/10/15/dont-use-pickle-use-camel/

If you are happy with this I will remove all the testing stuff I added and merge. Then I will quickly integrate it with the napalm cli tool so users can start recording with it.

mirceaulinic commented 6 years ago

@dbarrosop - you may have missed my review.

I disagree on using an obscure library that gained less interest than naplam (i.e. https://github.com/eevee/camel has exactly 7 followers), rather than a very well known and widely adopted msgpack (also very well know for its speed).

Also, it's very bizzare that the sole resource I was able to find about this camel is the blog post you pointed out... written by its maintainer and one of the 4 contributors.

dbarrosop commented 6 years ago

I disagree on using an obscure library that gained less interest than naplam (i.e. https://github.com/eevee/camel has exactly 7 followers)

Are we measuring now usefulness by number of followers? Doesn't make much sense IMHO.

rather than a very well known and widely adopted msgpack (also very well know for its speed

I can write myself what camel does as it's just a framework on top of pyyaml, which seems unnecessary to write as someone else did it already, but as I mentioned in the other comment, msgpack is way overkill and doesn't do exactly what we want.

dbarrosop commented 6 years ago

As discussed in slack, will rewrite without dependencies.