kytos-ng / maintenance

Kytos Maintenance Window NApp
https://kytos-ng.github.io/api/maintenance.html

Add persistence to maintenance windows #64

Closed Ktmi closed 1 year ago

Ktmi commented 1 year ago

Fix #49 Fix #48

This PR seeks to resolve #49 by implementing persistence for maintenance windows using MongoDB. When a maintenance window is added, it is automatically written to the database and registered with the scheduler. If kytos shuts down and then restarts, the NApp automatically attempts to re-dispatch any running maintenances and reschedule any pending ones.

Notes

This PR depends on kytos-ng/kytos#308, which makes the database accessible through the controller. The requirements files may also need to be updated in order to use that PR.
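
For illustration, here is a minimal sketch of the restore-on-restart flow described above, assuming kytos-ng/kytos#308 exposes a pymongo Database on the controller. The collection name, field names, and helper functions are hypothetical, not this PR's actual code:

from datetime import datetime, timezone

def start_maintenance(window):
    # Hypothetical: put the window's devices into maintenance mode.
    print(f"maintenance {window['id']} started")

def end_maintenance(window):
    # Hypothetical: bring the window's devices back into operation.
    print(f"maintenance {window['id']} ended")

def restore_windows(db, scheduler):
    """Re-dispatch running windows and reschedule pending ones after a
    restart. Assumes `db` is a pymongo Database (tz_aware client) and
    that start/end are stored as timezone-aware UTC datetimes."""
    now = datetime.now(timezone.utc)
    for window in db["maintenance_windows"].find():
        if window["end"] <= now:
            continue  # already finished, nothing to restore
        if window["start"] <= now:
            # Window was running when kytos went down: re-dispatch now.
            start_maintenance(window)
        else:
            # Still pending: put the start back on the scheduler.
            scheduler.add_job(start_maintenance, "date",
                              run_date=window["start"], args=[window])
        scheduler.add_job(end_maintenance, "date",
                          run_date=window["end"], args=[window])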

viniarck commented 1 year ago

@Ktmi, I'm doing partial reviews as I get notifications of commits being pushed, to try to keep the PR from stalling in review (especially as we're getting close to finishing the release). But if you'd prefer a single review once you're done pushing commits, that also works; let me know if you have a preference. It's probably also better if unit tests and linters are passing.

Ktmi commented 1 year ago

I think I'm going to need a little help understanding the JavaScript for the edit window. I can get it to properly parse the new window structure; however, the request it produces for updating the window ends up malformed.

Here's what I have so far: https://gist.github.com/Ktmi/69599c5bbf736b6a6baa19a9a8a17df4

Here's an example of a malformed request:

{
    "description": "",
    "end": "2022-12-30T13:00:00+0000",
    "id": "Test topo",
    "interfaces": [
        {
            "description": "00:00:00:00:00:00:00:01:1",
            "value": "00:00:00:00:00:00:00:01:1"
        },
        {
            "description": "00:00:00:00:00:00:00:01:2",
            "value": "00:00:00:00:00:00:00:01:2"
        },
        {
            "description": "00:00:00:00:00:00:00:01:4294967294",
            "value": "00:00:00:00:00:00:00:01:4294967294"
        },
        {
            "description": "00:00:00:00:00:00:00:02:1",
            "value": "00:00:00:00:00:00:00:02:1"
        },
        {
            "description": "00:00:00:00:00:00:00:02:2",
            "value": "00:00:00:00:00:00:00:02:2"
        },
        {
            "description": "00:00:00:00:00:00:00:02:4294967294",
            "value": "00:00:00:00:00:00:00:02:4294967294"
        },
        "00:00:00:00:00:00:00:01:1",
        "00:00:00:00:00:00:00:01:2",
        "00:00:00:00:00:00:00:01:4294967294",
        "00:00:00:00:00:00:00:02:1",
        "00:00:00:00:00:00:00:02:2",
        "00:00:00:00:00:00:00:02:4294967294"
    ],
    "links": [],
    "start": "2022-12-30T12:00:00+0000",
    "switches": [
        {
            "description": "00:00:00:00:00:00:00:01",
            "value": "00:00:00:00:00:00:00:01"
        },
        {
            "description": "00:00:00:00:00:00:00:02",
            "value": "00:00:00:00:00:00:00:02"
        },
        "00:00:00:00:00:00:00:01",
        "00:00:00:00:00:00:00:02"
    ]
}
viniarck commented 1 year ago

> I think I'm going to need a little help understanding the JavaScript for the edit window. [...] Here's an example of a malformed request: [...]

@Ktmi this patch is in the right direction. I suspect the issue is related to how the this.chosen_* synced vars are being reused in the request payload. It used to build a separate list, jsonItems, based on the chosen items.

Notice here, for instance, that if you don't select anything it'll send the chosen contents, which results in an invalid payload:

[screenshot: 20221220_164122]

If you were to simulate JS click events to update the content of the chosen items, it would succeed:

[screenshot: 20221220_164212]
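
For what it's worth, the payload above makes the mismatch concrete: the interfaces and switches arrays mix the dropdown's {description, value} option objects with plain id strings, while the API expects a flat list of id strings (as in the e2e test payloads later in this thread). Here is a minimal sketch of the flattening the UI needs to do, in Python only to make the shape explicit; the real fix lives in the JavaScript, and flatten_chosen is a hypothetical helper:

def flatten_chosen(items):
    """Collapse a mixed list of {description, value} option dicts and
    plain id strings into a flat, de-duplicated list of id strings."""
    ids = []
    for item in items:
        value = item["value"] if isinstance(item, dict) else item
        if value not in ids:
            ids.append(value)
    return ids

# The switches list from the malformed request above:
mixed = [
    {"description": "00:00:00:00:00:00:00:01", "value": "00:00:00:00:00:00:00:01"},
    {"description": "00:00:00:00:00:00:00:02", "value": "00:00:00:00:00:00:00:02"},
    "00:00:00:00:00:00:00:01",
    "00:00:00:00:00:00:00:02",
]
assert flatten_chosen(mixed) == ["00:00:00:00:00:00:00:01", "00:00:00:00:00:00:00:02"]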

Ktmi commented 1 year ago

I think at this point most of the issues with this PR should be resolved. @viniarck could you please do a final review and make sure I properly addressed your concerns?

Ktmi commented 1 year ago

@viniarck At this point I'm trying to hammer out the end-to-end tests. All I have left are 2 tests that aren't passing, and I don't understand why. Could you give me some insight? Here are the results from running the tests:

Starting enhanced syslogd: rsyslogd.
/etc/openvswitch/conf.db does not exist ... (warning).
Creating empty database /etc/openvswitch/conf.db.
Starting ovsdb-server.
Configuring Open vSwitch system IDs.
Starting ovs-vswitchd.
Enabling remote OVSDB managers.
Trying to run hello command on MongoDB...
Trying to run 'hello' command on MongoDB...
Ran 'hello' command on MongoDB successfully. It's ready!
============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-7.2.0, pluggy-1.0.0
rootdir: /tests
plugins: rerunfailures-10.2, timeout-2.1.0
collected 24 items

tests/test_e2e_50_maintenance.py ............F......F....                [100%]

=================================== FAILURES ===================================
_____ TestE2EMaintenance.test_065_patch_mw_on_switch_new_start_delaying_mw _____

self = <tests.test_e2e_50_maintenance.TestE2EMaintenance object at 0x7fc4f49428e0>

    def test_065_patch_mw_on_switch_new_start_delaying_mw(self):
        """Tests the maintenance window data update
        process through MW's ID focused on the start time
        Test:
            /api/kytos/maintenance/v1 on POST,
            /api/kytos/maintenance/v1/{mw_id} on GET, and
            /api/kytos/maintenance/v1/{mw_id} on PATCH
        """
        self.restart_and_create_circuit()

        # Sets up the maintenance window information
        mw_start_delay = 30
        mw_duration = 90
        start = datetime.now() + timedelta(seconds=mw_start_delay)
        end = start + timedelta(seconds=mw_duration)

        # Sets up the maintenance window data
        payload = {
            "description": "mw for test 065",
            "start": start.strftime(TIME_FMT),
            "end": end.strftime(TIME_FMT),
            "switches": [
                "00:00:00:00:00:00:00:02"
            ]
        }

        # Creates a new maintenance window
        api_url = KYTOS_API + '/maintenance/v1'
        response = requests.post(api_url, data=json.dumps(payload), headers={'Content-type': 'application/json'})
        assert response.status_code == 201, response.text
        data = response.json()
        assert 'mw_id' in data

        # Extracts the maintenance window id from the JSON structure
        assert len(data) == 1
        mw_id = data["mw_id"]

        # Gets the maintenance schema
        api_url = KYTOS_API + '/maintenance/v1/' + mw_id
        response = requests.get(api_url)
        assert response.status_code == 200, response.text
        json_data = response.json()
        assert json_data['id'] == mw_id

        # Sets up a new maintenance window data
        new_start = start + timedelta(seconds=mw_start_delay)
        payload1 = {
            "start": new_start.strftime(TIME_FMT),
            "end": end.strftime(TIME_FMT),
            "switches": [
                "00:00:00:00:00:00:00:02"
            ]
        }

        # Updates the maintenance window information
        mw_api_url = KYTOS_API + '/maintenance/v1/' + mw_id
        request = requests.patch(mw_api_url, data=json.dumps(payload1), headers={'Content-type': 'application/json'})
        assert request.status_code == 200

        # Gets the maintenance window schema
        api_url = KYTOS_API + '/maintenance/v1/' + mw_id
        response = requests.get(api_url)
        json_data = response.json()
        assert json_data['start'] == new_start.strftime(TIME_FMT)

        # Waits for the initial MW begin time
        # (no MW, it has been changed)
        time.sleep(mw_start_delay + 5)

        s2 = self.net.net.get('s2')
        h11, h3 = self.net.net.get('h11', 'h3')
        h11.cmd('ip link add link %s name vlan100 type vlan id 100' % (h11.intfNames()[0]))
        h11.cmd('ip link set up vlan100')
        h11.cmd('ip addr add 100.0.0.11/24 dev vlan100')
        h3.cmd('ip link add link %s name vlan100 type vlan id 100' % (h3.intfNames()[0]))
        h3.cmd('ip link set up vlan100')
        h3.cmd('ip addr add 100.0.0.2/24 dev vlan100')

        # Verifies the flow at the initial MW time
        # (no maintenance at that time, it has been delayed)
        flows_s2 = s2.dpctl('dump-flows')
>       assert len(flows_s2.split('\r\n ')) == BASIC_FLOWS + 2, flows_s2
E       AssertionError:  cookie=0xac00000000000002, duration=57.752s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:03 actions=CONTROLLER:65535
E          cookie=0xac00000000000002, duration=57.746s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:01 actions=CONTROLLER:65535
E          cookie=0xab00000000000002, duration=58.459s, table=0, n_packets=40, n_bytes=1680, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
E         
E       assert 3 == (3 + 2)
E        +  where 3 = len([' cookie=0xac00000000000002, duration=57.752s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:...=58.459s, table=0, n_packets=40, n_bytes=1680, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'])
E        +    where [' cookie=0xac00000000000002, duration=57.752s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:...=58.459s, table=0, n_packets=40, n_bytes=1680, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'] = <built-in method split of str object at 0x7fc4f0f3fdc0>('\r\n ')
E        +      where <built-in method split of str object at 0x7fc4f0f3fdc0> = ' cookie=0xac00000000000002, duration=57.752s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:0...n=58.459s, table=0, n_packets=40, n_bytes=1680, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'.split

tests/test_e2e_50_maintenance.py:733: AssertionError
___________ TestE2EMaintenance.test_100_extend_running_mw_on_switch ____________

self = <tests.test_e2e_50_maintenance.TestE2EMaintenance object at 0x7fc4f494f250>

    def test_100_extend_running_mw_on_switch(self):

        self.restart_and_create_circuit()

        # Sets up the maintenance window information
        mw_start_delay = 30
        mw_duration = 60
        mw_extension = 1
        start = datetime.now() + timedelta(seconds=mw_start_delay)
        end = start + timedelta(seconds=mw_duration)

        # Sets up the maintenance window data
        payload = {
            "description": "mw for test 100",
            "start": start.strftime(TIME_FMT),
            "end": end.strftime(TIME_FMT),
            "switches": [
                "00:00:00:00:00:00:00:02"
            ]
        }

        # Creates a new maintenance window
        api_url = KYTOS_API + '/maintenance/v1'
        response = requests.post(api_url, data=json.dumps(payload), headers={'Content-type': 'application/json'})
        data = response.json()

        # Extracts the maintenance window id from the JSON structure
        mw_id = data["mw_id"]

        # Gets the maintenance schema
        api_url = KYTOS_API + '/maintenance/v1/' + mw_id
        response = requests.get(api_url)
        assert response.status_code == 200, response.text
        json_data = response.json()
        assert json_data['id'] == mw_id

        # Waits for the MW to start
        time.sleep(mw_start_delay + 5)

        # Verifies the flow behavior during the maintenance
        s2 = self.net.net.get('s2')
        flows_s2 = s2.dpctl('dump-flows')
        assert 'dl_vlan=100' not in flows_s2
        assert len(flows_s2.split('\r\n ')) == BASIC_FLOWS, flows_s2

        # Checks connectivity during maintenance
        h11, h3 = self.net.net.get('h11', 'h3')
        h11.cmd('ip link add link %s name vlan100 type vlan id 100' % (h11.intfNames()[0]))
        h11.cmd('ip link set up vlan100')
        h11.cmd('ip addr add 100.0.0.11/24 dev vlan100')
        h3.cmd('ip link add link %s name vlan100 type vlan id 100' % (h3.intfNames()[0]))
        h3.cmd('ip link set up vlan100')
        h3.cmd('ip addr add 100.0.0.2/24 dev vlan100')
        result = h11.cmd('ping -c1 100.0.0.2')
        assert ', 0% packet loss,' in result

        payload2 = {'minutes': mw_extension}

        # extend the maintenance window information
        api_url = KYTOS_API + '/maintenance/v1/' + mw_id + '/extend'
        response = requests.patch(api_url, data=json.dumps(payload2), headers={'Content-type': 'application/json'})
        assert response.status_code == 200, response.text

        # Waits to the time that the MW should be ended but instead will be running (extended)
        time.sleep(mw_duration + 5)

        # Verifies the flow behavior during the maintenance
        s2 = self.net.net.get('s2')
        flows_s2 = s2.dpctl('dump-flows')
        assert 'dl_vlan=100' not in flows_s2
>       assert len(flows_s2.split('\r\n ')) == BASIC_FLOWS, flows_s2
E       AssertionError:  cookie=0xac00000000000002, duration=122.910s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:03 actions=CONTROLLER:65535
E          cookie=0xac00000000000002, duration=122.874s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:01 actions=CONTROLLER:65535
E          cookie=0xaa7261a30347de40, duration=8.656s, table=0, n_packets=0, n_bytes=0, priority=20000,in_port="s2-eth2",dl_vlan=2 actions=mod_vlan_vid:2,output:"s2-eth3"
E          cookie=0xaa7261a30347de40, duration=8.647s, table=0, n_packets=1, n_bytes=78, priority=20000,in_port="s2-eth3",dl_vlan=2 actions=mod_vlan_vid:2,output:"s2-eth2"
E          cookie=0xab00000000000002, duration=123.679s, table=0, n_packets=82, n_bytes=3444, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
E         
E       assert 5 == 3
E        +  where 5 = len([' cookie=0xac00000000000002, duration=122.910s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee...123.679s, table=0, n_packets=82, n_bytes=3444, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'])
E        +    where [' cookie=0xac00000000000002, duration=122.910s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee...123.679s, table=0, n_packets=82, n_bytes=3444, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'] = <built-in method split of str object at 0x149c620>('\r\n ')
E        +      where <built-in method split of str object at 0x149c620> = ' cookie=0xac00000000000002, duration=122.910s, table=0, n_packets=0, n_bytes=0, priority=50000,dl_src=ee:ee:ee:ee:ee:...=123.679s, table=0, n_packets=82, n_bytes=3444, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535\r\n'.split

tests/test_e2e_50_maintenance.py:1076: AssertionError
=============================== warnings summary ===============================
test_e2e_50_maintenance.py: 17 warnings
  /usr/lib/python3/dist-packages/mininet/node.py:1121: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    return ( StrictVersion( cls.OVSVersion ) <

test_e2e_50_maintenance.py: 17 warnings
  /usr/lib/python3/dist-packages/mininet/node.py:1122: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    StrictVersion( '1.10' ) )

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
------------------------------- start/stop times -------------------------------
test_e2e_50_maintenance.py::TestE2EMaintenance::test_065_patch_mw_on_switch_new_start_delaying_mw: 2023-01-04,17:22:59.587115 - 2023-01-04,17:24:14.794649
test_e2e_50_maintenance.py::TestE2EMaintenance::test_100_extend_running_mw_on_switch: 2023-01-04,17:30:35.551109 - 2023-01-04,17:32:55.960302
=========================== short test summary info ============================
FAILED tests/test_e2e_50_maintenance.py::TestE2EMaintenance::test_065_patch_mw_on_switch_new_start_delaying_mw
FAILED tests/test_e2e_50_maintenance.py::TestE2EMaintenance::test_100_extend_running_mw_on_switch
============ 2 failed, 22 passed, 34 warnings in 1503.82s (0:25:03) ============

Both failures seem to trace back to the number of flows on the switch not matching the expected count, but beyond that I don't understand the issue.

Ktmi commented 1 year ago

I was able to figure out the issue with the end-to-end tests: APScheduler doesn't reschedule a job when its trigger is updated in place, which is why the tests that reschedule a window were failing.
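
For reference, a minimal standalone sketch of the distinction (the job id and callback are made up): per APScheduler's documentation, changing a job's trigger should go through reschedule_job(), which swaps the trigger and recomputes the job's next run time.

from datetime import datetime, timedelta

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.date import DateTrigger

def start_window():
    print("maintenance window started")

scheduler = BackgroundScheduler()
scheduler.start()

# One-shot job for the window's start time.
scheduler.add_job(start_window,
                  DateTrigger(run_date=datetime.now() + timedelta(seconds=30)),
                  id="mw-start")

# Patching the window's start must go through reschedule_job(): it
# replaces the trigger AND recomputes the next run time. Simply
# updating the persisted window leaves the old job in place.
scheduler.reschedule_job(
    "mw-start",
    trigger=DateTrigger(run_date=datetime.now() + timedelta(seconds=60)),
)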

italovalcy commented 1 year ago

I've executed a new end-to-end test run for the maintenance NApp's specific tests and everything looks good:

+ python3 -m pytest tests/test_e2e_50_maintenance.py --reruns 2 -r fEr
================================================================================================= test session starts =================================================================================================
platform linux -- Python 3.9.2, pytest-7.2.0, pluggy-1.0.0
rootdir: /kytos-end-to-end-tests
plugins: rerunfailures-10.2, timeout-2.1.0
collected 24 items

tests/test_e2e_50_maintenance.py ........................                                                                                                                                                       [100%]

================================================================================================== warnings summary ===================================================================================================
tests/test_e2e_50_maintenance.py: 17 warnings
  /usr/lib/python3/dist-packages/mininet/node.py:1121: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    return ( StrictVersion( cls.OVSVersion ) <

tests/test_e2e_50_maintenance.py: 17 warnings
  /usr/lib/python3/dist-packages/mininet/node.py:1122: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    StrictVersion( '1.10' ) )

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
-------------------------------------------------------------------------------------------------- start/stop times ---------------------------------------------------------------------------------------------------
==================================================================================== 24 passed, 34 warnings in 1706.57s (0:28:26) =====================================================================================