KatharaFramework / Kathara

A lightweight container-based network emulation system.
https://www.kathara.org/
GNU General Public License v3.0
452 stars 63 forks source link

DockerManager.disconnect_from_link(machine: Machine, link: Link) assumes that the arguments have already called .api_object.reload() #224

Closed Heysunk closed 1 year ago

Heysunk commented 1 year ago

Describe the bug

Calling DockerManager.disconnect_machine_from_link relies on the machine.api_object being up to date. If it has been modified via other functions (e.g. DockerMachine.connect_to_link()), the function may fail due to checks assuming that the object reflects the current state.

To Reproduce

    link = lab.new_link(name="tunnel-"+m1.name+"-"+m2.name)

    kManager.deploy_link(link)
    DockerMachine.connect_to_link(m1,link)
    m1.add_interface(link=link)

    DockerMachine.connect_to_link(m2,link)
    m2.add_interface(link=link)

    #link.api_object.reload()
    for k in list(link.machines.keys()):
        # link.machines[k].api_object.reload()
        # We can't call machine.remove_interface directly since it changes the dict we're iterating over
        kManager.disconnect_machine_from_link(link.machines[k],link)

Without calling the reloads on the .api_objects the disconnection will fail.

These are the relevant lines which perform the disconnection. Line 202 is the function containing the bug. https://github.com/KatharaFramework/Kathara/blob/56ba82a1a92d83c1affef5a9170c073abcd3334b/src/Kathara/manager/docker/DockerManager.py#L200-L203

The following check will fail if the network has been added without reload() having been called on the api_object. https://github.com/KatharaFramework/Kathara/blob/56ba82a1a92d83c1affef5a9170c073abcd3334b/src/Kathara/manager/docker/DockerMachine.py#L338-L342

Reload is called later on in: https://github.com/KatharaFramework/Kathara/blob/56ba82a1a92d83c1affef5a9170c073abcd3334b/src/Kathara/manager/docker/DockerLink.py#L133-L138

However, at this point the check in DockerMachine has already failed and the networks have not been removed from the device. At this point I assume docker doesn't remove the network since there are still machines attached to it, but I haven't verified this.

Expected behavior

I expect to be able to call disconnect_machine_from_link() without first having to ensure that the .api_object is up to date. OR, at the very least, ensure that this behaviour is intended via documentation.

So I suggest either ensuring that reload happens earlier so that checks relying on the state being current function correctly, or (less preferably) clarify this behaviour as intended via the documentation

tcaiazzi commented 1 year ago

Hi @Heysunk,

Thanks for opening the issue and for the detailed explanation!

I think you are right! The user should use the disconnect_machine_from_link() and connect_machine_to_link() without calling the reload() method on the api_object

We will add this change in the next release.

Many thanks, Tommaso

tcaiazzi commented 1 year ago

Hi @Heysunk,

I'm not able to reproduce the error from your snippet. Can you please provide me the entire script?

Thank you, Tommaso

Heysunk commented 1 year ago
from Kathara.model.Lab import Lab
from Kathara.model.Link import Link
from Kathara.model.Machine import Machine
from Kathara.manager.Kathara import Kathara
from Kathara.parser.netkit.LabParser import LabParser
from Kathara.parser.netkit.FolderParser import FolderParser
from Kathara import utils
from Kathara.cli.ui.event.register import register_cli_events
from Kathara.setting.Setting import Setting

kManager = Kathara.get_instance()

def startLab(lab_path, args):
    lab_path = utils.get_absolute_path(lab_path)

    try:
        lab = LabParser.parse(lab_path)
    except IOError as e:
        lab = FolderParser.parse(lab_path)

    lab.general_options["image"] = "kathara/frr"

    register_cli_events()

    kManager.deploy_lab(lab)
    return lab

def disconnectMachinesBug(link: Link):
    for k in list(link.machines.keys()):
        #link.machines[k].api_object.reload()
        # We can't call machine.remove_interface directly since it changes the dict we're iterating over
        kManager.disconnect_machine_from_link(link.machines[k],link)

def disconnectMachines(link: Link):
    for k in list(link.machines.keys()):
        link.machines[k].api_object.reload()
        # We can't call machine.remove_interface directly since it changes the dict we're iterating over
        kManager.disconnect_machine_from_link(link.machines[k],link)

def connectMachines(m1: Machine, m2: Machine, lab: Lab):
    link = lab.new_link(name="tunnel-"+m1.name+"-"+m2.name)

    #m1Eth = attachAndReturnInterfaceDocker(m1, link, lab)
    #m2Eth = attachAndReturnInterfaceDocker(m2, link, lab)
    kManager.connect_machine_to_link(m1, link)
    kManager.connect_machine_to_link(m2, link)

    ip1 = "192.168.1.200"
    ip2 = "192.168.1.201"
    command = "ip a add " + ip1 + "/255.255.255.0 dev eth1"
    output = kManager.exec(machine_name=m1.name,command=command,lab_hash=lab.hash)

    command = "ip a add " + ip2 + "/255.255.255.0 dev eth1"
    output = kManager.exec(machine_name=m2.name,command=command,lab_hash=lab.hash)

    return link

def connect(lab: Lab):

    m1 = lab.get_machine("r18")
    m2 = lab.get_machine("r19")

    return connectMachines(m1, m2, lab)

args ={}
Setting.get_instance().open_terminals = False

lab = startLab("testLab/lab/", args)

testLab.zip

Run the code from the same folder you unzip testLab.zip into.

$ python -i sampleBug.py

>>> link = connect(lab) From a separate terminal, cd to testLab/lab, and run kathara connect r19. In this terminal, run ping 192.168.1.201. Ping should work. >>> disconnectMachines(link) In the separate terminal the ping should stop responding.

Running kathara lclean, and then rerunning the program and connecting, but instead using the following function for disconnecting will showcase the bug. (Once added, links can't "officially" be removed from the lab, so creating another link with the same name will fail)

>>> disconnectMachinesBug(link)

Inspecting the link object before and after calling this function we can notice that it has been changed, e.g link.machines, but the machines are still connected as can be seen from the ping still responding.

Hopefully this should showcase the behaviour.

tcaiazzi commented 1 year ago

Hi @Heysunk,

I made the changes. You can try the new version on the branch: https://github.com/KatharaFramework/Kathara/tree/224-reload-api-object

The fix will be merged in the next release!

Many thanks for the precious help, Tommaso