exec command output truncated

zhangineer2 commented 2 years ago

Describe the bug When executing a list of commands in a docker container, sometimes the output returned is split and returned via 2 separate generator values

for example:

This is an expected output, where all ICMP result is returned in one generator

MY OUTPUT IS.....  b'PING h8 (10.0.0.8) 56(84) bytes of data.\n64 bytes from h8 (10.0.0.8): icmp_seq=1 ttl=64 time=0.092 ms\n\n--- h8 ping statistics ---\n1 packets transmitted, 1 received, 0% packet loss, time 0ms\nrtt min/avg/max/mdev = 0.092/0.092/0.092/0.000 ms\n'

This is an incorrect output, where the ICMP result is returned via 2 separate generators

MY OUTPUT IS.....  b'PING h9 (10.0.0.9) 56(84) bytes of data.\n64 bytes from h9 (10.0.0.9): icmp_seq=1 ttl=64 time=0.101 ms\n\n' 

MY OUTPUT IS.....  b'--- h9 ping statistics ---\n1 packets transmitted, 1 received, 0% packet loss, time 0ms\nrtt min/avg/max/mdev = 0.101/0.101/0.101/0.000 ms\n'

To Reproduce Steps to reproduce the behavior:

Setup the lab first
```
from Kathara.model.Lab import Lab
```

lab = Lab('test') hostNum = 10 for i in range (1,hostNum+1): host = lab.get_or_new_machine(f'h{i}') host.add_meta('image', 'kathara/host') host.startup_commands = [f'ifconfig eth0 10.0.0.{i}/16 up'] + \ [f'echo 10.0.0.{i} h{i} >> /etc/hosts' for i in range (1,hostNum+1)] lab.connect_machine_to_link(host.name, "A")

Kathara.get_instance().wipe() Kathara.get_instance().deploy_lab(lab)


2. Execute a list of commands

from Kathara.manager.Kathara import Kathara hostNum=10 kathara_manager = Kathara() machines = next(kathara_manager.get_lab_info()) lab_hash = machines[0]["real_lab_hash"] cmds = ";".join([f"ping -c1 h{i}" for i in range(1,hostNum+1)]) cmd = f'/bin/bash -c "{cmds}"' for machine in machines: exec_output = kathara_manager.exec(lab_hash, machine['name'],cmd) print (next(exec_output))


**Expected behavior**
This is the output I expect to see:

(b'PING h1 (10.0.0.1) 56(84) bytes of data.\n64 bytes from h1 (10.0.0.1): icmp_seq=1 ttl=64 time=0.095 ms\n\n--- h1 ping statistics ---\n1 packets transmitted, 1 received, 0% packet loss, time 0ms\nrtt min/avg/max/mdev = 0.095/0.095/0.095/0.000 ms\n', None)


but sometimes I see this truncated output:

(b'PING h1 (10.0.0.1) 56(84) bytes of data.\n64 bytes from h1 (10.0.0.1): icmp_seq=1 ttl=64 time=0.080 ms\n\n', None)



<details>
<summary>"check" Command Output</summary>
Add the output of `kathara check`
</details>

zhangineer2 commented 2 years ago

Upon further debugging, I noticed that sometimes the returned results have an extra blank line as shown below, which maybe the cause

PING h10 (10.0.0.10) 56(84) bytes of data.
64 bytes from h10 (10.0.0.10): icmp_seq=1 ttl=64 time=0.081 ms

--- h10 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.081/0.081/0.081/0.000 ms

PING h1 (10.0.0.1) 56(84) bytes of data.
64 bytes from h1 (10.0.0.1): icmp_seq=1 ttl=64 time=0.064 ms

--- h1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.064/0.064/0.064/0.000 ms

PING h2 (10.0.0.2) 56(84) bytes of data.
64 bytes from h2 (10.0.0.2): icmp_seq=1 ttl=64 time=0.056 ms

--- h2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.056/0.056/0.056/0.000 ms

tcaiazzi commented 2 years ago

Hi @zhangineer, The issue that you have is related to how docker-py manages streams. In fact, the exec_run method of docker-py returns a new tuple from the generator at each new line.

However, it is possible to get all the output of a command iterating through the generator, consuming it.

hostNum=10
kathara_manager = Kathara.get_instance()
machines = next(kathara_manager.get_lab_info())
lab_hash = machines[0]["real_lab_hash"]
cmds = ";".join([f"ping -c1 h{i}" for i in range(1,hostNum+1)])
cmd = f'/bin/bash -c "{cmds}"'
print(f"Executing command: {cmd}")
for machine in machines:
    exec_output = kathara_manager.exec(lab_hash, machine['name'],cmd)
    output = ""
    while True: 
        try:
            std_out, std_err = next(exec_output)
            output += std_out.decode('utf-8')
        except StopIteration:
            break
    print(f"----------{machine['name']} Commands------------")
    print(output)

Let me know if this solve your issue!

Thanks, Tommaso

zhangineer2 commented 2 years ago

Hi @tcaiazzi I understand that the generator returns tuple at each newline. However, for some reason, there are newlines being added when it shouldn't. in my example

MY OUTPUT IS.....  b'PING h8 (10.0.0.8) 56(84) bytes of data.\n64 bytes from h8 (10.0.0.8): icmp_seq=1 ttl=64 time=0.092 ms\n\n--- h8 ping statistics ---\n1 packets transmitted, 1 received, 0% packet loss, time 0ms\nrtt min/avg/max/mdev = 0.092/0.092/0.092/0.000 ms\n'

MY OUTPUT IS.....  b'PING h9 (10.0.0.9) 56(84) bytes of data.\n64 bytes from h9 (10.0.0.9): icmp_seq=1 ttl=64 time=0.101 ms\n\n' 
MY OUTPUT IS.....  b'--- h9 ping statistics ---\n1 packets transmitted, 1 received, 0% packet loss, time 0ms\nrtt min/avg/max/mdev = 0.101/0.101/0.101/0.000 ms\n'

The above is exactly the same output, except for the IP addresses, yet the format returned is different. The former returned in the correct format, and the latter was split. This is a single ping command, there can't possibly be any newlines within each output.

tcaiazzi commented 2 years ago

Hi @zhangineer, I understand your issue. However, as I said before, this is related on how docker-py manages streams. We only return the generator from docker-py exec_run command.

Check the code here

In my snippet the while loop merges outputs for a command to reconstruct the whole output. You can use something like this as a workaround.

Thanks, Tommaso.

zhangineer2 commented 2 years ago

Thank you @tcaiazzi for the workaround

zhangineer2 commented 2 years ago

Hi @tcaiazzi,

Upon further testing, I noticed that the output format is still malformed when using the exec method. This seems to be due to exec returning a tuple with stdout and stderr, which sometimes do not actually return expected results.

So what I did was to completely bypass the exec command and run container.exec_run directly within my own code. Huge thank you for pointing me to the source of exec command.

Here is my code snippet for re-implementing this:

class LabInfo:

    def __init__(self):
        """
        Lab hash is automatically pulled during initializations
        """
        self.kathara_manager = Kathara()
        self.machines = next(self.kathara_manager.get_lab_info())
        self.lab_hash = self.machines[0]["real_lab_hash"]
        self.client = docker.from_env()

    def get_machine_api_object(self, lab_hash: str, machine_name: str,
                               user: str = None) -> docker.models.containers.Container:
        containers = self.get_machines_api_objects_by_filters(lab_hash=lab_hash, machine_name=machine_name, user=user)

    def get_machines_api_objects_by_filters(self, lab_hash: str = None, machine_name: str = None, user: str = None) -> \
            List[docker.models.containers.Container]:
        filters = {"label": ["app=kathara"]}
        if user:
            filters["label"].append("user=%s" % user)
        if lab_hash:
            filters["label"].append("lab_hash=%s" % lab_hash)
        if machine_name:
            filters["label"].append("name=%s" % machine_name)

    def exec_cmd(self, machine_name: str, cmd: str):
        """
        execute a command on a given machine
        """
        container = self.get_machine_api_object(lab_hash=self.lab_hash, machine_name=machine_name)
        return container.exec_run(cmd)

Then my main execution simply becomes the following:

    running_lab = LabInfo()
    exec_output = running_lab.exec_cmd(src, cmd)
    print (exec_output.output.decode('utf-8'))

KatharaFramework / Kathara

exec command output truncated #161