KimiNewt / pyshark

Python wrapper for tshark, allowing python packet parsing using wireshark dissectors

Too many open files. #588

Open poleguy opened 2 years ago

poleguy commented 2 years ago

Describe the bug When pyshark is used repeatedly, it leaves files open (pipes?) and crashes within minutes with "Too many open files". My use case is complex, but I have reduced it to the minimal reproducible example included below. Roughly, I need to capture the 20 most recent packets coming from the interface, show the user a visualization of them, and then grab the next 20 most recent packets. Ideally this would be done without dropping any packets, but the data must always be the 20 most recent contiguous packets, so the capture has to drop packets gracefully when necessary rather than return stale data.

Traceback (most recent call last):
  File "/home/proton/dpsm_rx_hw_test/proton_pack/bug_report.py", line 39, in <module>
  File "/home/proton/dpsm_rx_hw_test/proton_pack/bug_report.py", line 29, in main
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/site-packages/pyshark/capture/capture.py", line 144, in load_packets
    self.apply_on_packets(
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/site-packages/pyshark/capture/capture.py", line 256, in apply_on_packets
    return self.eventloop.run_until_complete(coro)
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/site-packages/pyshark/capture/capture.py", line 265, in packets_from_tshark
    tshark_process = await self._get_tshark_process(packet_count=packet_count)
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/site-packages/pyshark/capture/live_capture.py", line 112, in _get_tshark_process
    tshark = await super(LiveCapture, self)._get_tshark_process(packet_count=packet_count, stdin=read)
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/site-packages/pyshark/capture/capture.py", line 346, in _get_tshark_process
    tshark_process = await asyncio.create_subprocess_exec(*parameters,
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/base_events.py", line 1676, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/proton/dpsm_rx_hw_test/cenv/lib/python3.9/subprocess.py", line 1720, in _execute_child
    errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files

To Reproduce Steps to reproduce the behavior: Run this code:

# test with
# watch 'lsof | grep python | wc -l'
import pyshark

def main(adapter="enp11s0"):
    while True:
        print("start.")
        capture = pyshark.LiveCapture(interface=adapter)

        capture.sniff(packet_count=20, timeout=2)
        print(len(capture))

        # Adding calls like these doesn't help:
        #capture.close()
        #del capture

if __name__ == "__main__":
    main()
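
As an alternative to watching `lsof` from a second terminal, the descriptor count can be printed from inside the loop. This is only a minimal sketch, assuming Linux (it reads `/proc/self/fd`); `count_open_fds` is a hypothetical helper name, not part of pyshark.

```python
import os


def count_open_fds():
    """Return the number of file descriptors open in this process.

    Assumes Linux: /proc/self/fd holds one entry per open descriptor.
    The count includes the descriptor used to list the directory itself,
    so compare successive readings rather than trusting the absolute value.
    """
    return len(os.listdir("/proc/self/fd"))
```

Printing `count_open_fds()` once per iteration of the loop above should show whether the number keeps growing between captures.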

Expected behavior It should run for days, weeks, years.

Versions (please complete the following information):

Example pcap / packet This is in a live capture use case, so there are packets continuously arriving. I don't think it matters what these packets are.

bivas6 commented 1 year ago

Hi @poleguy, I'm also experiencing this issue. Do you have a solution or a workaround?

Thanks

poleguy commented 1 year ago

I have no solution. However, I used a workaround: I manually 'tail' the pcapng file:

I start a capture to a pcapng file on disk. In another process, I open the pcapng file in Python and read packets until the very end of the file. I don't use a library to do this; I just decode the pcapng format directly. I keep track of the byte offset of the last packet read, then I close the file. After a second, I reopen the file, seek to the byte offset where I left off, and continue processing. Using this process I can close everything and ensure no memory leaks. It works reliably for live visualization of the data in the tail of the pcapng file.
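
The tailing approach described above might look roughly like the following. This is a sketch under stated assumptions, not the code actually used: it assumes a little-endian pcapng file, only extracts Enhanced Packet Blocks (type 6), and the names `read_new_packets` and `tail` are hypothetical.

```python
import struct
import time

EPB_TYPE = 0x00000006  # pcapng Enhanced Packet Block


def read_new_packets(path, offset=0):
    """Return (packets, new_offset) for complete blocks at or after offset.

    Assumes little-endian block fields. The file is opened and closed on
    every call, so no descriptor stays open between polls.
    """
    packets = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            header = f.read(8)  # block type + block total length
            if len(header) < 8:
                break  # block header not fully written yet
            block_type, total_len = struct.unpack("<II", header)
            if total_len < 12 or total_len % 4:
                raise ValueError(f"bad block length {total_len} at {offset}")
            rest = f.read(total_len - 8)
            if len(rest) < total_len - 8:
                break  # block body not fully written yet; retry later
            if block_type == EPB_TYPE:
                # Body layout: interface id, timestamp high, timestamp low,
                # captured length, original length, then the packet data.
                cap_len = struct.unpack_from("<I", rest, 12)[0]
                packets.append(rest[20:20 + cap_len])
            offset += total_len  # advance only past complete blocks
    return packets, offset


def tail(path, handle, poll_seconds=1.0):
    """Poll the file forever, passing each new raw packet to handle()."""
    offset = 0
    while True:
        packets, offset = read_new_packets(path, offset)
        for packet in packets:
            handle(packet)
        time.sleep(poll_seconds)
```

Because the offset only moves past complete blocks, a block that the capturing process has written only partially is simply picked up on the next poll.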