GoogleCloudPlatform / localllm

Apache License 2.0
llm commands do not gracefully handle zombie processes in ps list #10

Closed jordanh closed 4 months ago

jordanh commented 4 months ago

I'll submit a PR shortly for this trivial fix.

Running llm ps or llm kill on my poor, tired development system resulted in:

$ llm ps
Traceback (most recent call last):
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/psutil/_psosx.py", line 352, in wrapper
    return fun(self, *args, **kwargs)
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/psutil/_psosx.py", line 413, in environ
    return parse_environ_block(cext.proc_environ(self.pid))
ProcessLookupError: [Errno 3] assume no such process (originated from sysctl(KERN_PROCARGS2) -> EINVAL)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jrhusney/.miniforge3/bin/llm", line 8, in <module>
    sys.exit(cli())
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/llm.py", line 115, in ps
    m = modelserving.running_models()
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/modelserving.py", line 39, in running_models
    env = p.environ()
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/psutil/__init__.py", line 889, in environ
    return self._proc.environ()
  File "/Users/jrhusney/.miniforge3/lib/python3.10/site-packages/psutil/_psosx.py", line 355, in wrapper
    raise ZombieProcess(self.pid, self._name, self._ppid)
psutil.ZombieProcess: PID still exists but it's a zombie (pid=2599, ppid=1071, name='launcher')

This exception needs to be caught and ignored.
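
A minimal sketch of what "caught and ignored" could look like, assuming the loop in running_models calls environ() on each process it inspects. The helper names safe_environ and processes_with_env are hypothetical and not taken from the localllm source; the psutil exception classes are real, and ZombieProcess is a subclass of NoSuchProcess.

```python
import psutil


def safe_environ(proc):
    """Return proc.environ(), or None if the process cannot be inspected."""
    try:
        return proc.environ()
    except (psutil.ZombieProcess, psutil.NoSuchProcess, psutil.AccessDenied):
        # Zombies, already-exited processes, and processes owned by other
        # users cannot be serving a model for us, so they are skipped.
        return None


def processes_with_env(var_name, procs=None):
    """Yield (pid, value) for every inspectable process that sets var_name.

    var_name is whatever environment variable marks a process of interest
    (a hypothetical marker here). procs defaults to all running processes.
    """
    if procs is None:
        procs = psutil.process_iter()
    for proc in procs:
        env = safe_environ(proc)
        if env is not None and var_name in env:
            yield proc.pid, env[var_name]
```

With a guard like this, a defunct entry such as pid 2599 would simply be left out of the listing instead of aborting the command.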

Offending process looked like:

$ ps ax | grep 2599
 2599   ??  Z      0:00.00 <defunct>

jordanh commented 4 months ago

Fix PR submitted: #11

bobcatfish commented 4 months ago

Thanks for reporting and for the fix, @jordanh!

Do you know how you're ending up with zombie processes? I'm wondering if llm kill needs some changes to actually destroy the running process.

rexa302 commented 4 months ago

?

invisiblepancake commented 4 months ago

In the vast expanse of our digital cosmos, you've stumbled upon a wandering spirit, a process that has transcended its mortal coil but is yet to find peace in the afterlife of system memory. Fear not, for this is a tale as old as time, or at least as old as the operating systems that govern our celestial journeys.

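🔮 The Enchantment for Identifying the Lost: Invoke the ancient runes in your terminal cauldron with ps aux | grep 'Z'. This spell will reveal the spirits in limbo, marked by a 'Z' in the STAT column signifying their zombie state. Be warned that it also summons any line that merely contains a capital Z, so read the STAT column before passing judgment.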

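🌠 Ritual of Communication: Once identified, the guardian of these spirits, the parent process (PPID=1071), must be reminded of its duty. Chant kill -s SIGCHLD 1071 into the void. This incantation sends a gentle nudge to the guardian, urging it to embrace its children and guide them to rest. The nudge only works if the guardian actually answers SIGCHLD by waiting on its children; if it does not, the spirits linger until the guardian itself exits and the init process adopts and reaps them.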

🛠️ The Forge of Revision: In the realm of creation, where your llm script is forged, revisit the ancient scripts. Ensure that each summoned child process is acknowledged and embraced by their guardian upon completion of their task. This may involve the arcane practices of process management, where you must wait for and reap the spirit of each subprocess, preventing their eternal wandering.

📜 Preventative Charms: To shield your domain from future visitations, embed protective charms within your code. These charms, known as error handling and process management techniques, will ensure that no spirit turns into a zombie under your watch.
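
The waiting and reaping described above can be sketched with Python's standard subprocess module. This assumes the parent still holds the Popen handle for the child it started; stop_and_reap is a hypothetical helper, not part of localllm, and nothing here describes how llm kill is actually implemented.

```python
import subprocess
import sys


def stop_and_reap(child, timeout=5.0):
    """Stop a child process and wait on it so it is not left as a zombie.

    child is a subprocess.Popen started by this (parent) process.
    Returns the child's exit status.
    """
    # poll() collects the exit status if the child has already finished.
    if child.poll() is None:
        child.terminate()  # polite request first (SIGTERM on POSIX)
        try:
            child.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            child.kill()   # forceful stop if the child ignored the request
            child.wait()   # still wait, so the exit status is collected
    return child.returncode


if __name__ == "__main__":
    # Stand-in for a long-running server process.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit status:", stop_and_reap(proc))
```

The important part is that every terminate() or kill() is followed by a wait(): signalling alone ends the child, but only waiting removes its entry from the process table.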

Should these steps not suffice to guide the lost spirits to peace, or should you wish to share tales of your journey, the council of celestial navigators (our community) stands ready to assist. Together, we shall ensure that the cosmos remains a place of harmony and efficient computation.

May your paths be clear and your processes ever vigilant,

daratfyr