Ability to obtain current root process ID for a task

vvuk commented 3 weeks ago

A detailed description of the feature you would like to see added.

This was mentioned in #364 which was closed because the original issue author stopped needing it. I came to pueue via nushell, where it's a suggested way to work around nushell's lack of process management. And for the most part it's great, and far better than bash/zsh process management. However, I frequently just need the process ID of the background process I spawned.

I understand this isn't always a simple thing to provide; however, ultimately, there is a process that the pueue daemon launches for a task. That might be a shell if shell execute is used -- which I understand is supposed to be always, but it seems like this is skipped in some instances. When launching an app, I see the app process as a child of the daemon, without a shell in between. But either way, I'd like to easily obtain the process ID of whatever the thing is that the daemon launched, even if it's the shell. (Indeed I just tested -- if I include shell special characters like > then sh is launched)

Explain your usecase of the requested feature

Using pueue as a general rich background task manager. Sometimes the process ID is needed -- attaching a debugger/profiler, for example. It would be very convenient if this was easily available.

Alternatives

ps -ef | grep and then hunt for the process you want. Doable, but not pleasant.
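
For reference, the manual hunt above can be scripted. This is only a rough sketch, not something pueue provides: it assumes you already know the daemon's PID, and the helper names `children_of` and `list_children` are made up for illustration. It lists the processes whose parent PID matches the daemon.

```python
import subprocess


def children_of(ps_output, parent_pid):
    """Return (pid, command) pairs whose parent PID equals parent_pid.

    Expects lines of the form "<pid> <ppid> <command line>".
    """
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        pid, ppid, args = parts
        if not (pid.isdigit() and ppid.isdigit()):
            continue  # skip anything that is not a process row
        if int(ppid) == parent_pid:
            found.append((int(pid), args.rstrip()))
    return found


def list_children(parent_pid):
    """Run ps (header-less pid, ppid, args columns) and filter by parent."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return children_of(result.stdout, parent_pid)
```

This only finds direct children, so it returns the shell when one sits in between, and it can still race with processes that start or exit while ps is running.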

Additional context

No response

Nukesor commented 3 weeks ago

Good points, that feature request sounds pretty reasonable.

It's definitely possible to do, however a few things need to be done first.

Specifically #548, otherwise the message handlers won't have access to the process handles.

Once the refactoring has taken place, the next steps are rather straightforward.

We have to think about a good name for the command. Maybe just a ps or a task subcommand?
It needs to be decided which process should be shown. My approach would be to show the direct child, which is the shell in most cases. The reason for this is that shells do weird stuff, are platform specific and users can now use custom shell commands, which makes it even harder to get this right. To give a few examples:
- sh -c 'echo "test"' actually gets rid of the sh parent process, so that echo "test" runs as a direct subprocess of Pueue.
- sh -c 'echo "test" && echo "another test"' doesn't, and there's always the sh parent.
- sh -c './some_script.sh' starts a single sh shell as parent process, which then processes some commands from that script.
- sh -c 'echo "test" && ./some_script.sh' starts a sh parent process, which then contains another sh process that executes commands from that script.

Some shells start further subprocesses, some don't. It's basically up to the shell. Process handling in Windows is also completely different than in Linux and even though Mac process handling is similar, nobody is interested in writing a process handling library for it.

For these reasons, and for the sake of maintainability, I would strongly prefer to not do any shell-specific shenanigans to determine the first process that's not a shell, as this is very error-prone and it's easy to run into race conditions (e.g. looking up a subprocess when the parent just finished and is destroyed). The user can then use the shell's or root process's pid to do further processing as they like.
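
To see the shell-dependent behaviour from the examples above on your own machine, here is a small sketch. It is not how Pueue does it: the helper name `direct_child_name` is made up, the name lookup reads /proc and therefore only works on Linux, and the short sleep before the lookup is a crude way to let the shell exec or fork first.

```python
import subprocess
import time
from pathlib import Path


def direct_child_name(shell_command, settle=0.2):
    """Spawn `sh -c shell_command` and return (pid, name) of the direct child.

    The pid is the only one the spawning process knows about. The name is
    read from /proc (Linux only) and is None if it cannot be read.
    """
    proc = subprocess.Popen(["sh", "-c", shell_command])
    try:
        time.sleep(settle)  # give the shell time to exec or fork
        try:
            name = Path(f"/proc/{proc.pid}/comm").read_text().strip()
        except OSError:
            name = None  # no /proc here, or the process is already gone
        return proc.pid, name
    finally:
        proc.kill()  # stops the direct child only, not its descendants
        proc.wait()


if __name__ == "__main__":
    for command in ("sleep 1", "sleep 1 && true"):
        print(command, "->", direct_child_name(command))
```

Which name shows up for each command depends on the shell that sh points to, which is exactly why guessing the "real" process is fragile.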

Nukesor commented 1 week ago

I finished the refactor in #547.

Nukesor commented 1 week ago

Anyhow, I would really like to get some feedback on my previous message regarding the design of the actual requested feature.

Is the feature even of any use to you, if you cannot tell whether you got a pid of a shell or a shell's subprocess?

Nukesor commented 2 days ago

Ping @vvuk

vvuk commented 2 days ago

Yep, will take a look today! Was out most of last week :)

vvuk commented 1 day ago

Whoops I missed the question, sorry -- got this confused with another notification.

> Is the feature even of any use to you, if you cannot tell whether you got a pid of a shell or a shell's subprocess?

Yep! I can take that into account, and for the most part it's manual usage when I'd want the pid (e.g. the debugger attach case). I can take care to only launch things that won't create subshells or just deal with it; either way there would be a human in the loop vs. something automated being surprised to discover a shell/not-shell.

vvuk commented 1 day ago

Oh I didn't get it confused -- is the stack overflow issue still there? Happy to take a look and see if I can help.

Nukesor commented 11 hours ago

Thanks, I got the stack overflow issue fixed. It was a bad case of recursion across 4 corners :D

So, I'm wondering how to best implement this. I'm thinking of adding a new pueue task function, which can be used to inspect a task.

By default, it would just print some info in a nice human-readable way. Something like:

$ pueue task show 1
Id: 1
Command: sleep 60 && ls -ahl | grep something
Original command: (only shown if pueue_aliases have been used)
Cwd: /tmp
Start: 2024-07-04 15:50:12
Group: default
State: Running
Root Pid: 50172
Label: special
Priority: 1000
Created at: 2024-07-04 15:49:43

pueue task show -j/--json would then show that info + environment variables as json output.

Would that work for you? You could then just extract the pid via pueue task show --json 1 | jq '.pid'
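
If jq isn't at hand, the same extraction could look roughly like this in a script. Note that the JSON output is only a proposal in this thread, so the shape is a guess: the "pid" key is taken from the jq example, the other fields in the sample are invented, and `extract_pid` is a hypothetical helper name.

```python
import json


def extract_pid(task_json):
    """Return the task's pid from the JSON text, or None if it has none.

    Assumes a top-level "pid" field, as in the jq example; a task that is
    not running is assumed to have the field missing or set to null.
    """
    task = json.loads(task_json)
    pid = task.get("pid")
    if isinstance(pid, bool) or not isinstance(pid, int):
        return None
    return pid


if __name__ == "__main__":
    # Invented sample, loosely based on the human-readable output above.
    sample = '{"id": 1, "state": "Running", "pid": 50172}'
    print(extract_pid(sample))
```

Returning None rather than raising keeps the "task is queued or already finished" case easy to handle for the caller.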
