jart / blink

tiniest x86-64-linux emulator
ISC License

`ptrace` support #56

Open trungnt2910 opened 1 year ago

trungnt2910 commented 1 year ago

This syscall is crucial for debugging support on blink, which may aid development in many scenarios.

While Linux ptrace is quite a unique call, it can be emulated using a technique called "cooperative debugging", used by the Darling project to emulate macOS ptrace without having to actually rely on the host's ptrace.

The approach relies on an internal signal, and I believe blink already uses one.

johnothwolo commented 1 year ago

Doing this might be problematic on macOS because of the entitlements needed to make the required Mach system calls. I actually tried to implement it in Noah on macOS Catalina.

Noah had every process running its own virtual processor with its own virtual address space, using page translation methods similar to blink's. So basically each process manages its own Linux process on a micro VM. The Linux syscall interrupts are captured and routed to macOS. The VM state is also copied when a process is forked.

My best approach was to manage every process's debugging state using a central debugging helper. Mach's vm_read/vm_write and MIG would be used to tell the tracee to go into an infinite loop, wait for messages, and continue. MIG would also be used to read the process's virtual memory.

That's just the Mac side of things. I imagine a cross platform solution would be extremely complicated.

trungnt2910 commented 1 year ago

> entitlements needed to make the required mach system calls.

All the operations in the documentation attached in my original post involve only sending a runtime signal, so the only call required is kill.

The debuggee then sends the required information to the debugger through some kind of connection.

johnothwolo commented 1 year ago

I think DarlingHQ uses a kernel module to achieve some of its ptrace functionality. IIRC, Darling also emulates Darwin's bsd threads in the lkm. I don't know if @jart plans to touch any kernel APIs.

trungnt2910 commented 1 year ago

> I think DarlingHQ uses a kernel module to achieve some of its ptrace functionality.

Wrong, at least since early 2022.

See: https://github.com/darlinghq/darling/issues/1093

> Darling also emulates Darwin's bsd threads in the lkm.

Half-true. DarlingHQ emulates BSD threads in its darlingserver, a kernel emulator server running wholly in userspace.

> touch any kernel APIs.

Their ptrace emulation does not involve any kernel API. It is a technique that only involves signaling, inter-process communication, and co-operation of the debuggee (so using this technique a debugger running on blink cannot ptrace any non-blink processes on the host).

johnothwolo commented 1 year ago

> Wrong, at least since early 2022.
>
> See: darlinghq/darling#1093

> Half-true. DarlingHQ emulates BSD threads in its darlingserver, a kernel emulator server running wholly in userspace.

Well that's pretty impressive stuff, I never knew they moved threads to userspace.

I understand that many darling components were moved to userspace, but my guess is that Blink wouldn't want to rely on kernel APIs completely. If that's the case then 'Half-true' wouldn't be good enough. I'm not a core developer, so that guess is just an educated guess.

> All the operations in the documentation attached in my original post involve only sending a runtime signal, so the only call required is kill.

According to DarlingHQ docs:

> kill(SIGSTOP):
>
> Send a RT signal to the debuggee that it should act as if SIGSTOP were sent to the process. We cannot send a real SIGSTOP, because then the debuggee couldn't provide/update register state to the debugger etc.

Firstly, an RT signal is a real-time signal, if I'm not wrong. You can't really send IPC/mach messages to a halted process.

Regardless, Blink could use CFMessagePortRef, NSPort, or MIG (the Mach Interface Generator) for macOS IPC (I haven't gone through blink code). The first two are the best options and won't require entitlements. I'm not entirely sure if MIG needs entitlements. However, any ptrace syscall invocation will need entitlements.

trungnt2910 commented 1 year ago

> According to DarlingHQ docs:
>
> > kill(SIGSTOP):
> >
> > Send a RT signal to the debuggee that it should act as if SIGSTOP were sent to the process. We cannot send a real SIGSTOP, because then the debuggee couldn't provide/update register state to the debugger etc.

(Emphasis mine)

> You can't really send IPC/mach messages to a halted process.

The process is not actually halted (from the host machine's perspective), but the emulated code (in DarlingHQ's case, macOS code, in Blink's case, the Linux binary) should stop executing. Therefore, while the emulated binary seems to have stopped, background emulator tasks should still continue working.
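To make the distinction concrete, here is a minimal single-process sketch in C. It is not Blink or Darling code: `struct Guest`, `RunGuest`, and the use of SIGUSR1 are assumptions made purely for illustration. The loop stands in for an emulator's main loop. Once the stop request arrives, guest instructions stop advancing, but the host process keeps running and could use that time to talk to a debugger.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>

/* Hypothetical stand-in for an emulated CPU; not Blink's real machine type. */
struct Guest {
  uint64_t pc;       /* guest program counter */
  uint64_t executed; /* guest instructions run so far */
  uint64_t serviced; /* host-side ticks spent while the guest was "stopped" */
};

static volatile sig_atomic_t g_guest_stopped;

/* Handler for the (assumed) internal stop signal: only sets a flag. */
static void OnStopRequest(int sig) {
  (void)sig;
  g_guest_stopped = 1;
}

/* Runs `ticks` iterations of the host loop. At iteration `stop_at` the
 * process signals itself, standing in for a debugger calling kill().
 * Returns how many guest instructions were executed. */
uint64_t RunGuest(struct Guest *g, int ticks, int stop_at) {
  g_guest_stopped = 0;
  signal(SIGUSR1, OnStopRequest);
  for (int i = 0; i < ticks; i++) {
    if (i == stop_at) raise(SIGUSR1);
    if (g_guest_stopped) {
      /* The guest is frozen, but the host process is alive: this is
       * where register or memory requests could be answered. */
      g->serviced++;
      continue;
    }
    g->pc++; /* "execute" one guest instruction */
    g->executed++;
  }
  return g->executed;
}
```

Calling `RunGuest(&g, 10, 4)` on a zeroed `struct Guest` executes four guest instructions and then spends the remaining six iterations in the stopped branch, which a real SIGSTOP would not allow.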

> however, any ptrace syscall invocation will need entitlements.

Like I said, we're not actually invoking ptrace of any kind on the host while using this technique. So blink should not need any entitlement.

johnothwolo commented 1 year ago

> Like I said, we're not actually invoking ptrace of any kind on the host while using this technique. So blink should not need any entitlement.

Let me rephrase. On macOS you can halt a process without ptrace using task_suspend. However, any application that modifies another process's state at runtime is considered a debugger by Apple. This includes task_* syscalls, vm_* syscalls and (of course) ptrace, hence entitlements being a concern. I don't make the rules, Apple does.

Secondly:

> The process is not actually halted (from the host machine's perspective), but the emulated code (in DarlingHQ's case, macOS code, in Blink's case, the Linux binary) should stop executing. Therefore, while the emulated binary seems to have stopped, background emulator tasks should still continue working.

Wine and DarlingHQ (probably blink too) are not emulators, they're compatibility layers. I like to refer to them as userspace hypervisors, even though that's probably wrong. They're like execution coordinators, kinda like ld-linux on Linux or dyld on macOS.

Listen dude, I'm not here to argue. I'm here to list the points and possibilities in this issue (help solve the issue) and learn new things. This back and forth is counterproductive.

trungnt2910 commented 1 year ago

> Wine and DarlingHQ [...] are not emulators, they're compatibility layers.

I am aware of WINE's name as well as the fact that DarlingHQ loads a macOS binary directly into a Linux process's address space and executes instructions directly on the host's CPU. However, to simplify things I (and the DarlingHQ project itself) use the word "emulation" and its other forms. The term is commonly understood as "instruction set emulation" but depending on the context it can also mean "system call emulation" or "\<insert something that needs compatibility> emulation".

> probably blink too

FYI, blink is an emulator. Unlike "compatibility layers", which run code directly on the host's CPU, blink interprets binaries that use a few supported x86_64 instruction sets.

> any application that modifies another process's state at runtime is considered a debugger by Apple.

Again, quoting the documentation:

> Debugging support in Darling makes use of what we call "cooperative debugging". It means the code in the debuggee is aware it's being debugged and actively assists the process.

So, imagine a scenario where blink has ptrace implemented using this "cooperative debugging" technique. This hypothetical future version of blink is emulating a Linux debugger (for example, lldb) that is debugging another blink-emulated process on a macOS host. This is what happens, in order:

  1. The blink-emulated debugger issues a syscall to ptrace (Linux syscall 101), with the request PTRACE_ATTACH.
  2. blink detects this syscall through its JIT/interpreter (just like any other Linux syscall), and transfers control to the ptrace syscall emulation function.
  3. blink's syscall emulation function (hypothetically, SysPtrace) calls the POSIX function kill in macOS's libSystem to send a blink-reserved signal (according to the current README, it's SIGSYS) to the other blink process that is emulating the desired debuggee. SysPtrace also opens a UNIX socket (or any better IPC channel) for communication.
  4. The other blink process receives this SIGSYS and stops the process emulation. It parses the additional data sent along with the signal, somehow realizes that there is a request for it to be debugged and connects to the IPC channel.
  5. The first blink process notices that the debuggee has connected and returns from SysPtrace.
  6. The emulated Linux debugger sends further ptrace requests.
  7. SysPtrace sends these ptrace requests through the IPC channel.
  8. The second blink process handles all these ptrace messages by reading and/or modifying its own state. The modification should be restricted to blink-emulated memory, registers, and other program states managed by blink, and may not touch any macOS system internals.

Therefore, the hypothetical future blink binary should only need to be able to use kill and UNIX sockets. The debuggee's process state is cooperatively modified/reported by the second blink process.
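The flow above can be sketched with nothing but POSIX calls. The following C example is an illustration under stated assumptions, not Blink's or Darling's implementation: the message layout (`struct CoopMsg`), the request names, the use of SIGUSR1 in place of blink's reserved signal, and a pre-created `socketpair` in place of a named UNIX socket are all invented for the sketch. The debuggee is a forked child that owns some stand-in "registers" and edits them only in response to messages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wire format for this sketch; not from Blink or Darling. */
enum CoopRequest { COOP_PEEKREG = 1, COOP_POKEREG = 2, COOP_DETACH = 3 };
enum { COOP_NUM_REGS = 4 };
struct CoopMsg {
  int32_t request;
  int32_t reg;
  int64_t value;
};

static volatile sig_atomic_t g_attach_requested;

static void OnAttachSignal(int sig) {
  (void)sig;
  g_attach_requested = 1;
}

/* Debuggee side (steps 4 and 8): wait for the signal, then serve requests
 * by reading or modifying only its own emulated state. Never returns. */
static void RunDebuggee(int chan) {
  int64_t regs[COOP_NUM_REGS] = {0}; /* stand-in for emulated registers */
  struct sigaction sa = {0};
  sigset_t block, old, waitmask;
  struct CoopMsg m;
  char ready = 'R';

  sa.sa_handler = OnAttachSignal;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGUSR1, &sa, 0);
  sigemptyset(&block);
  sigaddset(&block, SIGUSR1);
  sigprocmask(SIG_BLOCK, &block, &old); /* avoid a lost wakeup */
  waitmask = old;
  sigdelset(&waitmask, SIGUSR1);
  if (write(chan, &ready, 1) != 1) _exit(1); /* handler is installed */
  while (!g_attach_requested) sigsuspend(&waitmask);
  sigprocmask(SIG_SETMASK, &old, 0);

  /* A real implementation would loop on short reads and writes. */
  while (read(chan, &m, sizeof m) == (ssize_t)sizeof m) {
    if (m.request == COOP_DETACH) break;
    if (m.reg < 0 || m.reg >= COOP_NUM_REGS) m.value = -1;
    else if (m.request == COOP_POKEREG) regs[m.reg] = m.value;
    else m.value = regs[m.reg];
    if (write(chan, &m, sizeof m) != (ssize_t)sizeof m) break;
  }
  _exit(0);
}

/* Debugger side (step 7): forward one request and wait for the reply. */
static int64_t CoopCall(int chan, int request, int reg, int64_t value) {
  struct CoopMsg m = {request, reg, value};
  assert(reg >= 0 && reg < COOP_NUM_REGS);
  if (write(chan, &m, sizeof m) != (ssize_t)sizeof m) return -1;
  if (read(chan, &m, sizeof m) != (ssize_t)sizeof m) return -1;
  return m.value;
}

/* Attaches with kill(), pokes a register, reads it back, detaches. */
int64_t CoopDebugDemo(void) {
  int sv[2];
  char ready;
  pid_t pid;
  int64_t got;
  struct CoopMsg bye = {COOP_DETACH, 0, 0};

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return -1;
  if ((pid = fork()) == -1) return -1;
  if (pid == 0) {
    close(sv[0]);
    RunDebuggee(sv[1]);
  }
  close(sv[1]);
  if (read(sv[0], &ready, 1) != 1) return -1;
  if (kill(pid, SIGUSR1) == -1) return -1; /* step 3: the only signal call */
  CoopCall(sv[0], COOP_POKEREG, 2, 0xBEEF);
  got = CoopCall(sv[0], COOP_PEEKREG, 2, 0);
  if (write(sv[0], &bye, sizeof bye) != (ssize_t)sizeof bye) got = -1;
  close(sv[0]);
  waitpid(pid, 0, 0);
  return got;
}
```

`CoopDebugDemo()` returns the value read back from the debuggee, which is the `0xBEEF` that was poked in. The debugger side never touches the other process's memory directly, which is why no host debugging API is involved. The one-byte "ready" handshake exists only because this sketch forks the debuggee itself; a real debuggee would have its handler installed from startup.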

This "cooperative debugging" carries one limitation that I mentioned before:

> (so using this technique a debugger running on blink cannot ptrace any non-blink processes on the host).

> Listen dude, I'm not here to argue. I'm here to list the points and possibilities in this issue

I'm not here to spark a debate either. I'm just clarifying my points for you, jart, other people who might help implement this in the future, and anyone who visits this discussion.

johnothwolo commented 1 year ago

I understand how cooperative debugging works. I understand what could be done to emulate the syscall ptrace.

Like I said I've tried it before; more specifically tried and failed. What I don't understand is how we ended up here. I mentioned a possibility of something and you strike it down as impossible. I state the reason why I say that, and you maintain your stance. And you bring in examples from darling on Linux when I'm talking about macOS.

Look, none of us knows everything here. However, I find your responses (or rebuttals) rather arrogant. I have a hard time believing you don't intend to spark a debate.

Regardless, I'll maintain that there MAY BE A POSSIBILITY that entitlements will be needed. What if @jart decides to just halt the process and read its memory, because that'd be easier? We wouldn't know, would we? It's all conjecture what is required because we don't know what's going to be done.

trungnt2910 commented 1 year ago

There might have been a misunderstanding in this conversation.

Sure, I do acknowledge that if you directly use macOS APIs like your attempt described in the first comment, there may be a possibility that entitlements are needed.

I also acknowledge that the approach I mentioned in the original post is just a suggestion. jart may use it, use your approach on macOS, or use a totally different one.

The misunderstanding here might have been that I was suggesting a POSIX-only approach for blink, and that the project could implement something similar to what DarlingHQ docs said. When you mentioned "Doing this", you might have referred to your own attempt, which involves using macOS-specific privileged APIs, while I misunderstood it as the "cooperative debugging" approach itself (which only uses POSIX kill).

Apart from some off-topic comments about darling's kernel server here, all of my comments below compare blink to darling in a few aspects to note that it's possible for blink to follow the "cooperative debugging" approach, rather than to focus on how darling works on Linux.

Again, I do acknowledge that translating ptrace to macOS-specific APIs is a possible approach on the macOS side, and using the approach you attempted will have limitations that you stated. Sorry for having misunderstood your previous comments.

jart commented 1 year ago

I'm still catching up on this thread. Wow, a lot of lively discussion here. I'm so happy to see how passionate folks are about their visions for Blink's future. Please remember that we're all friends here, and that nothing is impossible. Please keep that in mind when writing your communications. Blink is very POSIX focused, but I don't want to be limited to POSIX. I'd be happy to see integration with Mach APIs happen if that makes Blink better. ptrace() is a powerful and cool Linux-specific API that'd be great to have available on macOS. The one Linux API that I think is even cooler is SECCOMP BPF; I have a real need for it, since it's what redbean uses to provide sandboxing on Linux. Debugging processes external to Blink has been less of a focus until now, since the motivation has been more geared towards improving the debuggability of things that run inside Blink.

johnothwolo commented 1 year ago

@trungnt2910 I never saw your reply. I also apologize for the misunderstanding on my part. Stooping low (argumentatively) to prove a point was and will always be counterproductive.

jart commented 1 year ago

One thing we discussed recently on Discord is that having ptrace could potentially pave the way for running GDB inside Blink. Obviously Blinkenlights is the native debugger experience that fully works inside and outside of Blink. But having GDB would be nice too. The canonical way to do GDB though would be to implement the GDB server protocol into Blink, so that GDB can connect via a TCP socket similar to what QEMU does.
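For a sense of what the GDB server protocol looks like on the wire, here is a small C sketch of its packet framing. In the GDB Remote Serial Protocol, each packet is sent as `$`, the payload, `#`, and a two-digit hex checksum equal to the sum of the payload bytes modulo 256. The function name `FormatRspPacket` is made up for this example, and the sketch deliberately skips the escaping that binary payloads need, as well as acknowledgements and the TCP transport.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Frames `payload` as a GDB remote protocol packet: $<payload>#<checksum>.
 * The checksum is the modulo-256 sum of the payload bytes, written as two
 * lowercase hex digits. Returns `out`, or NULL if `out` is too small.
 * Sketch only: payloads containing '$', '#', '}' or '*' need escaping,
 * which is not handled here. */
char *FormatRspPacket(const char *payload, char *out, size_t outsize) {
  unsigned char sum = 0;
  size_t n = strlen(payload);
  if (outsize < n + 5) return NULL; /* '$' + payload + '#' + 2 hex + NUL */
  for (size_t i = 0; i < n; i++) sum += (unsigned char)payload[i];
  snprintf(out, outsize, "$%s#%02x", payload, (unsigned)sum);
  return out;
}
```

For example, the `g` request (read general registers) is framed as `$g#67`, because the ASCII code of `g` is 0x67, and an `OK` reply is framed as `$OK#9a`.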

trungnt2910 commented 1 year ago

> The canonical way to do GDB though would be to implement the GDB server protocol into Blink

This is a very ad-hoc approach to providing debugging support for Blink. First, we have 1000 lines of code to bring strace support natively to Blink. Then, we would implement an entire protocol specific to GDB.

If someone wants to use, for example, LLDB instead, or use some framework-specific debuggers like the .NET Core Debugger (vsdbg), they would have to implement the same thing over and over again.