Starting frida might fail after device reset

zner0L commented 1 year ago

In the emulator example, the device is first ensured and then reset to a snapshot. After the reset, the frida server sometimes cannot be started again in ensureFrida. On my Linux machine, this occasionally fails with device offline (see the detailed error message below), though it works most of the time.

(node:20114) UnhandledPromiseRejectionWarning: Error: Command failed with exit code 1: adb shell "nohup /data/local/tmp/frida-server >/dev/null 2>&1 &"
adb: device offline
    at makeError (file:///home/zner0L/Programming/Activism/TrackingWeasel/appstraction/node_modules/execa/lib/error.js:59:11)
    at handlePromise (file:///home/zner0L/Programming/Activism/TrackingWeasel/appstraction/node_modules/execa/index.js:119:26)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at Object.ensureFrida (/home/zner0L/Programming/Activism/TrackingWeasel/appstraction/src/android.ts:54:13)
    at Object.ensureDevice (/home/zner0L/Programming/Activism/TrackingWeasel/appstraction/src/android.ts:89:9)
    at null.<anonymous> (/home/zner0L/Programming/Activism/TrackingWeasel/appstraction/examples/android-emulator.ts:17:5)

Funnily enough, now I cannot reproduce this anymore.

zner0L commented 1 year ago

I tried changing the emulator arguments a bit and realized that for resetting we need -writeable-system activated. We should mention this in the docs. Removing it produces a different error, however:

Error: Failed to load snapshot: KO: This is a disk-only snapshot. Revert to it offline using qemu-img.

baltpeter commented 1 year ago

We don't need -writeable-system anymore (and never did for appstraction). #27 implements the solution from https://github.com/tweaselORG/meta/issues/18#issuecomment-1437057934 also for emulators. That doesn't need a writable system.

zner0L commented 1 year ago

I now get this consistently in https://github.com/tweaselORG/cyanoacrylate, if I try to start an analysis with the frida capability:

(node:101910) UnhandledPromiseRejectionWarning: Error: Command failed with exit code 1: adb shell "nohup /data/local/tmp/frida-server >/dev/null 2>&1 &"
error: closed
    at makeError (file:///home/zner0L/Programming/Activism/TrackingWeasel/appstraction/node_modules/execa/lib/error.js:59:11)
    at handlePromise (file:///home/zner0L/Programming/Activism/TrackingWeasel/appstraction/node_modules/execa/index.js:119:26)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at Object.ensureFrida (/home/zner0L/Programming/Activism/TrackingWeasel/appstraction/dist/src/android.ts:54:13)
    at Object.ensureDevice (/home/zner0L/Programming/Activism/TrackingWeasel/appstraction/dist/src/android.ts:89:9)
    at null.<anonymous> (/home/zner0L/Programming/Activism/TrackingWeasel/cyanoacrylate/examples/test.tmp.ts:33:5)

baltpeter commented 1 year ago

We should start by getting rid of the nohup and >/dev/null 2>&1 &. That's really only useful for running Frida interactively. Here, we very much do want to see the output (especially errors).
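To illustrate what "seeing the output" could look like, here is a minimal sketch that runs a command with Node's built-in child_process and puts stderr into the thrown error. The helper name runVisible is hypothetical and not part of appstraction, which uses execa. It only suits commands that terminate on their own.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Hypothetical helper: run a command and surface its stderr on failure
// instead of discarding it with `>/dev/null 2>&1`.
async function runVisible(file: string, args: string[]): Promise<string> {
    try {
        const { stdout } = await execFileAsync(file, args);
        return stdout;
    } catch (err) {
        // The promisified execFile attaches stdout and stderr to the rejection.
        const { stderr = '', stdout = '' } = err as { stderr?: string; stdout?: string };
        throw new Error(`${file} ${args.join(' ')} failed:\n${stderr || stdout}`.trim());
    }
}
```

With this, a failure such as adb: device offline would end up in the error message rather than being swallowed.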

zner0L commented 1 year ago

This doesn't really change much. It is really weird: if I use the command in the console directly, it never runs into any problems. But when I start it from cyanoacrylate, I always get this error or device offline. This strikes me as some kind of privilege problem or something. Do you have any idea?

baltpeter commented 1 year ago

You've changed it to await execa('adb', ['shell', '/data/local/tmp/frida-server']);?

Are you using an emulator or a physical device? Unfortunately, I have seen broken snapshots in the emulator many times.

But that shouldn't give you device offline, especially not right after awaitAdb().

The device offline thing seems like a timing problem. Try inserting a pause(1000) before the Frida call to check that.
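A rough sketch of that check, going one step beyond a fixed pause by polling until adb reports the device as online. The helpers pause, waitUntil and adbIsOnline are written here for illustration and are assumptions, not appstraction's actual implementation. adb get-state is a real adb command that prints device once the device is online.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Resolve after `ms` milliseconds.
const pause = (ms: number): Promise<void> => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `check` until it returns true or the attempts are used up.
async function waitUntil(check: () => Promise<boolean>, attempts = 10, intervalMs = 500): Promise<boolean> {
    for (let i = 0; i < attempts; i++) {
        if (await check()) return true;
        if (i < attempts - 1) await pause(intervalMs);
    }
    return false;
}

// `adb get-state` prints "device" when the device is online and exits
// non-zero when no device is reachable.
async function adbIsOnline(): Promise<boolean> {
    try {
        const { stdout } = await execFileAsync('adb', ['get-state']);
        return stdout.trim() === 'device';
    } catch {
        return false;
    }
}

// Usage (before the Frida call): await waitUntil(adbIsOnline);
```

If the failures disappear with this in place, that would support the timing theory.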

As for the other thing, not sure.

baltpeter commented 1 year ago

I'm currently doing some fairly unimportant iOS reverse-engineering. Should I continue with that or rather look into this? I've also noticed that starting Frida is flaky.

zner0L commented 1 year ago

I think this is more important tbh.

baltpeter commented 1 year ago

> You've changed it to await execa('adb', ['shell', '/data/local/tmp/frida-server']);?

Oh, wait. That's not a good idea, that will wait forever for the Frida process to terminate…

You could not await it, but then we have the same problem as with the Objection process (we're leaking that, and the program will never exit by itself, you need to Ctrl+C).

I just noticed that frida-server has a --daemonize option, which sounds like what we're looking for:

  -D, --daemonize                       Detach and become a daemon

But while that does detach when I'm inside an adb shell, adb shell /data/local/tmp/frida-server -D doesn't detach for me.

baltpeter commented 1 year ago

This would work:

const proc = execa('adb', ['shell', '/data/local/tmp/frida-server --daemonize'], { detached: true });
proc.unref();

But it isn't exactly great either. While appstraction won't wait for the adb shell process anymore, we're still leaking it.

baltpeter commented 1 year ago

This is really odd. With await execa('adb shell "nohup /data/local/tmp/frida-server >/dev/null 2>&1 &"', { shell: true });, everything works fine for now. But almost everything else I've tried (even just await execa('adb shell "nohup /data/local/tmp/frida-server &"', { shell: true });) tends to hang forever in ensureFrida().

baltpeter commented 1 year ago

OK, after an adb reboot, I have reproduced the device offline problem once. Re-running immediately afterwards did work.

baltpeter commented 1 year ago

> This would work:
>
> const proc = execa('adb', ['shell', '/data/local/tmp/frida-server --daemonize'], { detached: true });
> proc.unref();
>
> But it isn't exactly great either. While appstraction won't wait for the adb shell process anymore, we're still leaking it.

I have missed the obvious and clean way. :D

We use const proc = execa('adb', ['shell', '/data/local/tmp/frida-server', '--daemonize']);, and then after the fridaIsStarted check, we proc.kill(); (this only applies to the adb shell process, Frida will still continue to run since we daemonized it).

baltpeter commented 1 year ago

But now it sometimes fails to start Frida (the timeout runs out). Re-trying again immediately after does work. sigh

I guess we'll have to build in retries?
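A minimal sketch of what built-in retries could look like. The retry function is hand-rolled for illustration and is not a library API, and startFrida in the usage comment is a hypothetical stand-in for whatever starts and checks Frida.

```typescript
// Call `fn` until it succeeds, at most `retries` additional times after the
// first attempt, waiting `delayMs` between attempts.
async function retry<T>(fn: (attempt: number) => Promise<T>, retries = 3, delayMs = 1000): Promise<T> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= retries + 1; attempt++) {
        try {
            return await fn(attempt);
        } catch (err) {
            lastError = err;
            // Only wait if another attempt will follow.
            if (attempt <= retries) await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
    }
    // All attempts failed: rethrow the most recent error.
    throw lastError;
}

// Usage (startFrida is hypothetical): await retry(() => startFrida(), 5);
```

Since re-trying immediately afterwards has worked so far, even a small number of retries with a short delay might be enough.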

baltpeter commented 1 year ago

> We use const proc = execa('adb', ['shell', '/data/local/tmp/frida-server', '--daemonize']);, and then after the fridaIsStarted check, we proc.kill(); (this only applies to the adb shell process, Frida will still continue to run since we daemonized it).

Ugh. The problem with that is that (for some reason…), the proc.kill() sometimes also kills the agent, which results in this error if any Frida function is called:

node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Unable to connect to remote frida-server]

baltpeter commented 1 year ago

> But while that does detach when I'm inside an adb shell, adb shell /data/local/tmp/frida-server -D doesn't detach for me.

I have found a way to get that to detach, after all: -x (await execa('adb', ['shell', '-x', '/data/local/tmp/frida-server', '--daemonize']);).

According to the help:

-x: disable remote exit codes and stdout/stderr separation
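Put together as a small sketch: the function and type names below are made up for illustration, and the runner is injectable so the call can be swapped for execa or a fake in tests. The default runner uses Node's built-in execFile rather than execa, which is an assumption made to keep the example dependency-free.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

// Anything that can run a command; execa('adb', args) would fit as well.
type Runner = (file: string, args: string[]) => Promise<unknown>;

const execFileAsync = promisify(execFile);
const defaultRunner: Runner = (file, args) => execFileAsync(file, args);

// Arguments for starting frida-server as a daemon via `adb shell -x`.
const fridaStartArgs = (serverPath = '/data/local/tmp/frida-server'): string[] => [
    'shell',
    '-x',
    serverPath,
    '--daemonize',
];

// Resolves once the adb shell command has returned.
async function startFridaDaemon(run: Runner = defaultRunner): Promise<void> {
    await run('adb', fridaStartArgs());
}
```

Because -x lets the adb shell call return after frida-server has daemonized, the call can be awaited without leaking the adb process.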

baltpeter commented 1 year ago

Great, now I broke my emulator snapshot. That happens way too often. Why are they so fragile? :(

baltpeter commented 1 year ago

@zner0L I've added a commit (a4b0b8b0b164135538541dff02b9e4d84a60dca6) to #42 that uses p-retry to retry starting Frida if it fails. Does that solve the problem for you?

zner0L commented 1 year ago

Nice! That seems to have fixed it. I tested it by running ensureDevice and resetDevice very quickly after one another which previously caused the problem to occur, but now it seems to have vanished. Thanks.

zner0L commented 1 year ago

Ok, frida doesn't give me any trouble anymore, but suddenly the same error occurs *every time* it tries to set up the WireGuard proxy. Same errors, either error: closed or error: device offline.

I even started to use wait-for-device, which should actually prevent these kinds of errors, but still. Of course, if I run the commands in my shell, they work just fine with the same emulator.

tweaselORG/appstraction issue #32