tweaselORG / cyanoacrylate

Toolkit for large-scale automated traffic analysis of mobile apps on Android and iOS.
MIT License
5 stars 1 forks source link

Stdout stream is not flushed deterministcally for mitmdump #5

Closed zner0L closed 1 year ago

zner0L commented 1 year ago

While trying to write a proper example I ran into an issue, where the program would timeout all the time while waiting for mitmproxy to start. After a lot of investigation, it turns out that the problem is detecting whether mitmproxy started already. Right now, we do this by waiting for the start text to appear in the output, e.g. "listening at". But this fails often, because the stdout stream is not flushed before the timeout runs out, if nothing new is added to the output, leaving the awaitProcessStart to timeout. I have not found a way to make execa to flush the stream earlier.

To resolve this, I either

zner0L commented 1 year ago

Actually, it seems like the flushing is deterministic. If I output the chunks, the all end after HTTP requests in the mitmdump output, e.g. (the numbers are added by me to count the chunks):

0 [17:54:02.251] HTTP(S) proxy listening at *:8080.
[17:54:02.792][127.0.0.1:51512] client connect
[17:54:02.820][127.0.0.1:51512] server connect 142.250.185.138:443
[17:54:03.451][127.0.0.1:51524] client connect
[17:54:03.583][127.0.0.1:51524] server connect 216.58.212.170:443
[17:54:05.194][127.0.0.1:51528] client connect
[17:54:05.227][127.0.0.1:51528] server connect 81.200.195.199:443
[17:54:05.353][127.0.0.1:51534] client connect
[17:54:05.358][127.0.0.1:51546] client connect
127.0.0.1:51528: GET https://81.200.195.199/addons/fachkonfig-utf8.cfg
              << 200 OK 19.1k

1 [17:54:05.389][127.0.0.1:51534] server connect 81.200.195.199:443
[17:54:05.398][127.0.0.1:51546] server connect 23.42.21.134:443
127.0.0.1:51546: HEAD https://23.42.21.134/addons/dbmobile.cfg
              << 200 OK 0b

2 127.0.0.1:51534: GET https://81.200.195.199/addons/bhflageplaene.cfg
              << 200 OK 1.9k

I can provoke the flushing if I open an App that connects to mitmproxy, but if we leave the device/emulator alone, we only get the first chunk if the phone happens to make a HTTP request.

baltpeter commented 1 year ago

Oh yeah, I remember running into the same problem for the data safety label analysis. Probably something changed in mitmproxy (or Node/execa?) since I originally wrote the code.

My "solution" back then was to increase the timeout to a ridiculous value, it did always flush after a fairly long time:

https://github.com/datenanfragen/android-data-safety-label-analysis/blob/e7cea9f5df4e0d212631760e04d784b8285eb7ed/src/traffic.ts#LL93C17-L93C91

baltpeter commented 1 year ago

a different way to know mitmproxy is running

I mean, you could check that the port is closed beforehand and then wait for it to be opened after launching mitmproxy… But I'm not sure whether "port open" == "mitmproxy ready".

zner0L commented 1 year ago

Weirdly enough, if I add process.stdout in execa directly, it doesn't seem to have this flushing problem:

execa('mitmdump', ['-w', flowsOutputPath], { stdout: process.stdout });

However, for weird reasons we cannot just add our own streams into that to test if that would solve the problem (see https://github.com/sindresorhus/execa/issues/81).

zner0L commented 1 year ago

I think, the cleanest way to do this would be to write an addon for mitmproxy that emits the lifecycle events as IPC events on a socket. But this would be some additional work. What do you think, @baltpeter?

zner0L commented 1 year ago

This would also enable us to detect errors, such as a misconfigured TLS encryption.

baltpeter commented 1 year ago

I think, the cleanest way to do this would be to write an addon for mitmproxy that emits the lifecycle events as IPC events on a socket. But this would be some additional work. What do you think, @baltpeter?

I mean, that would work and be robust of course. But that seems like a lot of work just to know whether mitmproxy has started. And it would require us to carry around that addon (which is exactly what we would ideally like to avoid at some point with the HAR addon)…

baltpeter commented 1 year ago

This would also enable us to detect errors, such as a misconfigured TLS encryption.

OK, that's a fair point. It would be especially nice if the user could see when the cert pinning bypass fails.

baltpeter commented 1 year ago

I'm not happy about the added complexity, but I'll leave the decision to you. If you think that's the best way forward, that's fine. (Have you checked whether mitmproxy might have a similar feature in core already?)

zner0L commented 1 year ago

I don’t think it should be too complicated. IPC seems easy enough (though I worry about Windows): https://stackoverflow.com/questions/23804434/python-process-forked-by-nodejs-alternative-to-process-send-for-python

I’ll give it a try.

zner0L commented 1 year ago

Where should I put the addon? src seems wrong smh.

baltpeter commented 1 year ago

Can we make that a separate Python package that the user can install as a dependency with pip and so that it can be useful for others? Or would that be too annoying?

zner0L commented 1 year ago

Ok, so this stackoverflow made me think using the node IPC funtions is easy, even though it is not offically supported. And it is very easy, in fact. I can just write to the correct file descriptor in python and the message is passed to the parent node process:

import os

os.write(3, bytes('{"test" : "This is a test"}\n', "utf8"))

The parent looks like this:

const { spawn, } = require('child_process');

const proc = spawn('python', ['./child.py'], {
    stdio: ['pipe', 'pipe', 'pipe', 'ipc'],
});
proc.on('message', (msg) => console.log(msg)); // { test: "This is a test" }

This is all fine and good on linux, but when it comes to Windows it is an entirely different and even less supported story. I am unsure if I should pursue this, whether we even want to support Windows or if I should just use sockets.

zner0L commented 1 year ago

Ok, having investigated a little more, I found out that nodejs uses libuv to create the pipes, which on Windows creates a named pipe. This is somehow connected to the file descriptor in node which is OS-independent (it is just iterated) and which we do know. But I have no idea how libuv gets the pipe out of that again. nodejs calls uv_pipe_open(), so I guess that could do it. I was a little hopeful when I found https://github.com/saghul/pyuv, but this is no longer actively maintained and fails to build (see https://github.com/saghul/pyuv/issues/270#issuecomment-925106097).

So, having tried quite a lot, I think we will have to give up Windows support for now. I haven't tested the rest of the code on Windows, yet, anyway. I guess as a workaround we can also skip waiting for mitmproxy on Windows and just assume it started correctly if the process didn't fail, but I really dislike that.

zner0L commented 1 year ago

Because python package are really bad for distributing scripts, I don’t see a great way to distribute the addon, for now I just put it in a different repository: https://github.com/tweaselORG/mitmproxy_ipc_events_addon