postlund / pyatv

A client library for Apple TV and AirPlay devices
https://pyatv.dev
MIT License

Time to figure out where to go from here: MRP is no more(?) #1220

Closed postlund closed 2 years ago

postlund commented 2 years ago

What to investigate

Ok, so after reading this comment: https://github.com/evandcoleman/node-appletv/issues/53#issuecomment-881627661, it's pretty much confirmed that MRP is removed in tvOS 15. It's not possible to tell yet if Apple is phasing it out completely; I guess time will tell. This however means that we no longer have a working protocol (stack) to control and retrieve metadata from Apple TVs. I wouldn't say we're back at square one, as we already know that they are using Companion (of which I have already reverse engineered the most important parts), possibly in conjunction with AirPlay 2. But it's not entirely clear how the inner details work. So, in order not to get caught with our pants down around our ankles once tvOS 15 is released, focus must shift towards getting the missing pieces in place.

Due to vacation, work, family life and all that, I don't have the capacity to put in as much time as I would need to crack the remaining parts. So I urge anyone with some spare time to help me out with this. I will try to share my thoughts here...

What we know so far

The Companion protocol is used. I believe everything in the Control Center widget is communicated via Companion, so we should be able to replicate what's in there. I have started the work to add the remote control buttons I have found so far (the ones that overlap with the interface in pyatv) in #1218. Most buttons are covered there, but not all, e.g. there are no specific buttons for play or pause (just a play/pause toggle).
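For context, the buttons added in #1218 surface through pyatv's existing RemoteControl interface, so consumers shouldn't need to care which protocol ends up delivering them. A minimal sketch of driving them over Companion, assuming pairing has already been done (the credential string below is a placeholder):

    import asyncio
    import pyatv
    from pyatv.const import Protocol

    async def press_some_buttons(loop):
        # Look for devices announcing the Companion protocol
        confs = await pyatv.scan(loop, protocol=Protocol.Companion)
        if not confs:
            return

        conf = confs[0]
        conf.set_credentials(Protocol.Companion, "companion-credentials-from-pairing")

        atv = await pyatv.connect(conf, loop)
        try:
            # Same RemoteControl interface as with MRP
            await atv.remote_control.up()
            await atv.remote_control.select()
            await atv.remote_control.play_pause()  # no dedicated play/pause yet
        finally:
            atv.close()

    loop = asyncio.get_event_loop()
    loop.run_until_complete(press_some_buttons(loop))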

Events are supported by the Companion protocol, so it's technically possible to subscribe to certain events. I haven't looked into this, so I don't know how it works; the method is called _interest at least. I'm however not convinced that Companion is used for metadata updates (maybe for other things, like volume updates). The reason for my suspicion is that Companion is very lightweight by design, which is fine for general metadata (like title and artist) but not for artwork. Receiving a large chunk of artwork while navigating with touch controls (or any other controls) would induce a delay relative to what is shown on screen. Apple would never allow something like that, and in MRP they implemented a "transaction" scheme, making it possible to segment large data chunks into smaller ones so that higher priority messages (like touch events) can slip in between. This makes me believe that AirPlay 2 is used for metadata (including artwork), but I have not confirmed that.
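Whichever protocol ends up carrying metadata, it ultimately has to feed the same consumer-facing push updater interface in pyatv. As a reminder of what that side looks like (a sketch only, assuming an already connected atv object):

    from pyatv.interface import PushListener

    class PlayingListener(PushListener):
        """Receives play state updates regardless of which protocol produced them."""

        def playstatus_update(self, updater, playstatus):
            print("Now playing:", playstatus)

        def playstatus_error(self, updater, exception):
            print("Push update failed:", exception)

    listener = PlayingListener()  # keep a reference, pyatv stores listeners weakly
    atv.push_updater.listener = listener
    atv.push_updater.start()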

AirPlay 1 supports events so that the server can inform about state changes, like volume updates or if audio should be paused due to someone pressing a physical button on the receiver. This channel however is only active for someone that is streaming, so you can't "eavesdrop" externally to see what is currently playing. Only the sender knows that (at least from what I have seen). Maybe this situation has changed somehow in AirPlay 2? First order of business is nonetheless: figure out what protocol is responsible for metadata. I see two potential ways of doing this:

I would be so happy đŸ„ł to get some help verifying these methods for me! The first step is to understand where the magic happens, so that effort can be put in the correct place.

Expected outcome

Basically how to deal with #1190, #1191 and #1192

elvisimprsntr commented 2 years ago

I am willing to help, but if the traffic is encrypted how is it possible to reverse engineer the protocol?

postlund commented 2 years ago

The encryption for both Companion and AirPlay 2 is already known and there are implementations for both, so we are basically at the stage where we can set up a connection but don't know what to send over it. AirPlay 2 encryption isn't implemented in pyatv yet, but would be a feasible task if deemed necessary.

At this stage I'm mostly interested in pinpointing the responsibility of each protocol, e.g. which one deals with metadata. Once we know, we can use various other methods to figure out what is going on, e.g. log output in system messages, disassembling the app or, in some cases, attaching a debugger. For MRP I managed to build a proxy (atvproxy) which could more or less MITM the traffic, making it available in clear text. Support for Companion is there as well, but I'm doing something wrong so a crypto signature validation fails, making the initiator (the iOS device) close the connection. So it's not very useful right now, but it could perhaps be fixed, which would make it hands down the best tool for the job.

The service you are looking at is related to Thread and I don't see that being very relevant right now. It's mainly Companion and AirPlay that I'm interested in for now, unless someone finds anything juicy within the other protocols.

mschwartz commented 2 years ago

FWIW, my hope is that the pyatv API remains the same, regardless of what's going on behind the scenes.

:)

I made the mistake of upgrading one of my ATVs to tvOS 15 and ran into this bug immediately.

postlund commented 2 years ago

@mschwartz That is something I can guarantee 😊

mschwartz commented 2 years ago

As an aside, is it possible to try one protocol after another until you get a valid connection, somewhere within pyatv instead of us having to pass in some parameters?

I was looking through your closed PRs and saw "breaking change" in one of them...

postlund commented 2 years ago

I think https://github.com/postlund/pyatv/issues/1209 is what you are looking for? Not implemented yet though.

mschwartz commented 2 years ago

https://github.com/postlund/pyatv/pull/1213

postlund commented 2 years ago

That will allow you to scan for a particular protocol (or several) and try them out one by one by manually calling pyatv.connect. With the connect strategy in #1209, you can just add all protocols at once, call connect and have pyatv ignore failing protocols for you. So it's easier to use in the end.
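In other words, with #1213 the fallback logic can be written by hand, roughly like this (a sketch, not a final API; it assumes credentials have already been stored on the scanned config for each protocol):

    import asyncio
    import pyatv
    from pyatv.const import Protocol

    async def connect_first_working(loop):
        # Try protocols in order of preference until one of them connects
        for protocol in (Protocol.MRP, Protocol.Companion):
            confs = await pyatv.scan(loop, protocol=protocol, timeout=3)
            if not confs:
                continue
            try:
                # Credentials for the protocol must be set on confs[0] before this
                return await pyatv.connect(confs[0], loop)
            except Exception:
                continue  # e.g. missing credentials or device rejected the connection
        raise RuntimeError("no supported protocol found")

    loop = asyncio.get_event_loop()
    atv = loop.run_until_complete(connect_first_working(loop))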

mschwartz commented 2 years ago

My current connect code looks like this (which is why I asked about the API changing): [screenshot of connect code]

postlund commented 2 years ago

If I understand you correctly, you want to try to connect with MRP first and then try, for instance, Companion if that doesn't work?

mschwartz commented 2 years ago

Something like that... or try Companion first after 15 comes out. I'd prefer to just use the latest pyatv and have it just work.

But I don't want to get in your way of making it better and better, either.

postlund commented 2 years ago

Since you basically can't do anything with Companion right now (no metadata at least), it's not that interesting to use. It's probably better to wait it out until I make that work, assuming I do. But the connect strategy will solve your "problem" automatically. It's also OK to have multiple protocols implementing the same functionality connected at the same time; the library will pick the most appropriate one when calling an API function. The best thing you can do is to provide credentials for both Companion and MRP.
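Providing credentials for both protocols on the same config looks roughly like this (the credential strings are placeholders obtained from pairing):

    from pyatv.const import Protocol

    # conf is one of the configs returned by pyatv.scan()
    conf.set_credentials(Protocol.MRP, "mrp-credentials-from-pairing")
    conf.set_credentials(Protocol.Companion, "companion-credentials-from-pairing")

    # pyatv then picks the most appropriate connected protocol per API call
    atv = await pyatv.connect(conf, loop)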

From what I've heard, tvOS 15 beta 3 doesn't announce the MRP service anymore, so no connection attempt to MRP should be made anymore thus leaving Companion for you to play with.

mschwartz commented 2 years ago

How different is the pyatv API going to be when using Companion?

Different, for example, to get playing info, or send button presses?

postlund commented 2 years ago

Once connected, nothing. It's exactly the same.

mschwartz commented 2 years ago

Awesome news!

Thanks for doing all this work.

postlund commented 2 years ago

I have tried to study traffic patterns a bit and, from what I've gathered so far, it doesn't look like Companion is involved when it comes to metadata. AirPlay looks far more promising for that. Some interesting things I can see in the logs are these (just an excerpt, not a complete log):

förval  23:36:20.142013+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Control pair-verify CU, type 3, count 1
förval  23:36:20.172756+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Control pair-verify CU, type 3, count 2
förval  23:36:20.182373+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] F 1
förval  23:36:20.186738+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] F 2
förval  23:36:20.190795+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Setup
förval  23:36:20.191427+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Created session [0x28A4] (RO=1 TS=0 HT=0 SAI=0 TP=None)
förval  23:36:20.191581+0200    TVAirPlay   [APReceiverSessionManager] Adding session 0x9482
förval  23:36:20.208431+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Record
förval  23:36:20.214185+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Sending system info update after session starts.
förval  23:36:20.214787+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Sent system info update.
förval  23:36:20.230458+0200    TVAirPlay   [APReceiverRequestProcessorAirPlay] [0x9482] Setup
förval  23:36:20.230703+0200    TVAirPlay   [AirPlay] [0x28A4] Sender wants a dedicated socket for RCS-1
förval  23:36:20.231115+0200    TVAirPlay   [APReceiverSessionManager] Register direct route 15 [0x9482-1]
förval  23:36:20.231448+0200    TVAirPlay   [APMediaDataControlServer] [0x29BE 'RCS-MediaRemote'] with 5-sec timeout created.
förval  23:36:20.231566+0200    TVAirPlay   [APReceiverRemoteControlSessionMediaRemote] [0x9265] Listening for connection on port 49204
förval  23:36:20.231676+0200    TVAirPlay   [APReceiverRemoteControlSessionMediaRemote] [0x9265] RCS-1 (direct) created for MediaRemote client, route 15, client UUID 'xxx'
förval  23:36:20.231991+0200    TVAirPlay   [APMediaDataControlServer] [0x29BE 'RCS-MediaRemote'] starting.
förval  23:36:20.232096+0200    TVAirPlay   [APReceiverMediaRemoteXPCService] Registred new commChannel 15
förval  23:36:20.237756+0200    TVAirPlay   [APMediaDataControlServer] [0x29BE 'RCS-MediaRemote'] accepted connection from 10.0.10.101:49166.

My educated guess is that the client (in this case my iPhone) sets up a new RTSP session and somehow requests a side channel used for media remote commands. Normally, in AirPlay v1, there are separate channels for timing, control (e.g. retransmit) and data (audio packets), but in AirPlay v2 there's also an event channel. Owntone recently implemented support for buttons on the HomePod based on events sent on that channel, commit is here. Maybe this is the same channel? I will have to investigate that when I have time. But it's interesting at least. Being able to look at the actual traffic would have helped a lot (it's encrypted)...

Anyway, it looks like AirPlay v2 is the best bet right now. I might try to experiment with owntone a bit and see if I can just stop after SETUP. I'm curious what happens if I set up a session but don't actually stream anything, as I don't want to interfere with what is currently playing.

postlund commented 2 years ago

The Sender wants a dedicated socket for RCS-1 message seems to be triggered by a parameter called wantsDedicatedSocket in SETUP. Might be interesting to investigate the outcome of that...

postlund commented 2 years ago

This SETUP message.

postlund commented 2 years ago

So, it took some time but I did actually manage to reproduce the behavior above with the old PoC code I used when troubleshooting streaming with AirPlay (that gave me the authentication part). So pyatv behaves as expected (to some extent at least, I guess). The prototype code looks like this:

import plistlib

# Ask the receiver for a dedicated remote control (type 130) stream
body = {
    "streams": [
        {
            "wantsDedicatedSocket": 1,
            "type": 130,
            "clientUUID": "1910A70F-DBC0-4242-AF95-115DB30604E2",
            "clientTypeUUID": "1910A70F-DBC0-4242-AF95-115DB30604E1",
            "seed": 1234,
        }
    ]
}

# Send the body as a binary property list in an RTSP SETUP request
resp = await self.send_and_receive(
    "SETUP",
    body=plistlib.dumps(body, fmt=plistlib.FMT_BINARY),
    content_type="application/x-apple-binary-plist",
)

The clientTypeUUID must have the specified value. There are a few others (at least two more) for other purposes beyond the scope of what I'm doing. A typical response to this message is:

{"streams": [{"type": 130, "streamID": 1, "dataPort": 49489}]}

The port number is the same one as mentioned in the log above ([APReceiverRemoteControlSessionMediaRemote] [0x9265] Listening for connection on port XXX). It is possible to connect to it, but it is torn down whenever another AirPlay client connects to the receiver (just like the regular RTSP connection). I also have no idea what is sent over this connection. It is encrypted with ChaCha20-Poly1305 as expected, but I don't know what parameters are used to derive the session key. Since the receiver doesn't send anything back to me, I can't really work my way backwards either. So maybe this is a dead end?
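For completeness, picking the port out of the SETUP response and connecting to it is the trivial part; the unknown is the key derivation for the ChaCha20-Poly1305 layer. A sketch of the trivial part, where the address and the resp_body variable are placeholders:

    import asyncio
    import plistlib

    # resp_body: binary plist body of the SETUP response shown above
    stream = plistlib.loads(resp_body)["streams"][0]
    assert stream["type"] == 130

    # Connecting succeeds, but everything exchanged afterwards is encrypted
    # with a session key we don't know how to derive yet
    reader, writer = await asyncio.open_connection("10.0.10.1", stream["dataPort"])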

postlund commented 2 years ago

Looking at what traffic is sent between the phone and Apple TV still suggests that AirPlay is used, so I will have to dig deeper.

postlund commented 2 years ago

Just pouring it all out here... I think I'm gonna put some time into adding support for AirPlay 2 and transient pairing in atvproxy. That way I can hopefully have a look at the traffic. It's a bit more tricky though, as I have to deal with all the additional ports AirPlay uses for timing, events and control. Should be doable but will require some work.

As a first test, I just pulled the Zeroconf properties from my HomePod, made some modifications (other identifiers) and stuck them into the current AirPlay implementation of the fake device. That allowed me to confirm that my phone found it correctly:

[screenshot: the fake AirPlay receiver showing up on the phone]

postlund commented 2 years ago

I'm closing this as I now know what to do (#1255) 👍