Closed sashahilton00 closed 3 years ago
Never done that so I can't promise I can help, but I might give it a try, it sounds interesting! Is there a place where you (and others) will be uploading your findings?
Ok, so what I'm going to do is dedicate a comment to each endpoint that I reverse. I'll attach screenshots of the decoded protobufs (request/response) where it may be helpful, and reference the endpoint, along with the proto definitions.
The proto files I'm using I extracted earlier, available here.
The proto descriptor is useful for anyone looking to do their own poking around, uploaded here. Alternatively, you can compile it from the proto definitions above using a variant of the command:
protoc --descriptor_set_out=descriptor.desc **/*.proto *.proto
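Once a descriptor set exists, `protoc` itself can decode a captured payload. Below is a minimal sketch, assuming `protoc` is on the PATH and the capture has been saved as raw protobuf bytes. The helper names and the `descriptor.desc` path are illustrative only, not part of any existing tooling:

```python
import shutil
import subprocess


def decode_command(descriptor_path: str, message_type: str) -> list[str]:
    """Build the protoc invocation that decodes one message type from stdin."""
    return [
        "protoc",
        f"--descriptor_set_in={descriptor_path}",
        f"--decode={message_type}",
    ]


def decode_payload(payload: bytes, descriptor_path: str, message_type: str) -> str:
    """Pipe raw protobuf bytes through protoc and return the text-format dump."""
    if shutil.which("protoc") is None:
        raise RuntimeError("protoc not found on PATH")
    result = subprocess.run(
        decode_command(descriptor_path, message_type),
        input=payload,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")
```

When no definition is known for a payload, `protoc --decode_raw` prints it keyed by field number instead.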
Endpoint: https://spclient.wg.spotify.com/extended-metadata/v0/extended-metadata
Proto definitions (refer to extended_metadata.proto):
Request: spotify.extendedmetadata.BatchedEntityRequest. It may also accept spotify.extendedmetadata.EntityRequest, but untested.
Response: spotify.extendedmetadata.BatchedExtensionResponse.
Endpoint: https://spclient.wg.spotify.com/extended-metadata/v0/extended-metadata
Proto definitions (refer to ucs.proto (user customisation service)):
Request: spotify.remote_config.ucs.proto.UcsRequest.
Response: spotify.remote_config.ucs.proto.UcsResponseWrapper.
I didn't think much of this endpoint initially; it seemed to just be a bunch of A/B test flags, but on closer inspection it appears to carry a bunch of possibly useful information. Besides feature flags (one of which is Spotify HiFi), it has buffer parameters, prefetch configuration, account status and permissions, API base URLs, etc. Instead of a screenshot, I've posted the request/response JSON below (converted from protobuf).
@roderickvd loudness-levels might be interesting for you. There's also some other related stuff.
Endpoint: https://spclient.wg.spotify.com/connect-state/v1/devices/2e8...202
Proto definitions (connect.proto):
connectstate.Cluster message, though I want to double check this. It looks pretty similar to the web API connect state endpoint; there may be a few additional fields.
To implement the repeat single/context functionality on the client, one needs to look at the PlayerState (player.proto) entry of connectstate.Cluster. Within that there is a ContextPlayerOptions entry, which indicates whether repeat functionality is enabled and what type.
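A minimal sketch of how a client might derive the repeat mode from those options. The field names `repeating_context` and `repeating_track` are assumptions that should be checked against player.proto, and the options are modelled as a plain dict rather than a generated protobuf class:

```python
from enum import Enum


class RepeatMode(Enum):
    OFF = "off"
    CONTEXT = "context"
    TRACK = "track"


def repeat_mode(options: dict) -> RepeatMode:
    """Map ContextPlayerOptions-style flags to a single repeat mode.

    Repeating a single track takes precedence over repeating the context.
    Field names are assumed, not confirmed against the proto definitions.
    """
    if options.get("repeating_track", False):
        return RepeatMode.TRACK
    if options.get("repeating_context", False):
        return RepeatMode.CONTEXT
    return RepeatMode.OFF
```

Letting the track flag win when both are set is a design choice made for this sketch; the official clients may resolve it differently.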
ContextRestrictions can also be seen in player state, and should be used to determine what functionality is enabled. For example, the disallow_pausing_reasons context restriction sometimes appears if the player is paused, with the somewhat self-explanatory already_paused message. Granted, not a very useful example, but we should probably pay attention to these context restrictions and pass through the reason for disabling either to stdout or to the application using librespot. Example below:
restrictions {
  disallow_pausing_reasons: "already_paused"
  disallow_skipping_prev_reasons: "no_prev_track"
}
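Passing the reasons through could look roughly like the sketch below. It models the restrictions as a plain dict and assumes every field follows the `disallow_<action>_reasons` naming seen in the two fields above, which may not hold for the full message:

```python
def disallow_reasons(restrictions: dict, action: str) -> list[str]:
    """Return the reasons an action is currently disallowed (empty if allowed)."""
    return list(restrictions.get(f"disallow_{action}_reasons", []))


def describe(restrictions: dict, action: str) -> str:
    """Produce a one-line status suitable for stdout or a client callback."""
    reasons = disallow_reasons(restrictions, action)
    if not reasons:
        return f"{action}: allowed"
    return f"{action}: disallowed ({', '.join(reasons)})"


# The example restrictions from this comment, as a plain dict.
restrictions = {
    "disallow_pausing_reasons": ["already_paused"],
    "disallow_skipping_prev_reasons": ["no_prev_track"],
}

print(describe(restrictions, "pausing"))
print(describe(restrictions, "skipping_prev"))
```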
Capabilities also has some new fields that indicate whether a device supports Spotify HiFi. There are some other fields within connectstate.CapabilitySupportDetails such as fully_supported and user_eligible that presumably need to be set for the option to appear. See below:
supports_hifi {
  device_supported: true
  fully_supported: false
  user_eligible: false
}
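If the guess is right that all three flags must be set before the option appears, the check reduces to the sketch below. That rule is an assumption, not observed behaviour, and the message is modelled as a plain dict:

```python
HIFI_FLAGS = ("device_supported", "fully_supported", "user_eligible")


def hifi_selectable(support: dict) -> bool:
    """True only when every CapabilitySupportDetails flag is set (assumed rule)."""
    return all(support.get(flag, False) for flag in HIFI_FLAGS)


# The captured values from this comment.
supports_hifi = {
    "device_supported": True,
    "fully_supported": False,
    "user_eligible": False,
}
```

With the captured values, `hifi_selectable(supports_hifi)` is false because only `device_supported` is set.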
Also, there are some other fields within the device message such as metadata_map that look to be the local network details of the client, notably device_address_mask and tier1_port, both of which appear to correspond to the mDNS/SSDP discovery that Spotify clients do. It might be worth investigating adding this information to boost speed and reliability of device discovery.
The device-state endpoint needs quite a bit more exploration, and probably warrants a thread of its own, as it seems to be the bucket into which Spotify is dumping most of the player state used when syncing up devices.
account_attributes {
  key: "loudness-levels"
  value {
    string_value: "1:-9.0,0.0,3.0:-2.0"
  }
}
The comma-separated values probably map to Spotify's loudness levels. This is old documentation from the net:
- Loud – equalling ca -11 dB LUFS (+6 dB gain multiplied to ReplayGain)
- Normal (default) – equalling ca -14 dB LUFS (+3 dB gain multiplied to ReplayGain)
- Quiet – equalling ca -23 dB LUFS (-5 dB gain multiplied to ReplayGain)
Currently Spotify only documents the LUFS value and no longer the dB gain, where dB is assumed to be dBFS and LUFS is K-weighted. So the documentation above is indicative but not necessarily normative.
If you add 3 dB to the comma-separated values, you more or less meet the indicative documented values.
I'd have to think harder what the first "1" and last "-2.0" indicate. Taking a guess: "1" for the scheme version number and "-2.0" for the -2 dB true peak that Spotify targets.
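To make the arithmetic concrete, here is a small sketch that splits the attribute and applies the 3 dB offset. The field names reflect the guesses above (scheme version, three gains, true-peak target) and are not confirmed meanings:

```python
from dataclasses import dataclass


@dataclass
class LoudnessLevels:
    version: int            # guess: scheme version number
    gains_db: list[float]   # the three comma-separated values
    true_peak_db: float     # guess: true-peak target in dB


def parse_loudness_levels(value: str) -> LoudnessLevels:
    """Split 'version:g1,g2,g3:peak' into its three colon-separated parts."""
    version, gains, peak = value.split(":")
    return LoudnessLevels(
        version=int(version),
        gains_db=[float(g) for g in gains.split(",")],
        true_peak_db=float(peak),
    )


levels = parse_loudness_levels("1:-9.0,0.0,3.0:-2.0")
# Adding 3 dB gives -6.0, 3.0 and 6.0, to compare with the older
# documented gains of -5 (quiet), +3 (normal) and +6 (loud).
shifted = [g + 3.0 for g in levels.gains_db]
```

The normal and loud values line up exactly after the offset; the quiet value lands 1 dB away from the old documentation, which fits the "more or less" caveat.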
@sashahilton00 general question: so I understand librespot-java has implemented large parts of the new API. What can we reuse from their reverse engineering vs. your posts above? I'm just entirely new to these parts so trying to understand the lay of the land here.
Ping me if you need any help.
@roderickvd that's a good question. From the issue emails I've been getting from librespot-java, it looks like @devgianlu and co. have put quite a bit of work into reverse engineering chunks of the new API, so in some cases the stuff I post here has likely already been reverse engineered in part or fully over at librespot-java. @devgianlu would be best placed to answer the question of what has been reverse engineered that we can use. Something of note is the dealer endpoint, which will need to be added if we want to support group listening at some point, as this is where group listening context seems to be published. Other than that, it just sends ping/pong requests unless I am missing something.
My reverse engineering is simply me capturing and examining the network traffic between desktop <-> Spotify, so what I post should be useful enough as a reference, but in terms of what we focus on, we should probably look to discussions and their upvotes, along with features that we think are going to be increasingly important when working out what to prioritise.
In my mind, there are a few things that are must-haves before the library hits 1.0.0, which are:
- Removal of surplus audio backends. Currently my thinking is that retaining the pipe, rodio (default), alsa and gstreamer backends makes sense, on the understanding that the alsa and gstreamer backends are being actively maintained. Given the processing that we are already doing with normalisation, dithering, shaping, etc., combined with the slow movement on issues upstream in rodio, I am inclined to think that it may be worth dropping rodio in favour of cpal as a lighter alternative, given that we're not using the functionality that rodio provides (multi-format decoding, dynamic mixing, filters, etc.). If we feel that this is going to change, for example with the introduction of HiFi, then perhaps this is worth reevaluating.
- Remove redundant decoders. tremor and libvorbis should both be removed, I think. The former because it was introduced, iirc, for the Pi rev 1, which didn't have a hardware FPU. The latter because a) it relies on bindings to a C library, and b) the crate appears unmaintained.
- Support session reconnection. This is being tracked in #609 but requires a level of Rust experience that exceeds my own. The reason for this being a requirement in my mind is that there's a trove of issues that appear to be connected to session reconnection not really being handled, which one wouldn't really consider acceptable in a 'stable' library.
- Retrieve files from the CDN. Again, not technically a requirement, but from what I've seen in recent additions such as canvas, podcasts, etc., these are defaulting to being supplied via the CDN, and there is clearly a push at Spotify to use the CDN for all new projects. Besides avoiding a nasty surprise when Spotify eventually decides to turn off the mercury endpoint, there should be a small performance boost in that audio files will be served from edge locations as opposed to from Spotify servers, which may help reduce overall load times and improve responsiveness.
- There has been plenty of discussion in #648 around splitting out discovery functionality, moving audio playback and other features to Spotifyd, etc. If we are going to do that, it should be done before we hit stable. Personally I am in favour of moving most of the playback functionality to Spotifyd, and only retaining the pipe and cpal backends in librespot, though multiple people have said that they want alsa and gstreamer in librespot and are willing to maintain them, in which case I don't think forcing migration is necessary.
Besides the above, there are other things that would be 'nice to have', but are by no means essential. Things like extended metadata can be added in later versions, the above is just some of the stuff which I think is required for it to be released as stable. If people have a different view, I'm open to changing my mind. In the meantime, I will keep documenting the protocols when I have a moment and adding them here, since it's just grunt work that I can do when I have an hour or two spare. Anyone else feel free to do the same.
Looking back, the above is a bit of a roaming response, but hopefully it provides some clarification on why I'm spending the time to examine the latest protocols and the overall direction that I'm working to as a basis.
Endpoint: POST https://clienttoken.spotify.com/v1/clienttoken
Proto Definitions missing.
This endpoint is where the client token is retrieved from, which is subsequently used in a multitude of requests. The request/response are shown below. Further investigation is required to determine the proto definitions, which will be updated accordingly in the request/response once discovered.
65b7... appears to be a Spotify Client ID. This is also used in the login5 endpoint.
Request:
1: 1
2 {
  1: "1.1.58.820.g2ae50076"
  2: "65b7080...33ca87bd"
  3 {
    1 {
      4 {
        1: 10
        3: 17763
        4: 2
        6: 9
        7: 332
        8: 34404
      }
    }
    2: "S-1-5-21-4265178016-72824351-3788689935"
  }
}
Response:
1: 1
2 {
  1: "AABlJC8n81KZ3mO8VAbi9H9B6...z2FjzT+gISTWoyP0="
  2: 1216800
  3: 1209600
  4 {
    1: "spotify.com"
  }
}
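Until the proto definitions turn up, payloads like these can only be read by field number. The sketch below shows how the protobuf wire format splits into field-number/value pairs of that kind. It is a bare-bones illustration of the documented encoding (varint, fixed-width and length-delimited fields), not the tool used to produce the dumps above, and the sample bytes are hand-built:

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_raw(buf: bytes) -> list[tuple[int, object]]:
    """Split a protobuf message into (field_number, value) pairs.

    Length-delimited values are returned as bytes: without a schema there
    is no way to know whether they hold a string or a nested message.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # 64-bit fixed, little-endian
            value = int.from_bytes(buf[pos:pos + 8], "little")
            pos += 8
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 5:    # 32-bit fixed, little-endian
            value = int.from_bytes(buf[pos:pos + 4], "little")
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, value))
    return fields


# Hand-built sample: field 1 = 1, field 4 = nested message {1: "spotify.com"}.
sample = bytes([0x08, 0x01, 0x22, 0x0D, 0x0A, 0x0B]) + b"spotify.com"
outer = decode_raw(sample)
inner = decode_raw(outer[1][1])
```

Decoding a length-delimited value a second time, as done for `inner`, is how the nested braces in a raw dump are recovered; it is a heuristic, since a plain string can occasionally parse as a message too.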
@sashahilton00 Should this become a discussion? I feel like I could comment on stuff, but it'll end up being an awful long thread.
I thought it already was, just realised it's still in issues. Will move it.
Over the course of the past few years, Spotify has updated its protobufs several times, along with its endpoints. This was to be expected as they continued to add functionality; nonetheless, librespot has now diverged pretty significantly from the protobuf definitions in use in the official clients, and is possibly at higher risk of experiencing issues with endpoints being deprecated at short notice, such as the recent issues with the Facebook login flow (admittedly not due to protobuf definition issues though).
Furthermore, newer functionality such as canvas, context management, streaming groups, etc. is implemented in the newer proto3 definitions, attached below.
Given that Spotify is slowly standardising on HTTP/WebSockets/gRPC transports, and moving away from Hermes, it may be good for this project in the long run if we were to try to follow suit. To this end, further reverse engineering of the current endpoints that are in use will need to be carried out, and the existing code updated where necessary. As a side effect of this, my gut tells me that a number of the smaller issues/feature requests (e.g. repeat functionality only partially working, more 'spotify-like' normalization, etc.) will be easier to implement once we are using their up-to-date APIs.
I will start investigating when I get some free time, and add to this comment accordingly with any findings that I turn up. Anyone else is also invited to crack out spotify-dissect and start digging through the network traffic. If the log of notes starts to become unwieldy we can always form some sort of to-do list or stick the information in a wiki page temporarily.
The latest protobufs: spotify_protobufs.zip