Music export

tsani commented 7 years ago

As discussed on the mailing list, Apollo should support some kind of export feature, for offline listening.

tsani commented 7 years ago

Directly offering the available files in the music library for download would not be wise, because some of them are quite large. Apollo should offer a feature to transcode tracks in its database to other formats. These can then be archived and made available to the user.

tsani commented 7 years ago

I think that a client-driven approach is best. The client issues an HTTP request for each track to transcode, and the transcoding process is performed synchronously with the HTTP request. Once the client has decided that all the tracks it's interested in have been transcoded, it issues an archival request.

tsani commented 7 years ago

Commit cb6eea7 implements the /transcode endpoint for transcoding tracks. Transcodes are stored in a content-addressable database implemented in the filesystem using a double-nested directory structure. For example, suppose we have a track in "A/B/foo.flac". To transcode it to MP3 V2, we would make a POST request to /transcode with the body

{
  "source": "A/B/foo.flac",
  "params": { "format": "mp3", "bitrate": { "type": "vbr", "value": 2 } }
}

The file will be transcoded with those parameters, and the output will be stored in /transcoded/X/mp3-v2/foo.mp3, where X is the SHA-1 hash of /music/A/B/foo.flac's contents. The SHA-1 hash of the track is returned by the POST to /transcode. The client can download a transcoded file by making a GET request to /transcode/X/Y, where X is the track ID (SHA-1 hash of the track's contents) and Y represents the transcoding parameters. For example, to represent MP3 V2, use mp3-vbr-2; for constant bitrate 192 kbps MP3, use mp3-cbr-192.
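As a rough illustration of the content-addressed layout described above, here is a minimal Python sketch (not Apollo's actual code). The function names, the `root` default, and passing the track contents as bytes are assumptions made for the example; only the /transcoded/X/params/name shape comes from the description.

```python
import hashlib
from pathlib import PurePosixPath


def track_id(contents: bytes) -> str:
    """Track ID: the hex SHA-1 digest of the track's contents."""
    return hashlib.sha1(contents).hexdigest()


def transcode_path(contents: bytes, source: str, params: str, ext: str,
                   root: str = "/transcoded") -> str:
    """Build <root>/<track id>/<params>/<source stem>.<ext>.

    `root`, `params` and `ext` are hypothetical parameters for this sketch.
    """
    stem = PurePosixPath(source).stem  # "A/B/foo.flac" -> "foo"
    return f"{root}/{track_id(contents)}/{params}/{stem}.{ext}"
```

Because the first directory level depends only on the file contents, moving the source track inside /music does not change where its transcodes live.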

When issuing a POST to /transcode, if the desired output already exists, then the track ID is simply returned and no side-effects are performed server-side. When issuing a GET to /transcode, if the desired transcode does not exist, then a 500 error is produced. This latter behaviour is a bug, and we should return a 404.
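The intended behaviour (idempotent POST, 404 on a missing transcode) can be sketched with an in-memory stand-in for the filesystem store. This is only an illustration in Python: the class, its methods, and the `transcoder` callback are all hypothetical and not taken from the Apollo code base.

```python
import hashlib
from typing import Callable, Dict, Tuple


class TranscodeStore:
    """In-memory model of the transcode store (hypothetical)."""

    def __init__(self, transcoder: Callable[[bytes, str], bytes]) -> None:
        self._files: Dict[Tuple[str, str], bytes] = {}
        self._transcoder = transcoder
        self.runs = 0  # counts how many real transcodes were performed

    def post(self, contents: bytes, params: str) -> str:
        """Model of POST /transcode: transcode only if the output is missing."""
        tid = hashlib.sha1(contents).hexdigest()
        key = (tid, params)
        if key not in self._files:
            self._files[key] = self._transcoder(contents, params)
            self.runs += 1
        return tid  # the track ID is returned either way

    def get(self, tid: str, params: str) -> Tuple[int, bytes]:
        """Model of GET /transcode/X/Y: 404 (not 500) when nothing is stored."""
        data = self._files.get((tid, params))
        if data is None:
            return 404, b""
        return 200, data
```

Repeating the same POST returns the same track ID without invoking the transcoder a second time.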

For maximum RESTfulness, the POST to /transcode should return a complete URL that can be used to retrieve the transcoded file (perhaps via an HTTP Link header).

Also, we should adopt a uniform naming scheme for identifying tracks. I think that different JSON objects for requests use different names for the field that identifies a track in the /music directory.

Another thing to straighten out is the string serialization of transcoding parameters. In the filesystem we write "mp3-v2" but in the request path we write "mp3-vbr-2". The latter format is better IMO. (The "mp3" portion maps to the Format datatype, and the "vbr-2" maps to VBR Q2 :: Bitrate, whereas parsing "v2" is a bit trickier than just splitting on dashes.)
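To show why the dash-separated form is easy to handle, here is a small Python sketch of parsing and re-serializing it. The `TranscodeParams` record and both function names are made up for illustration; the real project would map these onto its own Format and Bitrate types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscodeParams:
    format: str        # e.g. "mp3"
    bitrate_type: str  # "vbr" or "cbr"
    value: int         # VBR quality level, or CBR bitrate in kbps


def parse_params(text: str) -> TranscodeParams:
    """Parse strings like "mp3-vbr-2" by splitting on dashes."""
    parts = text.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected format-type-value, got {text!r}")
    fmt, kind, raw_value = parts
    if kind not in ("vbr", "cbr"):
        raise ValueError(f"unknown bitrate type {kind!r}")
    try:
        value = int(raw_value)
    except ValueError:
        raise ValueError(f"bitrate value is not an integer: {raw_value!r}")
    return TranscodeParams(fmt, kind, value)


def serialize_params(params: TranscodeParams) -> str:
    """Inverse of parse_params."""
    return f"{params.format}-{params.bitrate_type}-{params.value}"
```

Under this scheme the short form "mp3-v2" is rejected outright, which is one argument for using a single serialization in both the filesystem and the request path.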

I think that the current system of issuing a GET to /transcode/:hash/:params is bad though. Instead, we should issue a GET to /transcode/:track/:params, have the server compute the hash of :track, and use that and the params to look up the transcode in the filesystem. Hence, the client doesn't need to care about track IDs. Also, if the GET fails, we can return a 404 with a Link header to allow the client to create the transcode if the track exists; we can also return links to other available transcodes of the track, if any. However, track IDs are nice because they are immune to changes in the track location. A track can be moved around in the /music directory without affecting its SHA-1 hash. However, the hash is sensitive to changes in the file, such as ID3 tags. I guess my conclusion is that both kinds of identifiers have pros and cons, and neither is perfect :disappointed:

tsani commented 7 years ago

Commit 5138ee6 implements the POST method for the /archive endpoint, which is the missing piece for a good music export system. Commit fc22e48 implements a bash script providing a pretty nice user experience for exporting music.

Regarding the discussion about trackIDs versus track paths, no progress has been made, and we continue to use trackIDs to identify transcodes. Going forward, I think we should replace the URL path-based scheme for identifying transcodes with query-string parameters. Then, we can make a request like GET /transcode?params=mp3-vbr-2&trackPath=<some_path> to get the V2 transcode of the track at the given path, if any. This format opens the door for new varieties of requests, like GET /transcode?params=mp3-vbr-2 to select all MP3 V2 transcodes that exist. This, however, is a bit tricky because transcodes do not store the path to the track that created them. I think that GET requests to bare /transcode should return a list of objects with the fields "trackId", "transParams", and "url". The first two are precisely what is needed to construct an ArchiveEntry object for issuing an archive request later, and the last is the complete URL needed to download just this transcode.
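A client-side sketch of the proposed query-string scheme, in Python. Nothing here exists in Apollo yet: the helper names and the base URL are placeholders, and representing "transParams" as the dash-separated string is an assumption, since the thread does not fix its shape.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def transcode_query_url(base: str, params: Optional[str] = None,
                        track_path: Optional[str] = None) -> str:
    """Build a GET URL for the proposed /transcode query interface."""
    query: Dict[str, str] = {}
    if params is not None:
        query["params"] = params
    if track_path is not None:
        query["trackPath"] = track_path
    url = f"{base}/transcode"
    # urlencode percent-escapes the path separators in trackPath
    return f"{url}?{urlencode(query)}" if query else url


def listing_entry(base: str, track_id: str, params: str) -> Dict[str, str]:
    """One element of the list a bare GET /transcode might return."""
    return {
        "trackId": track_id,
        "transParams": params,  # assumed to be the string form
        "url": f"{base}/transcode/{track_id}/{params}",
    }
```

For example, `transcode_query_url("http://localhost:8000", params="mp3-vbr-2")` would address all MP3 V2 transcodes, with the host being a stand-in for wherever the server runs.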

tsani / apollo

Music export #6