mkb79 / Audible

A(Sync) Interface for internal Audible API written in pure Python.
https://audible.readthedocs.io
GNU Affero General Public License v3.0

Timeout for httpx. #36

Closed BlindWanderer closed 11 months ago

BlindWanderer commented 3 years ago

Describe the bug: When fetching the library (calling aget_from_api in models.py::Library), if the library is very large (or the transfer is slow), the underlying httpx call can time out.

Workaround: Replace

    resp = await api_client.get("library", params=request_params)

with

    resp = await api_client.get("library", params=request_params, timeout=None)

Suggested solution: Expose a timeout parameter in aget_from_api. This makes choosing the timeout the responsibility of the library's consumer and avoids the infinite timeout employed by the workaround.
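The suggestion can be sketched roughly as follows. This is only an illustration, not the library's actual code: `fetch_library` and `FakeClient` are hypothetical names, and the one assumption taken from the workaround is that the client's `get` accepts a `timeout` keyword and forwards it to the underlying call.

```python
import asyncio


async def fetch_library(api_client, timeout=10.0, **request_params):
    """Hypothetical wrapper: let the caller pick the timeout instead of hard-coding it."""
    return await api_client.get("library", params=request_params, timeout=timeout)


class FakeClient:
    """Stand-in for the real API client so the sketch runs offline."""

    async def get(self, path, params=None, timeout=None):
        # Echo back what was received; a real client would perform the request.
        return {"path": path, "params": params, "timeout": timeout}


# A caller with a very large library asks for a longer, but still finite, timeout.
result = asyncio.run(fetch_library(FakeClient(), timeout=60.0, num_results=1000))
```

With this shape, a caller who really wants the old workaround can still pass `timeout=None` explicitly.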

mkb79 commented 3 years ago

Hi.

Thank you for your advice. I can implement this. But can you re-open this issue on the audible-cli repo?

Thank you very much!

BlindWanderer commented 3 years ago

Sure, np. I really like what you've done.

P.S. I wrote my own aaxc decrypter that doesn't use ffmpeg, if you want it.

mkb79 commented 3 years ago

Decrypting aaxc without ffmpeg sounds really fantastic. Not everyone has an ffmpeg binary with the aaxc patch applied. Have you implemented that in pure Python? I would be very grateful if you sent me your code.

P.S.: If you don't want to make it public, you can send me a private message at mkb79@hackitall.de!

BlindWanderer commented 3 years ago

I've got a working version in Java. I think I wrote a proof of concept in Python first, but I'm having trouble finding it. Give me a few days to rewrite it in Python.

aax[c]? is, at the file-structure level, mp4; you can feed it through any mp4 parser. The only differences are that they change some tag names, encrypt the samples, and put some extra padding in the sample section. Crypto-wise, it's decent; they've done everything right (including leaving themselves space to completely rework how the crypto is done). As to the complexity of the code, it's pretty simple. More code is spent doing tag renaming than anything else.
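To illustrate the "it's just mp4" point, here is a minimal sketch of walking top-level atoms. It relies only on the standard MP4 box header (a 32-bit big-endian size followed by a four-character type; a size of 1 means a 64-bit size follows the type, and a size of 0 means the atom runs to the end). `iter_atoms` and `make_atom` are hypothetical helper names, and the input is synthetic, not a real Audible file.

```python
import io
import struct


def iter_atoms(stream, end):
    """Yield (type, start, size) for each atom between the current position and end."""
    while stream.tell() < end:
        start = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            break
        size, kind = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:  # 64-bit size stored after the type
            size = struct.unpack(">Q", stream.read(8))[0]
            header_len = 16
        elif size == 0:  # atom extends to the end of the enclosing range
            size = end - start
        if size < header_len:
            raise ValueError("corrupt atom size at offset %d" % start)
        yield kind.decode("latin-1"), start, size
        stream.seek(start + size)  # skip the payload, including anything unrecognized


def make_atom(kind, payload=b""):
    """Build a synthetic atom for the demonstration."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


data = (
    make_atom(b"ftyp", b"aaxc" + bytes(4))
    + make_atom(b"moov", make_atom(b"trak"))
    + make_atom(b"mdat", bytes(16))
)
atoms = list(iter_atoms(io.BytesIO(data), len(data)))
```

Because each atom carries its own length, a parser can step over atoms it does not recognize, which is why renamed tags and extra padding do not break the file structure.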

I wouldn't release my code but ffmpeg and yourself have let the cat out of the bag so... it's moot now.

BlindWanderer commented 3 years ago

You are welcome to use this as you see fit as long as it isn't used for commercial use or violating the rights of others.

It doesn't meet Python naming conventions (I can't be bothered) and you are welcome to clean it up. You should have seen the Java code I translated this from. It's super ugly. I had a lot of fun writing both this and that.

How it works: It walks the mp4 atom tree, replacing atom ids and decrypting samples without changing the file size or layout. It does this in a single pass and without saving any state as it goes. Its memory footprint should consequently be quite small.
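The "replace an atom id without changing the layout" step can be sketched like this. It is a simplified illustration using the same two type codes that appear in the code further down (aavd and mp4a); `rename_atom` is a hypothetical helper, and the buffer is synthetic.

```python
import struct

AAVD = 0x61617664  # "aavd"
MP4A = 0x6D703461  # "mp4a"


def rename_atom(buf, type_offset, old, new):
    """Overwrite a four-character atom type in place; the buffer length never changes."""
    (current,) = struct.unpack_from(">I", buf, type_offset)
    if current == old:
        struct.pack_into(">I", buf, type_offset, new)
        return True
    return False


# Synthetic 16-byte atom: 4-byte size, 4-byte type, 8 bytes of payload.
atom = bytearray(struct.pack(">II", 16, AAVD) + bytes(8))
size_before = len(atom)
renamed = rename_atom(atom, 4, AAVD, MP4A)
```

Since the new id is the same width as the old one, every offset stored elsewhere in the file stays valid.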

Wait... what? Doesn't Audible add all this extra stuff to the file?

MP4 parsers don't care about extra data or unrecognizable atoms. Additionally, the crypto used doesn't change the size of the samples.
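As a rough illustration of why the sample size is preserved: the code below in this comment decrypts only the whole 16-byte blocks of each sample (`sampleLength & 0xFFFFFFF0`) and copies the remaining bytes unchanged. The split can be sketched as follows; `split_sample` is a hypothetical helper, and this shows only the length arithmetic, not the decryption itself.

```python
AES_BLOCK = 16  # AES block size in bytes


def split_sample(sample_length):
    """Return (encrypted_bytes, clear_tail_bytes) for one sample."""
    encrypted = sample_length & ~(AES_BLOCK - 1)  # round down to a whole number of blocks
    return encrypted, sample_length - encrypted


encrypted_len, tail_len = split_sample(100)
```

The two parts always add up to the original length, so the decrypted sample occupies exactly the same bytes as the encrypted one.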

If you want to learn more about the mp4 format I recommend getting Bento4, it's a really easy mp4 toolkit that isn't the mess that ffmpeg is. But I digress.

Don't you need to jump around the file from the sample tables into the mdat section to perform decryption? How is it that you don't have a sample table or file seeking?

Audible violated the mp4 standard and put extra metadata in the mdat section that makes it possible to find and decrypt the samples without needing the sample tables.
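A sketch of reading that in-mdat metadata is shown here. The field layout is an assumption taken from the comments in walk_mdat below (size, type, then five 32-bit fields of which the last two are the total sample size and the number of samples, followed by one 32-bit length per sample); it is not checked against any specification, and it ignores the 64-bit size case. `parse_mdat_chunk_header` is a hypothetical name and the input is synthetic.

```python
import struct


def parse_mdat_chunk_header(data, offset=0):
    """Parse one metadata header found inside mdat (layout assumed from the thread's code)."""
    size, kind, _time_ms, _first_block, _track, total_size, count = struct.unpack_from(
        ">I4s5I", data, offset
    )
    # The per-sample lengths follow the 28-byte fixed header.
    sample_sizes = struct.unpack_from(">%dI" % count, data, offset + 28)
    return {
        "type": kind.decode("latin-1"),
        "sample_sizes": sample_sizes,
        # The header size does not include the samples, so the chunk ends after both.
        "end": offset + size + total_size,
    }


# Synthetic chunk: two samples of 20 and 36 bytes.
sizes = (20, 36)
header_size = 28 + 4 * len(sizes)
chunk = (
    struct.pack(">I4s5I", header_size, b"aavd", 0, 0, 1, sum(sizes), len(sizes))
    + struct.pack(">2I", *sizes)
    + bytes(sum(sizes))
)
info = parse_mdat_chunk_header(chunk)
```

With the sample lengths available right next to the samples, a single forward pass is enough and no seeking into the sample tables is needed.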

Areas for Improvement

Limitations

import struct
from Crypto.Cipher import AES #provided by the PyCryptodome package

class Translator:
    fshort = (">h", 2)
    fint   = (">i", 4)
    flong  = (">q", 8)
    def __init__(self, size = None):
        self.buf = bytearray(size if size is not None else 4096)
        self.pos = 0
        self.wpos = 0
    def reset(self) : 
        self.pos = 0
        self.wpos = 0
    def position(self): return self.pos
    def getShort(self): return self.getOne(self.fshort)
    def getInt(self)  : return self.getOne(self.fint)
    def getLong(self) : return self.getOne(self.flong)
    def putInt(self, position, value) : self.putOne(self.fint, position, value)
    def getOne(self, format) :
        r = struct.unpack_from(format[0], self.buf, self.pos)[0]
        self.pos = self.pos + format[1]
        return r
    def putOne(self, format, position, value) :
        struct.pack_into(format[0], self.buf, position, value)
    def readOne(self, inStream, format):
        length = format[1]
        self.buf[self.wpos : self.wpos + length] = inStream.read(length)
        r = struct.unpack_from(format[0], self.buf, self.pos)[0]
        self.wpos = self.wpos + length
        self.pos = self.pos + length
        return r
    def readInto(self, inStream, length) -> int:
        self.buf[self.wpos : self.wpos + length] = inStream.read(length)
        self.wpos = self.wpos + length
        return length
    def readCount(self) -> int: return self.wpos
    def write(self, *outs) -> int:
        if self.wpos > 0:
            data = self.buf if self.wpos == len(self.buf) else self.buf[0 : self.wpos] #write() takes no offset/length, so slice out the filled part
            for out in outs:
                out.write(data)
            return self.wpos
        return 0
    def readInt(self, inStream):
        return self.readOne(inStream, self.fint)
    def readLong(self, inStream):
        return self.readOne(inStream, self.flong)
    def skipInt(self): self.skip(self.fint[1])
    def skipLong(self): self.skip(self.flong[1])
    def skip(self, length): self.pos = self.pos + length
    def readAtomSize(self, inStream):
        atomLength = self.readInt(inStream)
        if(atomLength == 1): #64 bit atom!
            atomLength = self.readLong(inStream)
        return atomLength
    def zero(self, start = 0, end = None):
        if end is None:
            end = self.wpos
        for i in range(start, end):
            self.buf[i] = 0
    def write_and_reset(self, *outs) -> int:
        r = self.write(*outs)
        self.reset()
        return r

class AaxDecrypter:
    filetypes = {6:"html", 7:"xml", 12:"gif", 13:"jpg", 14:"png", 15:"url", 27:"bmp"}
    def __init__(self, session, key, iv, inpath, outpath):
        self.session = session
        self.key = bytes.fromhex(key)
        self.iv = bytes.fromhex(iv)
        self.source = inpath
        self.dest = outpath
        self.filesize = inpath.stat().st_size

    def walk_ilst(self, translator, inStream, outStream, endPosition): #cover extractor
        startPosition = inStream.tell()
        while inStream.tell() < endPosition :
            translator.reset()
            self.status(inStream.tell(), self.filesize)
            atomStart = inStream.tell()
            atomLength = translator.readAtomSize(inStream)
            atomEnd = atomStart + atomLength
            atom = translator.readInt(inStream)
            remaining = atomLength - translator.write_and_reset(outStream)

            if(atom == 0x636F7672): #covr
                #Going to assume ONE data atom per item.
                dataLength = translator.readAtomSize(inStream)
                translator.readInto(inStream, 12)
                translator.skipInt()        #data
                dataType = translator.getInt()  #type (renamed so it doesn't shadow the builtin)
                translator.skipInt()        #zero?
                remaining = remaining - translator.write_and_reset(outStream)
                if dataType in self.filetypes:
                    postfix = self.filetypes[dataType]
                    with self.dest.with_suffix(".embedded-cover." + postfix).open('wb') as cover:
                        remaining = remaining - self.copy(inStream, remaining, outStream, cover)

            if(remaining > 0):
                self.copy(inStream, remaining, outStream)
            self.checkPosition(inStream, outStream, atomEnd)

        self.status(inStream.tell(), self.filesize)
        return endPosition - startPosition

    def walk_mdat(self, translator, inStream, outStream, endPosition):#samples
        startPosition = inStream.tell()
        #It's illegal for mdat to contain atoms... but that didn't stop Audible! Not that any parsers care.
        while inStream.tell() < endPosition : 
            self.status(inStream.tell(), self.filesize)
            #read an atom length.
            atomStart = inStream.tell()
            translator.reset()
            atomLength = translator.readAtomSize(inStream)
            atomTypePosition = translator.position()
            atomType = translator.readInt(inStream)

            #after the atom type comes 5 additional fields describing the data.
            #We only care about the last two.
            translator.readInto(inStream, 20)
            translator.skipInt()#time in ms
            translator.skipInt()#first block index
            translator.skipInt()#trak number
            bs = translator.getInt()#total size of all blocks
            bc = translator.getInt()# number of blocks

            atomEnd = atomStart + atomLength + bs

            #next come the atom specific fields
            if(atomType == 0x61617664) : #aavd has a list of sample sizes and then the samples.
                translator.putInt(atomTypePosition, 0x6d703461) #mp4a
                translator.readInto(inStream, bc * 4)
                translator.write(outStream)
                for i in range(bc):
                    self.status(inStream.tell(),  self.filesize)
                    sampleLength = translator.getInt()
                    cipher = AES.new(self.key, AES.MODE_CBC, iv=self.iv)#has to be reset every go round.
                    remaining = sampleLength - outStream.write(cipher.decrypt(inStream.read(sampleLength & 0xFFFFFFF0)))
                    #fun fact, the last few bytes of each sample aren't encrypted!
                    if remaining > 0 : self.copy(inStream, remaining, outStream)
            #there is no point in actually parsing this, 
            #we would need to rebuild the sample tables if we wanted to modify it.
            #elif atomType == 0x74657874: #text
            #    translator.readInto(inStream, bc * 2)
            #    translator.write(outStream)
            #    for i in range(bc):
            #        sampleLength = translator.getShort()
            #        t2 = Translator(sampleLength * 2)
            #        t2.readInto(inStream, sampleLength)
            #        t2.getString(sampleLength)
            #        before = t2.readCount()
            #        encdSize = t2.readAtomSize(inStream)#encd atom size
            #        t2.readInto(inStream, encdSize + before - translator.readCount())
            #        t2.write(outStream)
            #    translator.reset()
            else:
                written = translator.write_and_reset(outStream)
                self.copy(inStream, atomLength + bs - written, outStream)
            translator.reset()
            self.checkPosition(inStream, outStream, atomEnd)

        return endPosition - startPosition

    def walk_atoms(self, translator, inStream, outStream, endPosition):#everything
        startPosition = inStream.tell()
        while inStream.tell() < endPosition :
            self.status(inStream.tell(), self.filesize)
            #read an atom length.
            translator.reset()
            atomStart = inStream.tell()
            atomLength = translator.readAtomSize(inStream)
            atomEnd = atomStart + atomLength
            ap = translator.position()
            atom = translator.readInt(inStream)

            remaining = atomLength

            if atom == 0x66747970:#ftyp-none
                    remaining = remaining - translator.write_and_reset(outStream)
                    bodyLength = translator.readInto(inStream, remaining)
                    translator.putInt(0,  0x4D344120) #"M4A "
                    translator.putInt(4,  0x00000200) #version 2.0?
                    translator.putInt(8,  0x69736F32) #"iso2"
                    translator.putInt(12, 0x4D344220) #"M4B "
                    translator.putInt(16, 0x6D703432) #"mp42"
                    translator.putInt(20, 0x69736F6D) #"isom"
                    translator.zero(24, bodyLength)
                    remaining = remaining - translator.write_and_reset(outStream)
            elif atom == 0x696C7374: #ilst-0
                    remaining = remaining - translator.write_and_reset(outStream)
                    remaining = remaining - self.walk_ilst(translator, inStream, outStream, atomEnd)
            elif atom == 0x6d6f6f76 \
              or atom == 0x7472616b \
              or atom == 0x6d646961 \
              or atom == 0x6d696e66 \
              or atom == 0x7374626c \
              or atom == 0x75647461: #moov-0, trak-0, mdia-0, minf-0, stbl-0, udta-0
                    remaining = remaining - translator.write_and_reset(outStream)
                    remaining = remaining - self.walk_atoms(translator, inStream, outStream, atomEnd)
            elif atom == 0x6D657461: #meta-4
                    translator.readInto(inStream, 4)
                    remaining = remaining - translator.write_and_reset(outStream)
                    remaining = remaining - self.walk_atoms(translator, inStream, outStream, atomEnd)
            elif atom == 0x73747364: #stsd-8
                    translator.readInto(inStream, 8)
                    remaining = remaining - translator.write_and_reset(outStream)
                    remaining = remaining - self.walk_atoms(translator, inStream, outStream, atomEnd)
            elif atom == 0x6d646174: #mdat-none
                    remaining = remaining - translator.write_and_reset(outStream)
                    remaining = remaining - self.walk_mdat(translator, inStream, outStream, atomEnd)
            elif atom == 0x61617664: #aavd-variable
                    translator.putInt(ap, 0x6d703461) #mp4a
                    remaining = remaining - translator.write_and_reset(outStream)
                    self.copy(inStream, remaining, outStream) #don't care about the children.
            else:
                    remaining = remaining - translator.write_and_reset(outStream)
                    self.copy(inStream, remaining, outStream) #don't care about the children.

            self.checkPosition(inStream, outStream, atomEnd)

        self.status(inStream.tell(), self.filesize)
        return endPosition - startPosition

    def status(self, position, filesize):
        pass #no-op progress hook; override to report progress

    def copy(self, inStream, length, *outs) -> int :
        remaining = length
        while remaining > 0:
            remaining = remaining - self.write(inStream.read(min(remaining, 4096)), *outs)
        return length

    def write(self, buf, *outs) -> int :
        for out in outs:
            out.write(buf)
        return len(buf)

    def checkPosition(self, inStream, outStream, position):
        ip = inStream.tell()
        op = outStream.tell()
        if ip != op or ip != position:
            print("IP: %d\tOP: %d\tP: %d" % (ip, op, position)) #report a position mismatch

def decrypt_local(inpath, outpath, key, iv, session):
    with inpath.open('rb') as infile:
        with outpath.open('wb') as outfile:
            decrypter = AaxDecrypter(session, key, iv, inpath, outpath)
            decrypter.walk_atoms(Translator(), infile, outfile, decrypter.filesize)

mkb79 commented 3 years ago

That’s really fantastic. Thank you for your hard work! Very clever! I will try it out as soon as possible and will report back here.

P.S.: I’m a single developer without any company in the background. I don’t want to make money with my projects; I only want to fill my spare time. Writing code is my hobby.

mkb79 commented 3 years ago

@BlindWanderer Hi. Can you help me on this issue?

github-actions[bot] commented 11 months ago

This issue has not been updated for a while and will be closed soon.

github-actions[bot] commented 11 months ago

This issue has automatically been closed due to no activities.