flying-sheep / bcode

bencoding & -decoding library
http://flying-sheep.github.com/bcode/
MIT License

Is there a way to disable decoding bytes into a string? #9

Open xinhuang opened 6 years ago

xinhuang commented 6 years ago

Some fields are raw bytes that bencode stores as strings, such as token, peer id, and node id. They are better treated as bytes than as str, and decoding them is just a guess (see bcoding.py#L76). It would be better to always get one known type back than to check the type before dealing with the data.

So, is there a way to disable the decoding?

def _decode_buffer(f):
    """
    String types are normal (byte)strings
    starting with an integer followed by ':'
    which designates the string’s length.

    Since there’s no way to specify the byte type
    in bencoded files, we have to guess
    """
    strlen = int(_readuntil(f, _TYPE_SEP))
    buf = f.read(strlen)
    if not len(buf) == strlen:
        raise ValueError(
            'string expected to be {} bytes long but the file ended after {} bytes'
            .format(strlen, len(buf)))
    try:
        return buf.decode()
    except UnicodeDecodeError:
        return buf

flying-sheep commented 6 years ago

sure, we could add a parameter to bdecode that gets passed all the way down to the _decode_buffer function, like

def bdecode(f_or_data, try_decode=True):
    ...
    ... _decode_buffer(f, try_decode)
    ...

care to do a PR?

xinhuang commented 6 years ago

Okay. I just need some time before I can work on this.

BTW: How would you feel about creating custom exception classes instead of raising plain ValueError and TypeError? Sometimes I need a way to tell whether a peer is sending garbage, but catching ValueError and TypeError occasionally catches unrelated exceptions as well.

flying-sheep commented 6 years ago

sounds good!

subclassing TypeError and ValueError would allow both fine-grained exception handling and still catching the more generic ones.
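
A rough sketch of what that could look like follows. The class names BencodeError, BencodeValueError, and BencodeTypeError and the parse_length helper are hypothetical, chosen only for this example; nothing here is taken from the library.

```python
class BencodeError(Exception):
    """Hypothetical common base for all decoding errors."""


class BencodeValueError(BencodeError, ValueError):
    """Malformed data; still caught by `except ValueError`."""


class BencodeTypeError(BencodeError, TypeError):
    """Unsupported type; still caught by `except TypeError`."""


def parse_length(prefix):
    """Hypothetical helper: parse a string-length prefix such as b'4'."""
    try:
        return int(prefix)
    except ValueError as exc:
        raise BencodeValueError(
            "invalid length prefix {!r}".format(prefix)) from exc


# Fine-grained handling: only errors raised by the decoder are caught here.
try:
    parse_length(b"4x")
except BencodeError as exc:
    caught = exc

# Generic handling keeps working for existing callers.
try:
    parse_length(b"4x")
except ValueError as exc:
    caught_generic = exc
```

Code that talks to untrusted peers can catch the common base class, while existing callers that catch ValueError or TypeError are unaffected.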

xinhuang commented 6 years ago

Great. I will make changes for the two things we discussed. I can't promise when, but I will do it ASAP.