kutsurak / python-bitstring

Automatically exported from code.google.com/p/python-bitstring

Implement rcut and rjoin methods for Bits objects #143

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I find myself needing to cut and join bitstrings in the opposite order to the 
default for cut and join. I propose the addition of rcut and rjoin in the same 
fashion as rfind to the Bits class.

Their implementation is nearly identical to the existing methods; I have marked 
the lines with "M" where they differ from cut/join:

M   def rcut(self, bits, start=None, end=None, count=None):
        start, end = self._validate_slice(start, end)
        if count is not None and count < 0:
            raise ValueError("Cannot cut - count must be >= 0.")
        if bits <= 0:
            raise ValueError("Cannot cut - bits must be > 0.")
        c = 0
        while count is None or c < count:
            c += 1
M           nextchunk = self._slice(max(end - bits, start), end)
            if nextchunk.len != bits:
                return
            assert nextchunk._assertsanity()
            yield nextchunk
M           end -= bits
        return

M   def rjoin(self, sequence):
        s = self.__class__()
        i = iter(sequence)
        try:
M           s._prepend(Bits(next(i)))
            while True:
                n = next(i)
M               s._prepend(self)
M               s._prepend(Bits(n))
        except StopIteration:
            pass
        return s
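
To make the intended behaviour concrete without needing the library, here is a minimal sketch of the same right-to-left cutting logic applied to a plain string of '0'/'1' characters. The name rcut_str is hypothetical and only stands in for the proposed Bits.rcut; it assumes the whole string is the slice (no start/end arguments).

```python
def rcut_str(s, bits, count=None):
    """Yield fixed-width chunks of s, starting from the right-hand end.

    Hypothetical string stand-in for the proposed Bits.rcut.
    """
    if count is not None and count < 0:
        raise ValueError("Cannot cut - count must be >= 0.")
    if bits <= 0:
        raise ValueError("Cannot cut - bits must be > 0.")
    end = len(s)
    produced = 0
    while count is None or produced < count:
        # Take the chunk that ends at `end`, never reaching past the left edge.
        chunk = s[max(end - bits, 0):end]
        if len(chunk) != bits:
            # A short chunk means we ran out of whole chunks, so stop.
            return
        yield chunk
        produced += 1
        end -= bits
```

As with cut, a leftover piece shorter than bits is dropped; here it is the leftmost piece rather than the rightmost one.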

Thanks
Phil

Original issue reported on code.google.com by pmountif...@formac.net on 3 Jul 2014 at 9:44

GoogleCodeExporter commented 9 years ago
Hi, thanks for the suggestion. In general I'm never keen on adding new methods 
if I can help it so you will need to do some persuasion (although of course you 
already have your own local mod).

As far as I can tell the proposed

    b = Bits().rjoin(a)

is almost equivalent to

    b = Bits().join(reversed(a))

which just uses the in-built reversed iterator. I guess this does assume that 
'a' is available in memory and isn't itself lazily generated, but I think it 
would satisfy most needs.

The rcut is more difficult to do, especially if you aren't going to store all 
the results, but I'm not convinced that it justifies a new method.

For example, instead of 

    [a.int for a in b.cut(8)]

you can write

    [b[x:x+8].int for x in range(0, b.len, 8)]

So therefore

    [a.int for a in b.rcut(8)]

is almost the same as

    [b[x-8:x].int for x in range(b.len, 0, -8)]

which doesn't seem quite nasty enough to justify a new method. Thoughts?
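
One place where "almost the same" matters is when the length is not a multiple of the chunk size. The sketch below uses plain strings as a stand-in for Bits (an assumption: Bits slicing follows the usual Python negative-index rules), and both function names are hypothetical. In the naive form the final slice gets a negative start index and comes back empty, whereas the proposed rcut would simply stop.

```python
def rcut_slices_naive(s, bits):
    # Direct transcription of the one-liner above, applied to a string.
    return [s[x - bits:x] for x in range(len(s), 0, -bits)]


def rcut_slices_guarded(s, bits):
    # Stop the range early so that x - bits never goes negative.
    return [s[x - bits:x] for x in range(len(s), bits - 1, -bits)]


twelve = "101100111000"        # 12 characters, not a multiple of 8
sixteen = "1111000010101010"   # 16 characters, an exact multiple of 8

# When the length divides evenly, the two forms agree.
same = rcut_slices_naive(sixteen, 8) == rcut_slices_guarded(sixteen, 8)

# When it does not, the naive form appends an empty trailing chunk.
naive_result = rcut_slices_naive(twelve, 8)
guarded_result = rcut_slices_guarded(twelve, 8)
```

So a slicing replacement for rcut probably needs the adjusted range bound, which makes the one-liner a little harder to get right from memory.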

Original comment by dr.scott...@gmail.com on 7 Jul 2014 at 2:08

GoogleCodeExporter commented 9 years ago
Hi, thanks for getting back to me.

Regarding rjoin: the primary motivation for this is where you do *not* have 
something which is reversible, e.g. a generator function as opposed to a 
sequence. Originally I was trying to use reversed and/or [::-1] in various 
locations to get the desired effect, but often found that I had to do silly 
things like Bits().join(list(reversed(foo))) just to make it work. I haven't 
found a way to build the result without first building another sequence in 
memory, only to then reverse it! That is not efficient for large sequences.
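
The generator case can be sketched with plain strings standing in for Bits. The names rjoin_str and chunks are hypothetical; the point is only that reversed() rejects a generator, while a join that prepends each item can consume it in a single pass.

```python
def rjoin_str(sep, iterable):
    """Join items in reverse order by prepending, consuming the iterable once.

    Hypothetical string stand-in for the proposed Bits.rjoin.
    """
    it = iter(iterable)
    try:
        result = next(it)
    except StopIteration:
        return ""
    for item in it:
        # Each new item goes in front, separated from what is already there.
        result = item + sep + result
    return result


def chunks():
    # A lazily generated source: it has no length and cannot be reversed.
    yield "01"
    yield "10"
    yield "11"


joined = rjoin_str("-", chunks())

try:
    reversed(chunks())
    reversed_ok = True
except TypeError:
    # Generators support neither __reversed__ nor the sequence protocol.
    reversed_ok = False
```

This avoids materialising an intermediate list, although repeated prepending has its own copying cost, so whether it is actually faster would need measuring.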

Regarding rcut: subjectively I would say that "[b[x-8:x].int for x in 
range(b.len, 0, -8)]" is nearly unreadable compared to 
"[a.int for a in b.rcut(8)]", especially if you have to do this a large number of 
times in the code, or when you're using it as part of another generator 
expression or list comprehension.

When dealing with binary data that is from external sources and is encoded in 
various different ways I have found it frequently necessary to be able to start 
operations from either end of the bitstring without prejudice. Currently it 
feels like going against the grain and you end up with reversed, list, [::-1] 
and other bits and pieces littered through your code when it could easily be 
abstracted up to the library layer, which of course makes the library more 
application agnostic.

Initially I did implement all the things I needed without these two methods, 
but after a while I thought "what's the point", so monkey patched in rjoin and 
rcut, then proceeded to go through my code deleting all the cruft and 
everything was so much more readable and statements would fit on a single 
PEP8-compliant line.

Have I convinced you?

Thanks!

Phil

Original comment by pmountif...@formac.net on 7 Jul 2014 at 2:46