Open GoogleCodeExporter opened 9 years ago
We are having performance issues related to a prior developer choosing
bitstring for an embedded application.
This is worsened by the use of the dictionary format string and value storage.
Measurement shows that the bitstring and byte conversion is taking too long,
the bitstring.pack line below:
reg_data += bitstring.pack(fpga.rf_ctrl_fmt, **fpga_reg_to_write_dict).bytes
Our dictionary is on the order of 60 elements, placed into roughly 20
byte-registers.
This enhancement, of pre-compiled format strings, would really help us, as I
strongly suspect that the format string parsing is the heavy work.
As the usage model is typically to change only 1-3 dictionary values and then
regenerate the register set, I considered whether a new method to keep a
'prev_bits' bitstring around. We'd then do a bitstring.update(prev_bits,
fpga.rf_ctrl, **update_dict), where update_dict would have only the dictionary
entries that needed to be changed. But, that would still require that the
format string be parsed, and walked.
Any other suggestion as to a modified usage model that would execute faster
with bitstring's existing functionality? Using the module really accelerated
handling changes such as register fields moving, etc., keeping that separate
from the bitfield content control.
I apologize also if this isn't the right forum to ask questions; I didn't see a
mailing list or discussion forum.
Thanks,
Eric
Original comment by mons...@gmail.com
on 25 Jul 2013 at 5:59
Hi, this is as good a forum as any (or just email me). I'll try to take a look
at how easy the pre-compiled format strings is to do at the weekend. It might
be possible to hack something together quite easily.
Other things to consider:
Using += to make reg_data could be rather inefficient (it has to create a new
reg_data for each iteration). The usual method is to append to a list and use
''.join(the_list).
It might also be fairly straightforward to compile bitstring to a C module via
Cython. I've experimented with this in the past and had all the unit-tests
passing and about a 2x speed up on average. Tell me if that's of interest to
you.
Cheers,
Scott
Original comment by dr.scott...@gmail.com
on 25 Jul 2013 at 8:26
Scott,
Thanks for your response!
We only do the += once. The line before the pack is:
reg_data = ''
which I don't understand, and then doing a +=.
(But then, I'm a novice at python, being the Sys Eng on the project and neither
the departed nor replacement coder.)
May I ask: Have you done benchmarking that established that the format parsing
was the bulk of the time spent?
The presence of this enhancement item, and my own embedded experience with the
expense of printf-type constructs, made me think the format string was the
expensive part. A senior developer here suggested that there might well be a
re-allocation inside a loop in bitstring that was taking most of the time, as
he thought format parsing would be something Python did quickly.
Either the format pre-compiling, or Cython, would be of interest to us. A 2x
improvement gets us within spec.
Thanks,
Eric
Original comment by mons...@gmail.com
on 25 Jul 2013 at 9:43
I've done a few tests on a pack similar to one I think you're using. For me the
time was mostly being spent constructing the bitstring, and the format parsing
was only a few percent.
I've committed a minor change which got my test down from 3.4ms to 2.4ms, so I
suspect it'll be noticeable to you. Try r.982
(https://python-bitstring.googlecode.com/svn-history/r982/trunk/bitstring.py) -
just replace the current bitstring.py with the new one.
Another rather obvious thing is to make sure you're using -O when running
Python - bitstring does have a fair amount of asserts in it so -O makes a big
difference unlike in many modules.
I couldn't get Cython to work in my brief test. It should be as simple as
uncommenting the Cython stuff in setup.py and then 'python setup.py build_ext
--inplace'. I tried reverting to an old revision which I know did work and that
failed in the same way, so it's probably something stupid with my setup and it
might work for you. I may take another look if I get the chance.
Original comment by dr.scott...@gmail.com
on 28 Jul 2013 at 3:59
... got the Cython working again. Forgot that I needed to rename bitstring.py
to bitstring.pyx first!
My test is now down to 1.6 ms, so that's a 2x speedup overall. YMMV, so let me
know how you get on.
Original comment by dr.scott...@gmail.com
on 28 Jul 2013 at 5:57
Some benchmarking updates:
Our timing markers are not exactly placed to isolate bitstring, and the bit
string we're using is changing slightly, so these are a little fuzzy, but:
95 ms : Original timing, of which commenting out bitstring.pack had previously
saved ~77ms.
81 ms : Running with -OO, so clearly the asserts were significant, perhaps
14/77 or ~18% faster
68 ms : Running your updated bitstring.py, but not -O, seems to be ~35% faster.
63 ms : Both the new bitstring.py and -OO, total of ~42% faster with the two
changes.
I'm not surprised that saving 14 ms and saving 27 ms combined to save less then
41ms.
Overall, a big savings from those two changes.
We don't have Cython installed on that embedded SBC, so testing Cython's effect
will have to wait a bit.
Thank you very much for the advice and changes, I wanted to let you know that
it definitely helped!
Original comment by mons...@gmail.com
on 30 Jul 2013 at 10:16
Original issue reported on code.google.com by
dr.scott...@gmail.com
on 20 Jul 2011 at 9:17