python-intelhex / intelhex

Python IntelHex library
BSD 3-Clause "New" or "Revised" License
201 stars 107 forks source link

Add option to IntelHex.segments to split segments on alignment boundaries #21

Open sTywin opened 6 years ago

sTywin commented 6 years ago

In addition to finding contiguous occupied data addresses, it is often useful to be able to further split these segments based on an integer alignment, such as a flash page size or other block size. An optional alignment parameter is added to IntelHex.segments which allows any segments that span an integer multiple boundary of the alignment to be split into multiple sub-segments. The semantics of the returned list of ordered tuple objects is unchanged; that is, regardless of the given alignment parameter, all addresses will be traversed as contiguous segments.

I have also extended the test_segments portion to add support for testing the alignment parameter without changing any of the existing tests.

bialix commented 6 years ago

It will be nice to have some explanation or examples in tests or in main code, or here, what you want to achieve with alignment. My understaing so far you want your segment to start right at the alignment boundary, say we have boundary 4, so segments should be 0-3, 4-7, 8-11 and so on. Is it correct?

Please, make a seprate test method (or several) for your new functionality.

Also, if possible, rewrite the following statement in simpler form - it's hard to parse it for me personnaly. Probably my Python-foo is not strong enough :-/

return [(a, b+1) for (x, y) in zip(beginings, endings) for (a, b) in _align(x, y, alignment)]

Also, it would be simpler to test and support new code if you extract inner function def _align(start, end, align): to module/class level and provide couple of tests just for it.

Ideally, I'd like to see your new tests checks cover some edge cases when segement is already aligned or not, spans several alignment blocks.

Thank you. I hope that's not too much to ask you.

bialix commented 6 years ago

Also I find it simpler if you test entire return value from segments() method, rather than 10 separate checks which hide the entire picture. I can't see a forest behind those trees :-/ Please use shorter data array for this, no need for 1Mbyte blob.

sTywin commented 6 years ago

My understaing so far you want your segment to start right at the alignment boundary, say we have boundary 4, so segments should be 0-3, 4-7, 8-11 and so on. Is it correct?

You got it. I basically want the ability to get a list of occupied memory segments that don't span alignment boundaries, aka block boundaries or page boundaries, whatever your application may be. For me, it's flash pages on a microcontroller.

Please, make a seprate test method (or several) for your new functionality.

Done. I split out the test portions with the alignment parameter set, and created a separate test class for the helper function entirely.

Also, if possible, rewrite the following statement in simpler form - it's hard to parse it for me personnaly. Probably my Python-foo is not strong enough :-/

The only clean way I can see to re-write it is to collapse the generator expressions into actual in-memory lists. As written now, only the final list comprehension is stored in memory. We can do this if you like, but it's basically just a double loop to get the subsegments of each segment, and also to add 1 to the end value (to make it exclusive instead of inclusive):

for segment in self.segments():
    for subsegment in align(segment, alignment):
        output += [ (subsegment.start, subsegment.end+1) ]

Also, it would be simpler to test and support new code if you extract inner function def _align(start, end, align): to module/class level and provide couple of tests just for it.

Done; renamed it to _align_segment(start, end, alignment) to give it some context, and added a docstring.

Ideally, I'd like to see your new tests checks cover some edge cases when segement is already aligned or not, spans several alignment blocks.

Done. I think they are fairly comprehensive now.

Also I find it simpler if you test entire return value from segments() method, rather than 10 separate checks which hide the entire picture. I can't see a forest behind those trees :-/

I actually agree, I was just trying to use the same format as the existing test. I've re-written my test cases. I don't mind either way if you want to consolidate or split up the individual tests.

bialix commented 6 years ago

Just a note about principal inefficiency of segements method: it relies on addresses method which returns a full list of all addresses in IntelHex object. So, if you want to optimize - you need to start much deeper. (Hint: storage of data).

sTywin commented 6 years ago

Sure, but I didn't set out to re-write segments, just avoid adding inefficiencies that weren't there before :) I don't feel strongly either way; I'll give you both versions. I personally find the double-loop list comprehension fairly straightforward.

That said, I was poking around and found an issue with recursion depth for long segments being split on short alignments. I'll push an update probably tomorrow that unrolls the recursion and just iterates instead.

sTywin commented 6 years ago

I've unrolled the recursion into an iteration. No noticeable change in speed. I've also added a stress-test for _align_segment that easily made the recursive approach fall over, but gives the iterative approach no problems whatsoever.

sTywin commented 6 years ago

Here are a few different options for the final return statement.

Option 1 (current; nested list comprehension):

return [(a, b+1) for (x, y) in zip(beginings, endings) for (a, b) in _align_segment(x, y, alignment)]

Option 2 (verbose; explicit double loop with append):

subsegments = []
for (x, y) in zip(beginings, endings):
    for(a, b) in _align_segment(x, y, alignment):
        subsegments += [ (a, b+1) ]
return subsegments

Option 3 (itertools.chain):

seg_generators = [ _align_segment(x, y, alignment) for (x, y) in zip(beginings, endings) ]
return [ (a, b+1) for (a, b) in itertools.chain(*seg_generators) ]
sTywin commented 6 years ago

Option 4: same list comprehension as option 1, but with line breaks to show double-loop format of option 2:

return [(a, b+1)
        for (x, y) in zip(beginings, endings)
            for (a, b) in _align_segment(x, y, alignment)]
bialix commented 5 years ago

Sorry for not working on your pull request. I'm looking for a new maintainer for Python IntelHex project. I hope someone will help.