eerimoq / bincopy

Mangling of various file formats that conveys binary information (Motorola S-Record, Intel HEX, TI-TXT, Verilog VMEM, ELF and binary files).
MIT License
109 stars 38 forks source link

Add set_binary method, allowing to patch data in existing segments #5

Closed toesus closed 8 years ago

toesus commented 8 years ago

I found this library very useful, however i missed the ability to patch data into existing files, at a given address.

Therefore i added the function set_binary() which allows to modify existing data. To simplify the implementation (and because i do not see a useful use-case) modifying data is only allowed inside existing segments.

Example usage:

sfile = bincopy.BinFile()
sfile.add_binary('hello world', address=1024)
sfile.set_binary('srec!', address=1030)
print sfile.as_hexdump()

produces: 00000400 68 65 6c 6c 6f 20 73 72 65 63 21 |hello srec! |

coveralls commented 8 years ago

Coverage Status

Coverage decreased (-0.9%) to 90.3% when pulling 625a7f4e54dc6b3d2959bb2b9707cf3105307583 on toesus:develop/set_data into 20a4a5fdd05888987b530050cb9e96cd1f58dcd1 on eerimoq:master.

coveralls commented 8 years ago

Coverage Status

Coverage increased (+0.4%) to 91.686% when pulling 94de7c8c109602743a9c1736e3013008bde5c16a on toesus:develop/set_data into 20a4a5fdd05888987b530050cb9e96cd1f58dcd1 on eerimoq:master.

eerimoq commented 8 years ago

Nice idea, but I think the implementation can be simplified to:

def set_binary(self, address, data):
    self.exclude(address, address + len(data))
    self.add_binary(data, address=address)

It's possible to set data inbetween segments as well using this implementation.

An alternative interface is to add an optional argument to the add_xxx() functions.

def add_xxx(.., overwrite=False)

I prefer this alternative over adding a new function.

What do you think?

toesus commented 8 years ago

Oh, I did not think of that, but it's much more elegant of course... I'm just worried about performance when patching lots of small amounts of data into larger segments (which would be my usecase)

I just did a short check similar to the test_performance() testcase. When i try to change each byte individually like this:

        chunk = 1024 * b"1"
        for i in range(1024):
            binfile.add_binary(chunk, 1024 * i)

        for i in range(1024 * 10):
            binfile.set_binary(b'2',address=i)

It takes about 7.5s with your proposal, but it completes in about 30ms for my implementation.

Maybe we could treat 2 cases: if the new data is completely inside the existing segment, we directly edit the segment-data. Only if the new data is partially-overlapping with a segment then we can do the exclude() + add_binary() method.

eerimoq commented 8 years ago

Ohh wow I didn't expect the performance to be that much worse with the exclude()+add_binary() implementation. That's not acceptable.

Yeah, directly modifying the segment when possible and otherwise use exclude()+add_binary() sound like a good solution.

I started thinking about allowing overwriting data in a similar way for srec and ihex, but maybe that is overkill.

What about adding an optional argument to add_binary(..., overwrite=False) instead of a new function? It's a question about taste, the toughest interface descisions of them all =)

eerimoq commented 8 years ago

I couldn't resist to add the optional overwrite argument to all add-functions. If overwrite is set to True any data already present at given address is overwritten, modifying the existing segment for good performance. The script you wrote in a comment earlier finished in 50 ms.

The implementation is pushed to this repository and released on PyPi, version 7.1.0.

toesus commented 8 years ago

Cool! I just tested your latest version and it works great. I think it was a good decision to keep the interfaces backwards-compatible...

Thank you for adding this feature so quickly, i will close this pull request.

eerimoq commented 8 years ago

No problem.

I'm glad you find the package useful!