eerimoq / bincopy

Mangling of various file formats that convey binary information (Motorola S-Record, Intel HEX, TI-TXT, Verilog VMEM, ELF and binary files).
MIT License

Feature request: Support Microchip HEX format #38

Closed: bessman closed this issue 11 months ago

bessman commented 11 months ago

The Microchip HEX format is identical to the Intel format, except the addresses in the HEX file are twice the actual machine addresses. This is because Microchip's PIC architecture typically has wider instructions than its machine word. For example, the PIC24 uses 16-bit words but 24-bit instructions.
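As a rough illustration of the address doubling described above, the mapping can be sketched in plain Python. The helper names and the example address 0x200 are made up for this sketch and are not part of bincopy.

```python
def hex_to_machine_address(hex_address: int) -> int:
    """Map an address found in a Microchip HEX file to a machine address.

    Assumes the file address is exactly twice the machine address,
    as described in the comment above.
    """
    return hex_address // 2


def machine_to_hex_address(machine_address: int) -> int:
    """Map a machine address to the address used in the HEX file."""
    return machine_address * 2


# Illustrative values only: machine address 0x200 would be written
# to the HEX file as 0x400, and read back as 0x200.
print(hex(machine_to_hex_address(0x200)))
print(hex(hex_to_machine_address(0x400)))
```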

This can be handled by parsing the file as Intel format with a word size of one byte, and then changing the word size to two bytes after parsing is done.

I can make a PR if you think this would be a useful feature. It could be enabled with an optional microchip=False argument to BinFile.

eerimoq commented 11 months ago

Can the format be handled without a flag to Bincopy? I prefer to have format logic in add*() methods.

bessman commented 11 months ago

It could be done like this:

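def add_microchip_hex(self, records, overwrite=False):
    # Parse as plain Intel HEX with one-byte words, so the doubled
    # addresses in the file are taken as-is.
    self.word_size_bytes = 1
    self.add_ihex(records, overwrite)
    # Switch everything to two-byte words once parsing is done.
    self.word_size_bytes = 2
    self.segments.word_size_bytes = 2
    for segment in self.segments:
        segment.word_size_bytes = 2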

def add_microchip_hex_file(self, filename, overwrite=False):
    with open(filename, 'r') as fin:
        self.add_microchip_hex(fin.read(), overwrite)

However, since Microchip's format is indistinguishable[^1][^2] from Intel, users would have to directly call add_microchip_hex_file:

import bincopy
# binfile = bincopy.BinFile("microchip.hex")  # Won't work: parsed as plain Intel HEX.
binfile = bincopy.BinFile()
binfile.add_microchip_hex_file("microchip.hex")

[^1]: In practice, I think every fourth byte in a Microchip HEX is guaranteed to be zero (even though the example given in Microchip's format definition contradicts this), which could in theory be used to determine if a file is a normal Intel HEX or a Microchip HEX. Seems a little fragile to rely on that, since it's not explicit in the format definition.
[^2]: Scratch that, that is only the case for 16-bit word / 24-bit instruction. For PIC18 (8-bit word / 16-bit instruction) there is no such pattern. I think the formats are truly indistinguishable.
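For reference, the every-fourth-byte observation from the footnotes could be checked with something like the sketch below. The function name is hypothetical, and it assumes the data starts on a four-byte boundary. As the footnotes point out, this would only hint at a 16-bit word / 24-bit instruction file. It says nothing about PIC18, and an ordinary Intel HEX file could pass the check by coincidence.

```python
def every_fourth_byte_is_zero(data: bytes) -> bool:
    """Return True if the bytes at offsets 3, 7, 11, ... are all zero.

    Hypothetical heuristic, not part of bincopy. Assumes `data` begins
    at a four-byte-aligned address. A positive result is only a hint,
    not proof, that the data came from a Microchip HEX file.
    """
    if len(data) < 4:
        # Too short to contain a complete four-byte group.
        return False
    return all(byte == 0 for byte in data[3::4])


# Two four-byte groups whose last byte is zero.
padded = bytes([0x01, 0x02, 0x03, 0x00, 0x04, 0x05, 0x06, 0x00])
# One four-byte group with a non-zero last byte.
dense = bytes([0x01, 0x02, 0x03, 0x04])

print(every_fourth_byte_is_zero(padded))
print(every_fourth_byte_is_zero(dense))
```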

eerimoq commented 11 months ago

Looks like a good plan.