decode_effect.py fails to parse CRN_CHO.ZD2 (MS-70CDR+ effect file) due to non-ascii character in TXE1 chunk

thammer commented 3 months ago

When I run python3 ./decode_effect.py --dump CRN_CHO.ZD2, the command fails with the following error:

construct.core.StringError: cannot use encoding 'ascii' to decode b'This is a model of tc electronic\x81fs CORONA CHORUS.\r\n\r\n'

Here are the corresponding hex values for the TXE1 chunk in the CRN_CHO.ZD2 file:

54 58 45 31 36 00 00 00 54 68 69 73 20 69 73 20 61 20 6D 6F 64 65 6C 20 6F 66 20 74 63 20 65 6C 65 63 74 72 6F 6E 69 63 81 66 73 20 43 4F 52 4F 4E 41 20 43 48 4F 52 55 53 2E 0D 0A 0D 0A
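For reference, the chunk can be inspected without Construct. This is a minimal sketch using only the standard library; it assumes the layout the hex dump suggests (a 4-byte tag followed by a little-endian 32-bit length), and split_chunk is a made-up helper name, not part of zoomzt2.py.

```python
import struct

# Hex dump of the TXE1 chunk, copied from the report above.
HEX_DUMP = (
    "54 58 45 31 36 00 00 00 54 68 69 73 20 69 73 20 61 20 6D 6F 64 65 6C 20 "
    "6F 66 20 74 63 20 65 6C 65 63 74 72 6F 6E 69 63 81 66 73 20 43 4F 52 4F "
    "4E 41 20 43 48 4F 52 55 53 2E 0D 0A 0D 0A"
)


def split_chunk(data: bytes):
    """Hypothetical helper: return (tag, length, payload) for one chunk."""
    # Assumed layout: 4-byte tag, then an unsigned 32-bit little-endian length.
    tag, length = struct.unpack_from("<4sI", data, 0)
    payload = data[8:8 + length]
    return tag, length, payload


chunk = bytes.fromhex(HEX_DUMP)
tag, length, payload = split_chunk(chunk)

# Offsets (within the payload) of bytes outside the 7-bit ASCII range.
non_ascii = [i for i, b in enumerate(payload) if b > 0x7F]
print(tag, length, len(payload), non_ascii)
```

The declared length matches the payload size, so the chunk itself looks structurally intact; only the single byte at the reported offset falls outside ASCII.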

There is a 0x81 byte in the chunk, which trips up the Construct library's decoding because it is not a valid ASCII character (ASCII is the encoding required in zoomzt2.py):

TXE1 = Struct(
    Const(b"TXE1"),
    "length" / Int32ul,
    "description" / PaddedString(this.length, "ascii"),
)

I have tried specifying other encodings, as described in the Construct library docs (https://construct.readthedocs.io/en/latest/api/strings.html#construct.core.possiblestringencodings), but none of them work.

I have searched extended ASCII tables and other encoding standards for one where 0x81 is an apostrophe, without success.

The next character "f" (0x66) is also a bit of a mystery. Is it just a typo, or is the 2-byte sequence 0x81 0x66 supposed to represent an apostrophe?
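One candidate worth checking is Shift JIS, where the two-byte sequence 0x81 0x66 maps to U+2019 (right single quotation mark), which would account for the stray "f". That the file was authored in Shift JIS is only a guess, not something the file format documents. A minimal sketch using Python's built-in codecs (try_decode is a hypothetical helper name):

```python
# Payload bytes taken from the error message above.
payload = b"This is a model of tc electronic\x81fs CORONA CHORUS.\r\n\r\n"


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Compare the strict ASCII decode with the Shift JIS guess.
results = {enc: try_decode(payload, enc) for enc in ("ascii", "shift_jis")}

for enc, text in results.items():
    print(f"{enc:10} -> {text!r}")
```

If other effect files contain further non-ASCII bytes, the same loop can be extended with more candidate encodings before settling on one.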

I haven't come up with a good solution for this. One could perhaps read the TXE1 chunk as an array of bytes (as is done for the TXJ1 chunk), replace 0x81 0x66 with an apostrophe, and then convert the data to a string. That seems like quite a hack, though, and I don't know how to make the Construct library do that for us, so it would probably result in some slightly ugly code.
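The workaround described above can be sketched in plain Python, outside Construct. The function name is hypothetical, and treating 0x81 0x66 as an apostrophe is the assumption made in this report rather than a documented rule:

```python
def decode_description(raw: bytes) -> str:
    """Hypothetical helper: patch the 0x81 0x66 pair, then decode as ASCII."""
    # Assumption from the report: 0x81 0x66 stands for an apostrophe.
    patched = raw.replace(b"\x81\x66", b"'")
    # errors="replace" keeps any other unexpected byte from raising;
    # trailing NUL padding is stripped, as a padded string field would do.
    return patched.decode("ascii", errors="replace").rstrip("\x00")


raw = b"This is a model of tc electronic\x81fs CORONA CHORUS.\r\n\r\n"
print(repr(decode_description(raw)))
```

Because the replacement changes the byte count, this only suits decoding; writing the file back would need the original bytes.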

mungewell commented 2 months ago

Seeing the same issue with MS-60B+ effects

-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 B_MOCT.ZD2.txt
-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 B_OCTAVE.ZD2.txt
-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 B_PITCH.ZD2.txt
-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 B_PLY_LT.ZD2.txt
-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 B_PLYOCT.ZD2.txt
-rw-rw-r-- 1 simon simon     0 Sep  6 20:40 HPS.ZD2.txt

mungewell commented 2 months ago

The MS-60B+ failures were a group name issue, not quite the same. Fixed that though.

Not 100% sure how to screen for a 0x81 byte. You can perhaps use an 'Adapter', where you define how to encode/decode. I used these in another project to handle the fact that values in MIDI are 7-bit:

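from construct import Adapter

class Midi2u(Adapter):
    # Wire format: two bytes carrying 7 data bits each; decoded value: 14 bits.
    def _decode(self, obj, context, path):
        return (obj & 0x7f) + ((obj & 0x7f00) >> 1)
    def _encode(self, obj, context, path):
        return (obj & 0x7f) + ((obj & 0x3f80) << 1)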
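To make the bit manipulation concrete, here is the same packing written as two plain functions (the names are hypothetical), assuming a 14-bit value split across two bytes that each carry 7 data bits:

```python
def unpack_7bit(wire: int) -> int:
    """Collapse two 7-bit bytes (held in a 16-bit integer) into a 14-bit value."""
    return (wire & 0x7F) + ((wire & 0x7F00) >> 1)


def pack_7bit(value: int) -> int:
    """Spread a 14-bit value over two bytes, leaving the top bit of each clear."""
    return (value & 0x7F) + ((value & 0x3F80) << 1)


wire = pack_7bit(0x1234)
print(hex(wire), hex(unpack_7bit(wire)))
```

An adapter for the TXE1 description would follow the same pattern, with the byte-to-string conversion in place of the shifts.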

mungewell commented 2 months ago

It is also possible that the file is 'just corrupt'; I've seen the weirdest things happen.

mungewell commented 2 months ago

I stumbled on 'FixedSized'; wrapped in 'Peek', it gives me None when parsing fails. Can you do something like this?

TXE1 = Struct(
    Const(b"TXE1"),
    "length" / Int32ul,
    # Peek yields None (without consuming input) if the ASCII decode fails.
    "peekdescription" / Peek(FixedSized(this.length, PaddedString(this.length, "ascii"))),
    # Fall back to raw bytes when the description is not plain ASCII.
    "description" / IfThenElse(lambda ctx: ctx.peekdescription is None,
        Bytes(this.length),
        PaddedString(this.length, "ascii"),
    ),
)
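In plain Python terms, the struct above is meant to behave roughly like the following sketch (the function name is hypothetical): the description comes back as a str when it is clean ASCII and as raw bytes otherwise, so callers need to handle both types.

```python
def parse_description(raw: bytes):
    """Hypothetical helper: str for clean ASCII, the untouched bytes otherwise."""
    try:
        # Strip trailing NUL padding, as a padded string field would.
        return raw.decode("ascii").rstrip("\x00")
    except UnicodeDecodeError:
        return raw


for sample in (b"Plain text.\r\n", b"tc electronic\x81fs CORONA CHORUS."):
    value = parse_description(sample)
    print(type(value).__name__, repr(value))
```

Returning two different types keeps the parse from failing, at the cost of pushing the decoding decision onto whoever consumes the description.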

mungewell / zoom-zt2, issue #76