BugSplat-Git / macho-uuid


Are we reading the UUID correctly? #7

Open bobbyg603 opened 9 months ago

bobbyg603 commented 9 months ago

https://github.com/BugSplat-Git/macho-uuid/blob/c0d00f6a5e07f414f2af26169c0da351d437cdb1/src/macho.ts#L127-L136

We might need to read the UUID as either LE or BE, depending on the file's magic number.
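For reference, here is a rough sketch of how the byte order could be derived from the magic number. The constant values come from `<mach-o/loader.h>`; the `detectEndianness` helper and the `Endianness` type are hypothetical names for illustration and are not taken from `macho.ts`. Fat (universal) headers are deliberately left out.

```typescript
type Endianness = 'LE' | 'BE';

// Magic values from <mach-o/loader.h>, as seen when the first four
// bytes of the file are read as a big-endian 32-bit integer.
const MH_MAGIC = 0xfeedface;    // 32-bit, big-endian file
const MH_MAGIC_64 = 0xfeedfacf; // 64-bit, big-endian file
const MH_CIGAM = 0xcefaedfe;    // 32-bit, little-endian file
const MH_CIGAM_64 = 0xcffaedfe; // 64-bit, little-endian file

// Hypothetical helper: returns the byte order of a thin Mach-O file,
// or undefined if the first four bytes are not a thin Mach-O magic.
function detectEndianness(bytes: Uint8Array): Endianness | undefined {
  if (bytes.length < 4) {
    return undefined;
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const magic = view.getUint32(0, false); // false = read as big-endian
  switch (magic) {
    case MH_MAGIC:
    case MH_MAGIC_64:
      return 'BE';
    case MH_CIGAM:
    case MH_CIGAM_64:
      return 'LE';
    default:
      return undefined;
  }
}
```

The result would apply to multi-byte integer fields such as `cmd` and `cmdsize`. Whether it should also apply to the UUID payload is the open question in this issue.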

csmith0651 commented 9 months ago

It's a 16-byte value, in 4-, 2-, 2-, and 8-byte chunks. From what I've seen, the 8-byte chunk is read raw, but the other sections might have endianness issues. For instance, look at this ELFSharp code:

(https://github.com/konrad-kruczynski/elfsharp/blob/0f06793c31d9c2e431337e67d2abf2d87745995e/ELFSharp/MachO/UUID.cs#L26)

        private Guid ReadUUID()
        {
            var rawBytes = Reader.ReadBytes(16).ToArray();

            // Deal here with UUID endianess. Switch scheme is 4(r)-2(r)-2(r)-8(o)
            // where r is reverse, o is original order.
            Array.Reverse(rawBytes, 0, 4);
            Array.Reverse(rawBytes, 4, 2);
            Array.Reverse(rawBytes, 6, 2);

            var guid = new Guid(rawBytes);
            return guid;
        }

Reader is a simple endian-aware reader. What confuses me a little is why the code blindly reverses bytes 0-3, 4-5, and 6-7 rather than consulting the endianness of the reader.
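One possible explanation, not confirmed against ELFSharp's history: .NET's `Guid(byte[])` constructor interprets the first three fields (4, 2, and 2 bytes) as little-endian integers and the last 8 bytes as-is, so the 4-2-2 reversal may simply compensate for how `Guid` lays out its bytes rather than for the file's endianness. In `<mach-o/loader.h>` the `uuid_command` payload is declared as `uint8_t uuid[16]`, a plain byte array, which would not be byte-swapped the way integer fields are. If that reading is right, a TypeScript reader would format the 16 bytes in file order with no swapping. The sketch below shows both orderings side by side; `formatUuid` and `swapGuidFields` are hypothetical helper names, not functions from `macho.ts`.

```typescript
// Format 16 raw bytes as a UUID string in file order, with no swapping.
function formatUuid(bytes: Uint8Array): string {
  if (bytes.length !== 16) {
    throw new Error(`expected 16 bytes, got ${bytes.length}`);
  }
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join('-');
}

// Return a copy with the 4-2-2 prefix reversed and the last 8 bytes
// untouched, mirroring the three Array.Reverse calls in the ELFSharp snippet.
function swapGuidFields(bytes: Uint8Array): Uint8Array {
  const out = Uint8Array.from(bytes); // copy so the input is not mutated
  out.subarray(0, 4).reverse();
  out.subarray(4, 6).reverse();
  out.subarray(6, 8).reverse();
  return out;
}

// Example bytes chosen for illustration only.
const raw = new Uint8Array([
  0x12, 0x3e, 0x45, 0x67, 0xe8, 0x9b, 0x12, 0xd3,
  0xa4, 0x56, 0x42, 0x66, 0x55, 0x44, 0x00, 0x00,
]);

const inFileOrder = formatUuid(raw);               // 123e4567-e89b-12d3-a456-426655440000
const withSwap = formatUuid(swapGuidFields(raw));  // 67453e12-9be8-d312-a456-426655440000
```

Comparing both strings against the output of `dwarfdump --uuid` for the same binary should show which ordering the library needs to produce.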

Here's what ChatGPT wrote (which I don't think helps much):

 Yes, the bytes in the UUID stored within the `LC_UUID` command in a Mach-O file are typically presented in reverse order compared to the more commonly seen human-readable representation.

The UUID is a 128-bit (16-byte) value, usually represented as a string of 32 hexadecimal characters grouped in five sections separated by hyphens. For example, a typical UUID might look like this: `123e4567-e89b-12d3-a456-426655440000`.

When stored in the `LC_UUID` command within a Mach-O file, the bytes of the UUID are usually stored in little-endian order. This means that the byte order is reversed from how it's commonly displayed in a human-readable format. So, if you were to extract the UUID bytes directly from the Mach-O file, you'd find them in reverse order compared to the standard UUID representation.

For instance, using the previous example UUID (`123e4567-e89b-12d3-a456-426655440000`), the byte order in the Mach-O file would be reversed. However, when read and displayed as a UUID by a tool like `otool`, it will typically reverse the byte order to present it in the standard human-readable format.

It's important to note this byte order reversal is a common representation in the context of storage or encoding within files and memory, especially in little-endian systems, but the order may vary depending on the specific file format or system implementation. Always refer to the specifications or documentation related to the file format or tool you're working with for precise details on byte order and data representation.