WerWolv / ImHex-Patterns

Hex patterns, include patterns and magic files for the use with the ImHex Hex Editor
https://github.com/WerWolv/ImHex
GNU General Public License v2.0
645 stars, 169 forks

request: VGM pattern #180

Open ghost opened 11 months ago

ghost commented 11 months ago

Hi

I would like to request a VGM (Video Game Music) pattern. You can see the header here:

https://vgmrips.net/wiki/VGM_Specification

I'm a total beginner and can't make this myself :( I hope somebody can create this pattern (for the latest VGM version).

itsmeow commented 11 months ago

I think a lot of pattern-writers are/were beginners once! I'd suggest taking a look at some other simple patterns to learn how they work, then modifying one as you go. I was able to make a basic one with no prior experience, using only the documentation and the existing pattern files.

paxcut commented 11 months ago

I looked at the specification for VGM and that's a fairly complex format. For what purpose do you need the ImHex pattern?

ghost commented 11 months ago

> I looked at the specification for VGM and that's a fairly complex format. For what purpose do you need the ImHex pattern?

Exactly what you say: it's complex and I want to learn it using a pattern for ImHex. That's why I'm interested in a pattern.

WerWolv commented 11 months ago

To me the format looks fairly straightforward. It's just large. My recommendation is to start slow: maybe write a pattern that parses only the header and some of the data after it. Then, once you're more comfortable, implement all the structs whose offsets are stored in the header.

As an example, I'd start like this:

#include <type/magic.pat>

struct Header {
    type::Magic<"Vgm "> ident;
    u32 eofOffset;
    u32 version;
    // ...
};

struct VGM {
    Header header;
};

VGM vgm @ 0x00;
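
A minimal sketch of the next step described above, following an offset stored in the header, might look like this. It assumes, based on the linked VGM specification rather than anything stated in this thread, that the GD3 offset sits at 0x14, is relative to its own position, and is 0 when no tag is present. The `Gd3` struct and all field names beyond the first three are illustrative, so check them against the spec.

```hexpat
#include <type/magic.pat>

// GD3 tag header (layout assumed from the VGM specification)
struct Gd3 {
    type::Magic<"Gd3 "> ident;
    u32 version;
    u32 dataLength; // size in bytes of the string data that follows
};

struct Header {
    type::Magic<"Vgm "> ident;
    u32 eofOffset;
    u32 version;
    u32 sn76489Clock;
    u32 ym2413Clock;
    u32 gd3Offset; // stored at 0x14, relative to 0x14; 0 means no GD3 tag
    // ...
};

struct VGM {
    Header header;

    // Only place the tag if the header says one exists
    if (header.gd3Offset != 0) {
        Gd3 gd3 @ 0x14 + header.gd3Offset;
    }
};

VGM vgm @ 0x00;
```

The same idea (read the offset, add the base it is relative to, place a struct there) should carry over to the other offsets in the header.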

paxcut commented 11 months ago

The complexity of the format far surpasses the complexity of the concepts needed to write a pattern in general. The absolute best way to learn a complex format is to write the pattern for it so that you learn all the intricacies of how the data is stored.

> To me the format looks fairly straight forward.

Emphasis should be placed on the 'To me' part. To me, a simple format is one I can read once and have a mental image of. I couldn't do that with this one. But I second the recommendation. Learning the basics of the pattern language is not difficult at all.

applecuckoo commented 2 months ago

Hey! I've been working on the general layout of this pattern for a few days and I've posted the result here as a GitHub Gist. It's got quite a few problems, so I'm not going to PR it in. Here are a few problems:

paxcut commented 2 months ago

It shouldn't be too hard to implement your own decoding based on what you said. The one ImHex defines works on 8 bits per digit, and you want to use 4 bits instead. This is the code it uses. Can it be modified to handle the particular packed version you want to use?

struct BCD<auto Digits> {
    u8 bytes[Digits];
} [[sealed, format_read("type::impl::format_bcd")]];

fn format_bcd(ref auto bcd) {
    str result;

    for (u32 i = 0, i < sizeof(bcd.bytes), i += 1) {
        u8 byte = bcd.bytes[i];
        if (byte >= 10)
            return "Invalid";

        result += std::format("{}", byte);
    }

    return result;
};
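
One way the code above could be adapted for packed BCD (two digits per byte) is sketched below. This is an untested illustration, not part of the ImHex library: `PackedBCD` and `format_packed_bcd` are hypothetical names, and the loop assumes the value is stored little-endian (as the VGM version field is, per the specification), so it walks the bytes from last to first to print the most significant digits first.

```hexpat
#include <std/io.pat>

// Hypothetical packed variant: two decimal digits per byte
struct PackedBCD<auto Bytes> {
    u8 bytes[Bytes];
} [[sealed, format_read("format_packed_bcd")]];

fn format_packed_bcd(ref auto bcd) {
    str result;

    // Little-endian assumption: the last byte holds the most significant digits
    for (u32 i = sizeof(bcd.bytes), i > 0, i = i - 1) {
        u8 byte = bcd.bytes[i - 1];
        u8 hi = (byte >> 4) & 0x0F; // upper nibble, first digit of the pair
        u8 lo = byte & 0x0F;        // lower nibble, second digit of the pair

        // A nibble above 9 is not a valid decimal digit
        if (hi >= 10 || lo >= 10)
            return "Invalid";

        result += std::format("{}{}", hi, lo);
    }

    return result;
};
```

Under those assumptions, something like `PackedBCD<4> version @ 0x08;` would display the version digits in reading order, including leading zeros. Shifting and masking a `u8` avoids bitfields here.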

I don't understand what the problem is with bits 30 and 31. All you mention is that they are not being read. To read a single bit, one usually creates a mask and ANDs (&) it with the value, but it is also possible to use bitfields for a more direct approach.
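
As a rough sketch of the masking approach, assuming the clock fields are 32-bit values whose top two bits are flags and whose lower 30 bits hold the clock itself: `ClockField` and `format_clock` are made-up names for illustration, and the meaning of each flag should be taken from the VGM specification.

```hexpat
#include <std/io.pat>

// Hypothetical wrapper for a 32-bit clock value with flags in bits 30 and 31
struct ClockField {
    u32 raw;
} [[sealed, format_read("format_clock")]];

fn format_clock(ref auto clock) {
    // Mask off the top two bits to keep only the clock value
    u32 hz = clock.raw & 0x3FFFFFFF;

    // Shift the bit of interest down to position 0, then mask it
    u8 bit30 = (clock.raw >> 30) & 1;
    u8 bit31 = (clock.raw >> 31) & 1;

    return std::format("{} Hz, bit 30 = {}, bit 31 = {}", hz, bit30, bit31);
};
```

Declaring a header field as `ClockField` instead of `u32` would then show the decoded clock and both flag bits in the pattern data view.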

As for the last problem, I'm not sure what the error message means: there is no variable named data, and without a test input file to run the pattern against, it is hard to track down where the error originates. Can you supply one or suggest a source for it?

applecuckoo commented 2 months ago

@paxcut I've been using this file for testing - you should just be able to get it by clicking the 'Donload' button (no, that's not a typo).

I might just leave the dual-chip and chip selection bits, since they aren't essential to decoding the file.

As for the packed BCD situation, it would probably require a bit of a different approach, since I'd have to use bitfields instead of u8 values. I might come back to this in the future once I'm more comfortable with pattern writing.

applecuckoo commented 3 weeks ago

Finally managed to blast through all of this - the final result is in PR #294. The version detection works now, and that issue with the null strings in the metadata has since been fixed in 8f1f491.