blocksds / sdk

Main BlocksDS SDK repository
https://blocksds.github.io/docs/

Versioning Extensions for the GRF File Format #206

Open Garhoogin opened 1 month ago

Garhoogin commented 1 month ago

I propose here the idea of including file versioning information in the GRF file format to allow for file format extensions.

The idea I had in mind for such a system is the addition of a new block type with the signature XVER. Decoders can check for the presence of this block to detect that the file was produced by a newer encoder version, and then choose to either handle or reject such files. My idea for this block is to have a 2-byte format version (major and minor version), followed by an optional encoder ID string. The layout would be as follows:

    Offset  Type       Name
    0       u8         Major version
    1       u8         Minor version
    2       u8         Encoder ID length
    3       char[...]  Encoder ID
    ...     (padding to a multiple of 4 bytes)

Here the major version should be incremented for any breaking change to the file format that an older decoder should reject. Ideally, this block would come before the HDR block to signal to the decoder early on what format of data it should expect to see. I here propose that the file version to use with the BlocksDS version of the format should be at least 2.0.

The purpose of the Encoder ID would be to identify what software was used to produce the file. Since this is not strictly necessary for a runtime application to decode the file, I've made it optional in this outline. It would be a counted string of up to 255 octets in length, in no particular encoding.
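As a rough illustration of how a decoder might consume this block, here is a minimal sketch in C. The names `XverInfo`, `xver_parse` and `XVER_MAX_SUPPORTED_MAJOR` are hypothetical and not part of grit or libnds. It assumes the decoder only rejects major versions newer than the one it knows; how older major versions are handled is a policy choice left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Assumption: this hypothetical decoder understands format versions up to 2.x.
#define XVER_MAX_SUPPORTED_MAJOR 2

typedef struct {
    uint8_t major;
    uint8_t minor;
    uint8_t id_len;
    char    id[256]; // Encoder ID copied out and NUL-terminated for convenience
} XverInfo;

// Parses the payload of an XVER block (the bytes after the block header).
// Returns 0 on success, -1 if the payload is truncated, and -2 if the major
// version is newer than this decoder supports.
int xver_parse(const uint8_t *payload, size_t size, XverInfo *info)
{
    if (size < 3)
        return -1; // Not enough room for major, minor and ID length

    info->major  = payload[0];
    info->minor  = payload[1];
    info->id_len = payload[2];

    if ((size_t)info->id_len > size - 3)
        return -1; // The counted string runs past the end of the payload

    // The ID has no defined encoding, so it is treated as opaque bytes.
    memcpy(info->id, payload + 3, info->id_len);
    info->id[info->id_len] = '\0';

    if (info->major > XVER_MAX_SUPPORTED_MAJOR)
        return -2; // Breaking change this decoder does not know about

    return 0;
}
```

A runtime loader would only need the return value. The encoder ID is there for tools and debugging.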

An example of the block, using grit as the encoder:

00000000:  58 56 45 52 18 00 00 00 02 00 12 67 72 69 74 20   XVER.......grit 
00000010:  76 31 2E 36 2E 30 2D 62 6C 6F 63 6B 73 00 00 00   v1.6.0-blocks...
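For reference, the dump above can be reproduced with a small encoder sketch. `xver_write` is a hypothetical helper, not an existing grit function. It assumes, as the example suggests, that the 4-byte size field after the signature is little-endian and holds the padded payload size (0x18 = 24 here).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: writes an XVER block into `out`. Returns the total
// number of bytes written (8-byte block header plus padded payload), or 0 if
// the buffer is too small or the encoder ID is longer than 255 bytes.
size_t xver_write(uint8_t *out, size_t out_size,
                  uint8_t major, uint8_t minor, const char *encoder_id)
{
    size_t id_len = (encoder_id != NULL) ? strlen(encoder_id) : 0;
    if (id_len > 255)
        return 0;

    size_t payload = 3 + id_len;                // major, minor, length, ID
    size_t padded  = (payload + 3) & ~(size_t)3; // round up to a multiple of 4
    size_t total   = 8 + padded;
    if (out_size < total)
        return 0;

    memset(out, 0, total);                      // also zero-fills the padding
    memcpy(out, "XVER", 4);                     // block signature

    // Block size, little-endian. It never exceeds 260, so the two upper
    // bytes stay zero from the memset above.
    out[4] = (uint8_t)(padded & 0xFF);
    out[5] = (uint8_t)((padded >> 8) & 0xFF);

    out[8]  = major;
    out[9]  = minor;
    out[10] = (uint8_t)id_len;
    if (id_len > 0)
        memcpy(out + 11, encoder_id, id_len);

    return total;
}
```

Calling `xver_write(buf, sizeof(buf), 2, 0, "grit v1.6.0-blocks")` fills 32 bytes: 8 bytes of block header, 21 bytes of payload and 3 bytes of padding.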

I believe this would allow the format to be more easily extended, as no such system is currently in place with the format the way it is.

AntonioND commented 1 month ago

I think adding a version chunk makes the format more complicated than it has to be. I think it's enough to have a uint8_t that is incremented with each version that breaks compatibility. This value can be added to the "HDR " chunk, but the current "HDR " chunk isn't the same as the original one pre-BlocksDS, so I can just rename "HDR " to "HDR2".

I wouldn't add information about the tool that has generated the file, it's unnecessary information that makes the binary file grow for no good reason. On the DS we should really aim to be small in this case.

Garhoogin commented 1 month ago

My thought was that if you really wanted small files, you'd probably be using a better format anyways, and a compressor a little more intelligent than grit's. But I can understand where you're coming from. A simple u8 in the header would probably suffice.

AntonioND commented 1 month ago

A better format like what? GRF looks pretty compact to me, but I may be missing something.

I think I'm just going to switch from "HDR " to "HDRX" and add a version number to the chunk. I'll do this tomorrow, sorry for not being very responsive this week!

asiekierka commented 1 month ago

If you want to preserve backwards compatibility, repurpose the high byte of one of the Attr values:

    uint16_t gfxAttr;  ///< BPP of graphics (or GRFTextureTypes). 0 if not present.
    uint16_t mapAttr;  ///< BPP of map (16 or 8 for affine). 0 if not present.
    uint16_t mmapAttr; ///< BPP of metamap (16). 0 if not present.

If you don't, make the version number at least a uint16_t. However, I am somewhat unconvinced by the idea, as there's no central authority on which version numbers are occupied by which encoders or version formats - any fork of Grit, or even dkP's upstream, could easily disagree with us in that regard. What if two different version 3s appear?
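To make the first option concrete, here is a sketch of how the high byte of an Attr field could carry a version. The helper names are hypothetical. It assumes every existing attribute value (BPP or texture type) fits in the low 8 bits, so files written before the change read back as version 0.

```c
#include <stdint.h>

// Low byte: the existing attribute value (BPP or texture type).
static inline uint8_t grf_attr_value(uint16_t attr)
{
    return (uint8_t)(attr & 0xFF);
}

// High byte: hypothetical format version. It is 0 for any file whose
// attribute was written as a plain small integer.
static inline uint8_t grf_attr_version(uint16_t attr)
{
    return (uint8_t)(attr >> 8);
}

// Combines a version and an attribute value into one 16-bit field.
static inline uint16_t grf_attr_pack(uint8_t version, uint8_t value)
{
    return (uint16_t)(((uint16_t)version << 8) | value);
}
```

For instance, `grf_attr_pack(2, 8)` yields `0x0208`, from which the two accessors recover version 2 and a value of 8.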

AntonioND commented 1 month ago

@asiekierka That's why I want to use a completely different header chunk ID. If you don't find the ID you're expecting, you're using the wrong version.
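In code, that check is just a comparison of the 4-byte chunk ID. The following is a sketch with hypothetical function names; the "HDRX" ID is the one proposed earlier in this thread, and nothing is assumed about the rest of the chunk layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Returns true if the buffer starts with the given 4-character chunk ID.
// `id` must point to at least 4 characters.
bool grf_chunk_has_id(const uint8_t *chunk, size_t size, const char *id)
{
    return size >= 4 && memcmp(chunk, id, 4) == 0;
}

// A loader written for the new format only accepts "HDRX". Files that still
// carry the old "HDR " header are rejected as the wrong version.
bool grf_header_is_supported(const uint8_t *chunk, size_t size)
{
    return grf_chunk_has_id(chunk, size, "HDRX");
}
```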

AntonioND commented 1 month ago

https://github.com/blocksds/grit/pull/7

https://github.com/blocksds/libnds/pull/131

Draft.

Garhoogin commented 1 month ago

> A better format like what? GRF looks pretty compact to me, but I may be missing something.
>
> I think I'm just going to switch from "HDR " to "HDRX" and add a version number to the chunk. I'll do this tomorrow, sorry for not being very responsive this week!

I suppose "better" probably isn't the right word. I meant to say that there's a fair bit of overhead in the format, so anyone worried about file size probably shouldn't be using it. But I do imagine that the whole version block could be a bit much.

> @asiekierka That's why I want to use a completely different header chunk ID. If you don't find the ID you're expecting, you're using the wrong version.

I feel different block IDs would do a good job of not confusing someone else's decoder. It's probably not super important that the format be strictly binary compatible, since people can just re-convert their assets when upgrading their tooling.

AntonioND commented 1 month ago

I've just pushed the changes to change the ID to HDRX. Let's go for it for a bit and see if this is enough.

Did you decide what to do about Tex4x4 textures? About whether both chunks are compressed together or not?

Garhoogin commented 1 month ago

> I've just pushed the changes to change the ID to HDRX. Let's go for it for a bit and see if this is enough.
>
> Did you decide what to do about Tex4x4 textures? About whether both chunks are compressed together or not?

Gotcha, I'll implement the HDRX change on my end too.

If I were designing the format, I would personally compress the palette index data and texel data separately, along with storing the compressed size of the texture image data (so the palette index data can be located). In my view, this would make the data easier to use, without having to decompress both blocks together at once. Then again, this kind of "ruins" the generality of the GFX block, making the block structure different for 2D and 3D graphics. Alternatively, a new block could be introduced for the palette index data, maybe a PIDX block, which I think would eliminate this issue. I feel like libnds's GL-inspired library requiring the data to be concatenated, only to separate them again, is a bit cumbersome. But that's just my 2 cents on that.

AntonioND commented 1 month ago

I've been thinking a lot about this. The problem with having PIDX as well as GFX is that the logic when loading Tex4x4 images would be more complicated. You'd have to either read the whole file and then check if you have the two chunks, or scan the file for both chunks and then load the file. If we pack both things as part of the GFX chunk, it's a lot easier to check if the data there is correct or not.

I'm not sure how to do this, though. Maybe we can have a uint32_t right at the start of the GFX chunk, which represents the size of the texel data, and then we can have the texel data (compressed or not) and the palette index data (compressed or not)?

To be honest, I'm really not sure about what the best option is here. With my suggestion, at least you can always assume you have the texture data as long as you have a GFX chunk.
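For comparison, the size-prefixed GFX layout could be split like this. This is only a sketch of that suggestion: `Tex4x4Parts` and `gfx_split_tex4x4` are hypothetical names, the prefix is assumed to be little-endian, and the two parts are returned as stored (compressed or not).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *texel;      // Texel data, as stored
    size_t         texel_size;
    const uint8_t *pidx;       // Palette index data, as stored
    size_t         pidx_size;
} Tex4x4Parts;

// Splits a GFX payload laid out as: u32 texel size, texel data, palette
// index data. Returns 0 on success, or -1 if the prefix is missing or
// points past the end of the payload.
int gfx_split_tex4x4(const uint8_t *payload, size_t size, Tex4x4Parts *parts)
{
    if (size < 4)
        return -1;

    // Assumption: the size prefix is stored in little-endian byte order.
    uint32_t texel_size = (uint32_t)payload[0]
                        | ((uint32_t)payload[1] << 8)
                        | ((uint32_t)payload[2] << 16)
                        | ((uint32_t)payload[3] << 24);

    if (texel_size > size - 4)
        return -1;

    parts->texel      = payload + 4;
    parts->texel_size = texel_size;
    parts->pidx       = payload + 4 + texel_size;
    parts->pidx_size  = size - 4 - texel_size;
    return 0;
}
```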

However, all of this is meaningless because you also need to check if you have a PAL chunk.

Let's just go with your PIDX idea...
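With separate chunks, a loader would scan for all of them before loading a Tex4x4 texture. Below is a minimal sketch of that scan. The function names are hypothetical, and several things are assumed: `data` points at the first chunk after any container header, each chunk is a 4-byte ID followed by a 4-byte little-endian size and then the payload, the stored size is the exact distance to the next chunk header, and the IDs are "GFX " and "PAL " with trailing spaces, by analogy with "HDR ".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Walks a flat sequence of chunks and returns a pointer to the payload of
// the first chunk whose ID matches, or NULL if it is absent or the chunk
// list is malformed. `payload_size` may be NULL if the size is not needed.
const uint8_t *grf_find_chunk(const uint8_t *data, size_t size,
                              const char *id, uint32_t *payload_size)
{
    size_t pos = 0;

    while (size - pos >= 8) {
        const uint8_t *hdr = data + pos;
        uint32_t chunk_size = (uint32_t)hdr[4]
                            | ((uint32_t)hdr[5] << 8)
                            | ((uint32_t)hdr[6] << 16)
                            | ((uint32_t)hdr[7] << 24);

        if (chunk_size > size - pos - 8)
            return NULL; // Chunk claims more data than the buffer holds

        if (memcmp(hdr, id, 4) == 0) {
            if (payload_size != NULL)
                *payload_size = chunk_size;
            return hdr + 8;
        }

        pos += 8 + (size_t)chunk_size;
    }

    return NULL;
}

// A Tex4x4 texture needs texel data, palette index data and a palette.
bool grf_has_tex4x4_chunks(const uint8_t *data, size_t size)
{
    return grf_find_chunk(data, size, "GFX ", NULL) != NULL
        && grf_find_chunk(data, size, "PIDX", NULL) != NULL
        && grf_find_chunk(data, size, "PAL ", NULL) != NULL;
}
```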