Ryan-rsm-McKenzie / bsa

C++ library for working with the Bethesda archive file format
MIT License
40 stars 6 forks source link

RFC regarding breaking API changes needed to implement Starfield support. #14

Open Ryan-rsm-McKenzie opened 9 months ago

Ryan-rsm-McKenzie commented 9 months ago

With the introduction of the v2 and v3 variants of the .ba2 format in Starfield, the archive format is no longer sufficient to discriminate between data format variants. Most functions now also require a compression_format to handle compression properly, which means that configuring inputs to functions has become overly obnoxious. As a result, I have taken the liberty to convert many functions to take some form of *_param. This offers users the flexibility to override only the fields they need, while leaving others untouched by leveraging designated initializers. Unfortunately, this has the undesired side effect of being the largest API break this library has had thus far. This issue stands as a forum for users to provide feedback on the new API changes: whether they like them/don't like them/think they could be better/etc.

For reference, the .ba2 format is described in detail here.

Ryan-rsm-McKenzie commented 9 months ago

@Guekka Mentioning you on this issue, since I know you're waiting for me to implement Starfield support.

Guekka commented 9 months ago

This looks good to me. I don't mind the API break About the new API, it's good. I think that's exactly why designated initializers were introduced. Are you sure zip should be the default for compression? I would use lz4 personally

Ryan-rsm-McKenzie commented 9 months ago

The defaults are tuned for the first game that introduced the format. lz4 compression was introduced in .ba2 v3 which is currently starfield only. v1 archives are compatible with starfield out of the box, so the defaults work for all games.

It's also not clear if lz4 compression is only available for texture archives, or if it works with general archives too.

Guekka commented 9 months ago

That makes sense

Guekka commented 2 months ago

@Ryan-rsm-McKenzie So, I've finally updated CAO (beta) to use this version. Everything appears to work correctly. There's just one minor oddity: I do not see why chunk::decompress requires a compression_format. In my opinion, it would make more sense for it to be kept as internal state rather than putting the concern on the end user

Guekka commented 2 months ago

Is there any reason for not always using lz4 with starfield?

Guekka commented 2 months ago

Some warnings with g++ (13) bsa.log

Ryan-rsm-McKenzie commented 2 months ago

@Ryan-rsm-McKenzie So, I've finally updated CAO (beta) to use this version. Everything appears to work correctly. There's just one minor oddity: I do not see why chunk::decompress requires a compression_format. In my opinion, it would make more sense for it to be kept as internal state rather than putting the concern on the end user

Individual chunks don't know whether they were compressed using zlib or lz4. This information is stored in the archive header, and you have to forward it along. Storing this information in the chunk could noticeably bloat an archive's in-memory size by several kb. Texture archives would be impacted the worst.

Is there any reason for not always using lz4 with starfield?

Technically the format permits compressing ba2 archives using lz4, but it's still not clear whether the game will actually load them correctly, or whether it assumes all general archives are compressed using zlib. When I last checked, only texture archives were compressed using lz4, and I wouldn't put it past Bethesda to make stupid assumptions like that.

Some warnings with g++ (13)
bsa.log

Fixing this required a breaking change. I had to add an underscore after every conflicting name:

bsa::fo4::archive::meta_info{
  .format = bsa::fo4::format::general,
  .version = bsa::fo4::version::v1,
  .compression_format = bsa::fo4::compression_format::zip,
};
// ^^^ BEFORE / AFTER vvv
bsa::fo4::archive::meta_info{
  .format_ = bsa::fo4::format::general,
  .version_ = bsa::fo4::version::v1,
  .compression_format_ = bsa::fo4::compression_format::zip,
};
Guekka commented 2 months ago

Individual chunks don't know whether they were compressed using zlib or lz4. This information is stored in the archive header, and you have to forward it along. Storing this information in the chunk could noticeably bloat an archive's in-memory size by several kb. Texture archives would be impacted the worst.

Do we really care about a few kilobytes in the context of loading big archives? Even when loading 100'000 files, it would be something like 3 megabytes overhead at worst. In my opinion, the convenience outweights the overhead. I understand your point though, some library users might prefer the current way

Technically the format permits compressing ba2 archives using lz4, but it's still not clear whether the game will actually load them correctly, or whether it assumes all general archives are compressed using zlib. When I last checked, only texture archives were compressed using lz4, and I wouldn't put it past Bethesda to make stupid assumptions like that.

That unfortunately makes sense, coming from Bethesda. Thanks

Guekka commented 3 weeks ago

@Ryan-rsm-McKenzie I have now released CAO7 beta with starfield support. I think this branch can be merged 👍