DaemonEngine / crunch

Advanced DXTc texture compression and transcoding library and tool, upgraded with Unity improvements, added features, extended system and hardware support, deeply tested.
https://github.com/DaemonEngine/crunch
Other
16 stars 6 forks source link

Support horizontally flipped TGA images #68

Closed illwieckz closed 4 months ago

illwieckz commented 4 months ago

Support horizontally flipped TGA images.

The TGA format is a bottom-left format by default: the first byte of the first column is expected to be displayed at the bottom-left of the screen, this behavior is inherited from devices painting the screen from the bottom left to the top right.

The TGA format provides two bits in the image descriptor byte to paint the data from other sides. The 4th bit tells the device to paint the screen from right-to left and the 5th bit tells the device to paint the screen from top to bottom.

So basically:

Previously stb_image only read the 5th bit and then only supported the loading of vertically flipped images, stb_image was ignoring the 4th bit coding the horizontal flip. Now both flipping directions are supported for both raw and RLE storage.

illwieckz commented 4 months ago

This is tested against existing raw TGA test files already provided in the repository, and additional RLE TGA test files provided with the PR.

The raw TGA ones are from https://gitlab.com/illwieckz/investigating-tga-orientation-in-imagemagick

They are generated with a script written for the only purpose of generating those reference images for debugging image libraries and tools.

The RLE TGA images were converted from the raw TGA ones using GIMP and some tweaks, as the GIMP exporter only supports bottom-left and top-left:

illwieckz commented 4 months ago

We are in need for a reliable tool we can recommend that is able to convert TGA to PNG, as TGA is the de-facto legacy idTech3 lossless image format and we prefer to store PNG or WebP images in repository (and cwebp expects PNG as input), so we need to reliably convert TGA files to PNG.

ImageMagick convert is not reliable, it is known to be buggy for years, it is also well known that there is no way to know if a given ImageMagick convert version is affected or not:

The bug is considered to not be a bug by the maintainer despite countless reports and a complete study having been published with an alternate from-scratch implementation written against the specifications to compare:

For reference, here are the specifications:

The crunch repository being a required dependency of the engine because of the inc/crn_decomp.h transcoder header library, the CRN format being the recommended image formats for the Unvanquished releases, and the Dæmon crunch tool being the recommended tool to be used in the Unvanquished production pipeline, the crunch tool is the best candidate for a recommended tool for converting TGA files to PNG, as we already maintain it and make sure it works and is available to contributors.

illwieckz commented 4 months ago

I plan to submit the stb_image changes to upstream.

illwieckz commented 4 months ago

I added some checks of what crunch produces, it confirmed i686 without SSE2 doesn't produce same file than all other tested platforms, as discussed there:

When passing the -msse2 flag the checksum is the same as the one computed on other platforms.

illwieckz commented 4 months ago

The CI now disables fast math (and prefer SSE over x87) to make produced CRN reproductible enough, otherwise there is no easy way to make a CI checking the CRN output of crunch.

slipher commented 4 months ago

The TGA parsing code seems OK.

I'm skeptical of the testing methodology with a bunch of hard-coded hashes of expected results. It would make a simple change like https://github.com/DaemonEngine/crunch/pull/35 quite burdensome. I wonder what these tests are intended to verify:

illwieckz commented 4 months ago

I wonder what these tests are intended to verify:

* Is it intended as a change detector, so that you don't unintentionally change the results when you are trying to make a refactoring commit?

* Is it supposed to check that results are "reproducible" between compilers?

It is intended as a change detector, to make sure our code or libraries we may update don't break compatibility. I got vaccinated from the ImageMagick fiasco, I know well how image format compatibility may go wrong in the back of people.

It is not supposed to check that results are reproducible between compilers, but a good-enough level of reproducibility helps to have a single known checksum for every compiler/system/platform. Reproducibility makes the checking process easier and far less fragile.

One thing this also does is to check that multiple storage variants of the same image source (like the TGA orientation collection) produces the same results. For that we may not need an hardcoded checksum, we can just check that all the images from the same collection produces the same file, but an hardcoded checksum also works well.

I'm skeptical of the testing methodology with a bunch of hard-coded hashes of expected results.

I'm also not very happy with it, for example I would prefer to load the file, decode the RGBA stream and hash the RGBA stream itself, not the whole file.

This would be doable for PNG/Jpeg/TGA/BMP with Python Pillow but even Python Pillow cannot be trusted with PNG, and for CRN/DDS this is not doable without using Crunch itself to convert back to something Pillow supports (like TGA) first, and we would need to not only extract the full-size image, but also all the mipmaps.

Bugs like this may come from bad mipmaps:

Checking the whole file is currently the simpler implementation for checking the whole.

Having a more complex code able to validate the RGBA streams themselves would indeed avoid to fail the checksum because of LZMA producing a different compressed bytestream, or when updating some metadata like in #35.

slipher commented 4 months ago

I think we could make a change detector without having to hard-code hashes. What if for pull requests we also build the crunch tool at the last common commit between master and the pull request? Then the two files could be compared.

illwieckz commented 4 months ago

It would double the CI build time, I prefer to update the checksum when it is needed than to wait twice the time on every pull requests.

What we may implement is a simple checksum file we can regenerate in one go.

illwieckz commented 4 months ago

So, I implemented the idea of dumping a checksum file and check against it.

It happened that by doing so I was able to test the checksum for all the files, and then I discovered that “jpeg to dds” is not reproducible. When running a GCC build on Ubuntu 24.04 or on 22.04 chroot on 24.04 on my own Zen 2 amd64, I don't get the same result for jpg to dds that I get with GCC on Ubuntu 22.04 on Azure amd64… I don't know why yet but I give up on that for now.

So for now the exhaustive file check is disabled, what is still done is the clone check: for every command that should produce the same file (like png to crn and bmp to crn, or bottom-left tga to crn and top-right tga to crn), it is checked that all the clones produce the same checksum. This is enough to verify that image orientation is properly implemented.

illwieckz commented 4 months ago

So I discovered that GCC defaults on some -std=gnu*, and when not using -std=c*, GCC uses -ffp-contract=fast even when -ffast-math is disabled, so GCC uses -ffp-contract=fast by default… This is what made the generated files not reproducible.

Now CMake also disables FP contraction whith USE_FAST_MATH=OFF.

I enabled the file checksuming, it now works everywhere.

If needed (for example when committing a change changing some format on purpose), the checksum database can be re-generated this way:

test/test.py --record
illwieckz commented 4 months ago

Hmm, I still have a mistmatch on jpg-to-dds, converting unvanquished_64.jpg to unvanquished_64.dds produces this on all CI (Azure/Appveyor/MSVC/GCC/MinGW/Clang/AppleClang/Windows/Linux/macOS/amd64/arm64/i686…):

But on my end I got this whatever the compiler:

What's weird is that it's always one on CI, and always another one on my end.

illwieckz commented 4 months ago

The only 8 bytes differences in a 2872 bytes file:

20240717-003242-000 vbindiff-jpg-to-dds

illwieckz commented 4 months ago

So I disabled exhaustive file checking again, I only left clone checking enabled.

More research may be done in another time in another PR, there is no need to postpone merging what is already working.

illwieckz commented 4 months ago

I finally reproduced the jpg-to-dds mismatch!

No  c1b3b785d1 build/test/jpg-to-all/unvanquished_64.dds

I thought that such mismatch was insane:

Host System Compiler Architecture blake2 sum
🌍️ Appveyor Windows MSVC 16 amd64 ✅️ c1b3b785d1
🌍️ Appveyor Windows MSVC 16 i686 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 20.04 GCC 8 amd64 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 20.04 Clang 11 amd4 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 20.04 GCC 9 i686 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 20.04 GCC 9 arm64 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 22.04 MinGW 10 amd64 ✅️ c1b3b785d1
🌍️ Azure Ubuntu 22.04 MinGW 10 i686 ✅️ c1b3b785d1
🌍️ Azure macOS 12 AppleClang 14 amd64 ✅️ c1b3b785d1
🏡️ Home Ubuntu 24.04 GCC 13 amd64 ❌️ e331afe1f1
🏡️ Home Ubuntu 24.04 Clang 16 amd64 ❌️ e331afe1f1
🏡️ Home Ubuntu 24.04 Clang 18 amd64 ❌️ e331afe1f1
🏡️ Home Ubuntu 20.04 GCC 9 arm64 ❌️ e331afe1f1
🏡️ Home Ubuntu 20.04 GCC 9 armhf ❌️ e331afe1f1
🏡️ Home Ubuntu 24.04 MinGW 13 amd64 ❌️ e331afe1f1
🏡️ Home macOS 10 AppleClang 10 amd64 ❌️ e331afe1f1
🏡️ Home FreeBSD 13.3 Clang 17 amd64 ❌️ e331afe1f1

The only difference I could identify was that it was running on my end, and the hypothesis of my home being subject to a spell was not realistic, so I started to check the differences between the CI servers and my own system, looking at environment variables, etc.

Then I found the difference… My computer has 32 threads. Those CI systems provide no more than 4 threads (If I'm right: 2 on AppVeyor and 4 on Azure).

As soon as I do crunch -helperThreads 3 I reproduce the same checksum as the CI on my end!

illwieckz commented 4 months ago

So, it looks like all the options to make image conversion reproducible in a CI are there, I then enabled the exhaustive file check.

slipher commented 4 months ago

LGTM