Open hhyyrylainen opened 2 weeks ago
As noted in #97360 (which turned out to be a duplicate of this issue), the non-determinism stems from this piece of code:
It is non-deterministic because the ID produced by Resource::generate_scene_unique_id is based on timestamps, as seen here:
From what I understand after chatting with @reduz about this very briefly, this ID shouldn't be generated in the first place for stuff that ends up in the .godot folder, since their IDs are found in the *.import files anyway.
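To illustrate the mechanism, here is a toy Python model, not Godot's actual code. The names `timestamp_based_id` and `fake_import` and the `RSRC` header are hypothetical. It only shows that embedding a clock-derived ID in otherwise identical output changes the bytes, while a fixed ID (for example one read back from an existing file) does not.

```python
import time
from typing import Optional


def timestamp_based_id(now_ns: Optional[int] = None) -> str:
    """Toy stand-in for an ID derived from the clock (NOT Godot's real code)."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Keep the low 20 bits and print them as five hex digits.
    return format(now_ns & 0xFFFFF, "05x")


def fake_import(payload: bytes, resource_id: str) -> bytes:
    """Hypothetical importer: header, embedded ID, then the unchanged payload."""
    return b"RSRC" + resource_id.encode("ascii") + payload


payload = b"same source bytes"

# Two imports of the same payload at different (simulated) times.
first = fake_import(payload, timestamp_based_id(now_ns=1_000))
second = fake_import(payload, timestamp_based_id(now_ns=2_000))

# Two imports that reuse a fixed, already-known ID.
stable_a = fake_import(payload, "abcde")
stable_b = fake_import(payload, "abcde")

print(first == second)      # the embedded IDs differ, so the outputs differ
print(stable_a == stable_b)  # identical inputs and ID give identical output
```

In this model only the five ID bytes differ between `first` and `second`, which matches the pattern of a handful of changed bytes in an otherwise identical file.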
Tested versions
System information
Godot v4.3.stable.mono - Fedora Linux 40 (Workstation Edition) - Wayland - Vulkan (Forward+) - dedicated AMD Radeon RX 7900 XTX (RADV NAVI31) - AMD Ryzen 9 5950X 16-Core Processor (32 Threads)
Issue description
When certain kinds of files are imported into Godot, the imported data (in the .godot folder) has a hash that changes each time the file is imported. So even when the original file is not changed at all, Godot writes a generated file with a different hash. This is suboptimal because it increases delta-compressed build sizes (for example, Steam game updates). Unless this is intentional for these file types, it indicates that Godot writes some garbage / uninitialized bytes into the imported files, or that some other factor (slightly) randomizes the file contents.
This seems to happen to all .ttf and .ogg files in my project and to some .glb files. In the case of .glb it might depend on the contents of the file, as I'm not entirely sure that every one of our .glb files had its imported data change hashes. As a result, each CI build of my game now takes up several times more storage than it did a few months ago, when I was still using Godot 3. That is how I noticed this problem and started to investigate.
I have hundreds of asset files in total, and most of them don't experience this issue; it's just these 3 file types. For example, imported .png files are not affected and always come out with exactly the same hash.
Here's an example comparing one imported .ogg file to itself one import cycle later. It seems that only 9 bytes out of 12 KB are different (here they are, shown as the offset into the file followed by the differing bytes):
So the differences between the files are very minor, but they still result in different hashes. That means file identification based on hashes doesn't work, and delta compression is less effective because a few changed bytes here and there invalidate blocks of data that are otherwise exactly the same.
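A comparison like this can be reproduced with a short Python sketch. This is not the tool I used for the listing above; the function names and the 4096-byte block size are assumptions for illustration, and real delta compressors do not necessarily use fixed-size blocks.

```python
from pathlib import Path


def diff_bytes(old: bytes, new: bytes):
    """Return (offset, old_byte, new_byte) for each differing position.

    Only the common prefix length is compared; a length difference
    should be checked separately.
    """
    return [(off, a, b) for off, (a, b) in enumerate(zip(old, new)) if a != b]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Count fixed-size blocks that contain at least one differing byte."""
    length = max(len(old), len(new))
    return sum(
        old[start:start + block_size] != new[start:start + block_size]
        for start in range(0, length, block_size)
    )


def compare_files(path_a: str, path_b: str) -> None:
    """Print every differing offset with the two byte values in hex."""
    old, new = Path(path_a).read_bytes(), Path(path_b).read_bytes()
    for off, a, b in diff_bytes(old, new):
        print(f"{off:#08x}: {a:02x} -> {b:02x}")
    print(f"changed blocks: {changed_blocks(old, new)}")
```

Calling `compare_files` on two copies of the same imported file, saved before and after a reimport, lists the changed offsets and shows how many blocks a block-based delta would have to resend.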
Steps to reproduce
Compute hashes of the files in the .godot folder. I've provided a script that does this: ./check_import_hashes.rb
Delete the .godot folder (or maybe just the imported folder under it is enough).
After the project has been reimported, compute the hashes of the .godot folder one last time. The hashes are now different even though none of the input asset files were changed.
Here's an example output of running that script again at the last step:
editor_layout.cfg and uid_cache.bin changing seem fine to me, but I'd expect those asset files to be consistently imported with the same hash.
For good measure, doing the same steps again results in yet more hashes for all of the problematic files:
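For anyone who wants to repeat the check without the Ruby script, the hashing step can be sketched in Python as follows. This is not a port of check_import_hashes.rb; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(before, after):
    """List paths present in both snapshots whose hashes differ."""
    return sorted(p for p in before.keys() & after.keys() if before[p] != after[p])


# Intended use (paths are examples):
#   before = hash_tree(".godot")
#   ... delete the folder and let Godot reimport the project ...
#   after = hash_tree(".godot")
#   print("\n".join(changed_files(before, after)))
```

Files that exist in only one snapshot are ignored by `changed_files`, so the result lists only files whose contents changed between imports.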
Minimal reproduction project (MRP)
InconsistentImport.zip