Akira1San opened 3 years ago
See also https://github.com/godotengine/godot-proposals/issues/1197.
The current plan is to add a special kind of texture that can be streamed, but it will not be usable everywhere Texture2D is for technical reasons. This means its use will be constrained to 3D rendering (and may not be usable for texture arrays).
Due to time constraints, texture streaming will have to wait for a future 4.x release and won't be in 4.0.
I am wondering if JPEG XL wouldn't be an excellent format for texture streaming - it supports tiling and progressive streaming, which I think would be ideal for large splatmaps for open world games. Also for progressively streaming higher res textures for models as they come closer to camera etc.
It could also be used to store regular textures and allow loading them at lower resolutions to save GPU memory on underspecced machines. Right now the only way to use smaller textures is by supplying them via a package (argh) or loading full res and using MIPs (aaaarghhh) :D
Existing JPEG images can be losslessly recompressed into JXL, improving compression ratio at no quality loss as well. I am not sure about decompression speed, memory use etc but I assume they are optimized for that as a delivery format that is supposed to work on anything.
Format overview: https://jpeg.org/jpegxl/
Reference implementation (C++, BSD 3-clause license): https://github.com/libjxl/libjxl
There's also JPEG XR that seems even more flexible, worth evaluating I guess. https://jpeg.org/jpegxr/index.html
@unfa You want to store your textures in a compressed format that GPUs can natively work with. So BCn on PC and ASTC on mobile. I don't think JPEG XL supports those.
Ah, I understand. Thanks!
More than 2 years have already passed since this proposal and no work has even been undertaken.
I would like to remind everyone that since the Unity shenanigans, donations to Godot have doubled and the project is now receiving more than 50k euros per month, and that's not counting individual donors and additional funding from W4 Games.
Are you okay with the management there? 2 YEARS
I understand the reply will be that proposals are accepted for consideration if there is demand for them, but texture streaming is a fundamental system for managing the most memory-hungry resource of all: textures. There is not much visible demand for this proposal only because most users don't understand how important it is, or how much it would improve the performance of their games if they could finally unload large textures dynamically in real time.
Anyone able to work on this is free to do so 🙂, if no one is taking this on how do you propose we do it?
We can't blame the Godot team for not implementing this feature in the past 2 years, because they were busy creating Godot 4.0. I, however, also feel like this is an essential feature for scalability in 3D games.
Clay John had a few slides about asset streaming in his presentation ("The future of rendering in Godot") at GodotCon 2023. So this feature hasn't flown under the radar!
Anyone able to work on this is free to do so 🙂, if no one is taking this on how do you propose we do it?
Hoping for years that someone will suddenly start working on it for free? I specifically included a screenshot of their monthly donations, they have the resources to hire maintainers to solve the two-year proposals that the engine needs.
Just throwing money at a problem isn't a sustainable solution...
There are many areas that require special focus and resources, plus expenses that already exist, and this project relies almost entirely on volunteer work as any open source project like this does.
Just throwing money at a problem isn't a sustainable solution...
the phrase sounds nice, but in this case it would have actually worked
but in this case it would have actually worked
And many other problems as well, like compiled GDScript to obfuscate output, further platform support, physics bugs, etc., etc.
How fast would that extra money be spent?
In any case all of that is off topic, let's stay on topic 🙂
and this project relies almost entirely on volunteer work as any open source project like this does
Now I have a real question: what do they even spend 50k a month on?
I'd suggest looking elsewhere for that; it's not on topic for this proposal, please stay on topic. You can check the foundation webpage and other sources.
This off topic distraction does nothing to help this proposal get implemented
Edit: you phrased it far better than I could; I'll leave it here as a closing remark on the discussion.
EDIT: I'm sorry for another off topic comment here, I can delete or move elsewhere if need be.
Note that Godot was in a pretty bad financial state before the Unity news hit the fan.
Because there's simply not enough hands on board to take care of everything at once. Gathering funds for Godot and starting up W4 games, and releasing Godot 4 and preparing GDC presentations and GodotCon... all of that is a lot to manage in a project.
Sure, maybe it's possible to manage the work better - I don't know how to do that, nor do I have the resources to help with it - maybe you do?
Also - hiring developers is not just a "throw money at it" problem. You need to hire the right people, ensure they'll be comfortable working on whatever you have for them, make sure they have things to do within their expertise, and you need to onboard them into the codebase, teach them the coding style, introduce them to other developers... It's not like clicking an icon in an RTS to make more people build the thing faster...
At the same time there's a lot of things being worked on, and a lot of community work to manage.
If you ask 100 Godot users, you'll get 100 different answers to the question "what should the Godot team focus on next?". Game engines are among the most complex and multidisciplinary pieces of software out there - there's an insane number of moving parts, and Godot has work being done on pretty much all of them - but that can't happen all at once.
Context switching between 50 tasks every day is gonna have a paralyzing overhead for a developer too, so they need time to focus on a small number of things and finish these up before taking on more.
Be patient, and remember that anybody can contribute!
PS: Reminds me of this joke:
A manager is someone who believes 9 women can deliver a baby in 1 month
other than lack of manpower, what is the obstacle to this being implemented?
other than lack of manpower, what is the obstacle to this being implemented?
There is no obstacle other than lack of time. The people who are willing to work on this are busy with other tasks right now.
edit: the author of Wicked Engine provided a great breakdown of how he implemented texture streaming, a lot of the details should work well in Godot too https://wickedengine.net/2024/06/texture-streaming/
Is this being covered by The Forge's improvements?
Is this being covered by The Forge's improvements?
No, the collaboration is completed. Everything that could be submitted has been submitted already.
Support for texture streaming is still planned, but not for 4.3.
Phooey. Well, I'll be patient then, since I don't know Vulkan well enough to provide any assistance.
BC7 texture compression with Sparse Virtual Textures (aka Megatexture) is the way to go, and it's the AAA route.
Sparse Virtual Texture Introduction
In a sprawling, open-world video game, the usual practice involves loading one or more physical textures per game object into the memory, or VRAM, and binding them all before a draw call. This process results in additional overhead because it requires retrieving a large amount of data from memory to the graphics processing unit (GPU), which in turn leads to significant VRAM usage.
Virtual texturing seeks to resolve this challenge by constructing a large virtual texture memory allocation that contains data for the entire world on a disk drive. Pagination divides the virtual texture into small chunks called tiles (pages), loading only the essential textures into physical memory and unloading those that are not required. The CPU creates a virtual address for the virtual texture memory allocation and translates it into a physical address for the physical memory. This involves mapping the virtual address to a corresponding physical address through a page table structure. The number of tiles (pages) allocated to the virtual memory needs to match the number of tiles (pages) required by the physical memory.
We store the texture in virtual memory and divide it into several sections known as tiles. We organize and identify these tiles, or pages, with white lines. The CPU sends the required tiles divided into blocks, indicated by a red box, along with their virtual addresses, through a page table, then translates these virtual addresses into physical addresses to fill the physical memory, or VRAM, and begins the loading process.
When starting the process, we need to map the number of tiles (pages) linearly to the number of entries in the page table. For example, if we have 64 tiles (pages), we need to map them to 64 entries in the page table for translation. We can have textures with a dimension of 64x64 per tile, which also matches BC7 compression.
CPU mechanism: Consider a scenario where the total memory address is 13680. The CPU divides this memory address into a page-aligned value and a remainder. Subsequently, the CPU divides the aligned value by the page size to obtain the index from the page table. The CPU proceeds to access the page table entry at this index for the updated aligned address, and then combines the remainder with the aligned address to yield the physical address sum.
"Acyclic Graph" has the potential to elevate the efficiency of translation.
I came across some online examples of people putting it into action, so if that would be useful, I'm here to support anyone who's willing to give it a shot.
this looks like the virtual shadow map paper, maybe too demanding for mobile related https://ktstephano.github.io/rendering/stratusgfx/svsm
this looks like the virtual shadow map paper, maybe too demanding for mobile related https://ktstephano.github.io/rendering/stratusgfx/svsm
Thanks, I thought I uploaded the link.
I realized recently that Juan wrote up a technical proposal last year and it hasn't been shared widely yet. So here is the text of his proposal for reference.
Texture streaming is a strong requirement for loading large game production scenes. Opening large game scenes in Godot without this would take forever and risk running out of memory since most of the high quality content nowadays relies on this being available in game engines.
There are several ways to implement texture streaming. Vulkan supports the sparse textures extension, but it is known to perform poorly on PC.
The most common and straightforward way to implement texture streaming is with a persistent pool of various texture array sizes. As an example, a pool could exist as this collection of texture arrays, compressed as either DXT5 or BC7 (depending on settings):
Texture Arrays:
This is basically a pool of texture data that is around 1.5 GB in size (meaning it is compatible with most GPUs and iGPUs nowadays).
The general idea with streaming is that, in practice, when rendering a large scene most textures are not read at their largest mipmap resolution. In fact, most are only rendered at the smallest ones, with only the ones very close to the camera using the full resolution.
As such, the main idea is that when a texture that will be used in streaming is loaded, only its smallest resolution version is loaded (as per the example above, 128x128). Then, every frame, the engine detects which size would be required to render each texture optimally. If the optimal mipmap size is bigger than the currently loaded size, a larger version may need to be loaded.
To load a larger version, one must often determine whether another texture currently resident at a bigger size is a candidate for being downgraded to a smaller size, to make room for the new one.
This is done by checking:
Godot, hence, should support a special texture type and matching shader type called StreamedTexture2D. This texture type is, unfortunately, not compatible with Texture2D. As much as I would like that to be possible, I don't think there really is any way to reconcile this.
StreamedTexture2D is a special resource type; textures should be imported as this special type, and they will always (depending on the pool setting) be compressed to either DXT5/BC7 (desktop) or ETC2A/ASTC (mobile).
The internal file format can be the same as the one in CompressedTexture2D or very similar and, of course, the import process is a simplification of it.
The shader compiler needs to add this new texture type, sampler2DStreamed. Example usage:
```glsl
uniform sampler2DStreamed albedo_tex : hint_albedo;

void fragment() {
	ALBEDO = texture(albedo_tex, UV);
}
```
There is, however, a strong underlying difference between how the code is generated here and how the code is generated for regular textures, as this puts extra logic.
Under the hood, this would work somehow like this:
```glsl
// On the global GLSL shader scope
uniform texture2DArray texture_streaming_pool[MAX_TEXTURE_SIZES];

buffer TextureSlots {
	ivec2 slots[];
} texture_streaming_slots;

// On the material, instead of storing a texture2D, just a uint slot index is stored.

// When the texture is actually read:
ALBEDO = texture( albedo_tex, UV );
// becomes
albedo = texture( sampler2D( texture_streaming_pool[ texture_streaming_slots.slots[ material.albedo_tex ].x ], sampler ), vec3( UV, float( texture_streaming_slots.slots[ material.albedo_tex ].y ) ) );
```
This ensures that the texture is actually read from the pool properly, using the right size and index. The fact that an indirection is used via the texture_streaming_slots variable means that this texture can be moved between different sizes on the fly on demand. If the camera gets closer to a required higher mipmap, then the CPU can load it and move the texture without affecting any of the compiled materials.
The streaming logic requires that, each time the texture is read, a buffer with the maximum mipmap the texture used is updated, then this buffer is sent back to the CPU for analysis.
This can be implemented like this:
```glsl
// On the global scope

// cleared to 0xFFFFFFFF (meaning, unwritten) every frame
buffer StreamingTextureMipmapUsed {
	uint value[];
} streaming_texture_mipmap_used;

// When the texture is actually read:
ALBEDO = texture( albedo_tex, UV );
// Becomes
// (on the statement _before_ reading, this code is added)
{ // Inserted statement
	uint mipmap = uint( textureQueryLod( texture_max_size, UV ).y );
	atomicMin( streaming_texture_mipmap_used.value[ material.albedo_tex ], mipmap );
}
// Then the regular read
albedo = texture( sampler2D( texture_streaming_pool[ texture_streaming_slots.slots[ material.albedo_tex ].x ], sampler ), vec3( UV, float( texture_streaming_slots.slots[ material.albedo_tex ].y ) ) );
```
This ensures that after the reads, you can send streaming_texture_mipmap_used back to the CPU for analysis. Of course, as-is this code would be highly inefficient because it would put enormous memory pressure on that buffer. To offload this pressure, a couple of different things can be done:
Because of the algorithm, several restrictions need to be imposed on the usage of StreamedTexture2D:
It can't be used inside a for/while loop (this affects performance); or, if it is, a bool needs to be added to ensure that only the first read determines the size. It can't be used anywhere other than in a spatial shader's fragment function.
One of the main problems with this approach is that it requires separating between Texture2D and StreamedTexture2D, which are entirely incompatible (Can't use one in place of the other) for the reasons described before.
Artists would need to reimport their textures as StreamedTexture2D if they intend to use them like this, which is kind of a hassle, but unavoidable. At least with this PR, a large part of the hassle is removed.
These are some quality of life improvements that would be added:
Why are the vertex shaders barred from using these?
One of the main problems with this approach is that it requires separating between Texture2D and StreamedTexture2D, which are entirely incompatible (Can't use one in place of the other) for the reasons described before.
Has anyone looked into how other engines does this, UX-wise? Unity has a simple per-texture toggle that doesn't require you to change a texture's import type. Just enable it along with the global settings for mipmap streaming and bam, it just works (supposedly). I think this might be a better workflow.
@atirut-w I think this is because the Texture2D API is designed with the assumption that the texture is just an atomic preloaded resource, not a complex streamable object. It affects a big part of its API
Why are the vertex shaders barred from using these?
You can't automatically calculate an LOD from a vertex shader.
Has anyone looked into how other engines does this, UX-wise? Unity has a simple per-texture toggle that doesn't require you to change a texture's import type. Just enable it along with the global settings for mipmap streaming and bam, it just works (supposedly). I think this might be a better workflow.
In the doc you linked it says that for custom shaders you have to request the mip level manually. Plus you have to enable it in the import settings, just like what is suggested here.
So, just for clarity, the workflow in Unity is:
- Enable streaming in the project settings
- Enable streaming on the texture
- Re-import the texture
- Streaming works automatically for renderer component textures, but only with UV0
- In all other cases, you need to specify the mip level manually on the CPU
The proposed workflow in Godot above is:
- Enable streaming on the texture
- Re-import the texture
- Streaming works automatically in the Standardmaterial3D
- Streaming works automatically in custom shaders if you use `sampler2DStreamed`
The proposed Godot workflow specifically calls for changing the import type, which is different from a simple toggle. This also introduces some questions:
To add to the proposal, there should also be a setting to override streaming memory budget. By default, the engine would fully utilize all VRAM, but you would be able to set a custom limit.
- Are the old import settings preserved when switching to StreamedTexture2D?
Not currently, but this could be implemented in the editor by preserving properties that have the same name and type when switching import types.
Describe the project you are working on
A horror game with a little bit of open world.
Describe the problem or limitation you are having in your project
Not really a problem, but optimization is key here.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
Texture streaming is a very common feature in 3D engines today, and for good reasons. Texture streaming can provide some clear advantages for games which have massive amounts of textures to deal with. The two most well-known advantages of streaming are:
A streaming system can automatically keep only the necessary textures in memory to minimize the minimum VRAM requirements of any given scene, while still utilizing any leftover VRAM as a cache by only unloading textures when necessary. It also reduces load times, since the game can be played while textures are still being loaded in.
(text description taken from https://jvm-gaming.org/t/tutorial-stutter-free-texture-streaming-with-lwjgl/47661)
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
I'm guessing the user would change the texture's import settings to a streamed texture type and set its size.
If this enhancement will not be used often, can it be worked around with a few lines of script?
It would be used by something like 60-90% of game projects.
Is there a reason why this should be core and not an add-on in the asset library?
It's core!