Open cansik opened 2 years ago
The current GLB format has several well-known shortcomings, and the 4 GiB size limitation is one of them. This will likely be addressed in future container format revisions, be it GLBv3 or something completely new.
/cc @emackey @javagl
@cansik Is there a benefit to storing > 4GB in a single file? It's a heavy lift in that form.
One possibility is to use *.gltf (not GLB) with multiple buffers, such that each buffer can be a more readable size.
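A minimal sketch of the multi-buffer idea, assuming the binary payload is simply cut into fixed-size pieces. The function name `split_into_buffers` and the `dataN.bin` file naming are made up for illustration; only the `buffers`, `uri`, and `byteLength` properties come from glTF 2.0. Since a bufferView refers to exactly one buffer, a real exporter would have to cut at bufferView boundaries rather than at arbitrary byte offsets.

```python
import json


def split_into_buffers(total_bytes, max_buffer_bytes, stem="data"):
    """Return glTF-style buffer entries that together cover total_bytes.

    Each entry is at most max_buffer_bytes long. The file naming scheme
    (stem + index + ".bin") is a hypothetical convention, not part of glTF.
    """
    buffers = []
    offset = 0
    while offset < total_bytes:
        size = min(max_buffer_bytes, total_bytes - offset)
        buffers.append({"uri": f"{stem}{len(buffers)}.bin", "byteLength": size})
        offset += size
    return buffers


# Small numbers so the example is easy to follow; real buffers would be far larger.
gltf = {"asset": {"version": "2.0"}, "buffers": split_into_buffers(10_000, 4_096)}
print(json.dumps(gltf, indent=2))
```

Each `.bin` file stays well below the 32-bit limit, and the `.gltf` JSON only grows by one small entry per buffer.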
Some systems such as 3D Tiles break up their dataset into a hierarchy, with lots of self-contained GLBs storing sections of the data, along with a folder system that makes it easy to find and download-on-demand a couple of small GLBs representing what the user is looking at in the moment.
The limitation itself might be a tribute to glTF's origin as a transmission format: a >4GB file will likely not be transferred to the client as a single, huge blob. So I'd hesitate to say that glTF is the right "layer" to address this, but might be convinced otherwise. In any case, this cannot easily be changed in a backward-compatible way, except by defining the GLB version to be 3 and storing the length somewhere else.
A container format could alleviate the problem. And as long as there is no dedicated container format above glTF, there could even be very use-case-specific solutions. A "mesh sequence" could be many things (and it sounds like something that clients will usually not know what to do with, unless there is an extension for it), but maybe some JSON file with `{ meshSequence: [ "0.glb", "1.glb"...] }` could be a first shot...
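As a rough illustration of that first shot, the following sketch writes and reads such a manifest. The `meshSequence` key is the one floated in the comment above; it is not part of glTF or any extension, and the helper names are invented here.

```python
import json


def build_mesh_sequence_manifest(frame_count):
    """Serialize a hypothetical manifest listing one GLB file per frame."""
    manifest = {"meshSequence": [f"{i}.glb" for i in range(frame_count)]}
    return json.dumps(manifest)


def frame_uri(manifest_json, frame):
    """Look up the GLB file name for a given frame index."""
    return json.loads(manifest_json)["meshSequence"][frame]


manifest_json = build_mesh_sequence_manifest(3)
print(manifest_json)
print(frame_uri(manifest_json, 1))
```

A client would then fetch only the GLBs for the frames it needs, so no single file has to approach the 4 GiB limit.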
An aside:
> One possibility is to use *.gltf (not GLB)
I also prefer the idea of a GLB being "self-contained", but I think it is not disallowed by the spec for a GLB to refer to external resources (and cannot remember that it has been explicitly discouraged anywhere - maybe somewhere hidden in https://github.com/KhronosGroup/glTF/issues/1117 ...?).
> I also prefer the idea of a GLB being "self-contained", but ... cannot remember that it has been explicitly discouraged anywhere
I'd summarize #1117 as: _public user-facing tools should try to provide `.gltf` with external resources AND/OR `.glb` with embedded resources_. Other arrangements (`.gltf` with embedded resources or `.glb` with external resources) are allowed and valid where appropriate for a project, but I think it's better for most users if these don't proliferate in the 3D ecosystem.
FWIW, I asked about self-contained glb a while back and there is a valid reason for using external resources with a glb for servers.
Question: https://github.com/KhronosGroup/glTF/issues/828#issuecomment-277314076 Response: https://github.com/KhronosGroup/glTF/issues/828#issuecomment-277474901
As a workaround, I would suggest using zip as a plain container, and maybe the extension .GLZ could be used to distinguish it from GLB.
This would allow limitless file sizes, and clients and tools would easily adopt it. The worst-case scenario could be resolved by unzipping to a directory.
The recommendation would be to keep using GLB for transmission scenarios and GLZ for large files, backups, and intermediate workflows.
A plain ZIP (as an archive) could have some issues because it does not allow random access. There are approaches for extending ZIP files with sorts of "indices" (basically: A file that is always stored as the last entry in the ZIP, and stores a mapping of "file name" to "byte offset in the ZIP"), but no real 'standard' for that, as far as I know.
> it does not allow random access
That's not completely true. The USDZ format, which is a glTF competitor, uses a ZIP file with the restriction of forbidding file compression, which results in a plain file with a TOC and randomly accessible files.
Krita .KRA files, and OpenRaster .ORA use a similar approach: the entries in the ZIP must be stored uncompressed to allow random access. So to some degree, we could say that uncompressed ZIPs are becoming a thing.
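To make the "uncompressed ZIP" idea concrete, here is a minimal sketch using Python's standard `zipfile` module. The helper `stored_entry_range` is invented for this example; it assumes the archive stores the entry with `ZIP_STORED` and uses the standard 30-byte local file header layout to find where the raw bytes begin.

```python
import io
import struct
import zipfile


def stored_entry_range(archive, name):
    """Return (offset, size) of an uncompressed entry's raw bytes in the archive.

    archive is a seekable binary file object. Only ZIP_STORED entries qualify,
    because compressed entries cannot be read as a plain byte range.
    """
    with zipfile.ZipFile(archive) as zf:
        info = zf.getinfo(name)  # looked up via the central directory
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError(f"{name} is compressed; raw ranges need ZIP_STORED")
        header_offset, size = info.header_offset, info.file_size
    # The local file header is 30 fixed bytes followed by the file name and
    # the extra field; their lengths sit at byte offsets 26 and 28.
    archive.seek(header_offset)
    header = archive.read(30)
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return header_offset + 30 + name_len + extra_len, size


# Build a tiny archive in memory with compression disabled.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("scene.gltf", b'{"asset": {"version": "2.0"}}')
    zf.writestr("mesh.bin", b"\x00\x01\x02\x03")

offset, size = stored_entry_range(buf, "mesh.bin")
buf.seek(offset)
payload = buf.read(size)  # raw bytes read straight from the archive
```

With the offset and size in hand, a consumer could memory-map the archive or issue an HTTP range request for just that entry, without going through a ZIP library for the actual read.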
But it all depends on how you are going to consume the files. My SharpGLTF library already supports zipped glTFs, and I don't see any problem handling compressed ZIPs.
> a ZIP file with the restriction of forbidding file compression, which results in a plain file with a TOC and randomly accessible files.
Yes, this could be an option. There's still the small caveat that on the consuming side, people will usually have to invest some effort there: I think that most "ZIP libraries" (as 'common libraries for zip handling in different programming languages') tend to hide that as an abstraction, and only offer functionalities like "iterating over all entries", or "looking up a certain entry (with a linear search under the hood)".
More technically/specifically: These libraries do not necessarily provide the low-level access to the ZIP central directory that would allow constant-time lookups.
(All this does not prevent this approach, but should be kept in mind when making this an integral part of a specification)
The ZIP file format has a table of contents with offsets and sizes that is read before accessing the rest of the content. I know a few ZIP libraries, and all of them give you random access to the contents, even when compressed.
In fact, random access is available in most archive formats, the only ones not supporting random access are Tar.GZ and Rar/7z when compressed as solid archives.
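As one data point for the library question, Python's standard `zipfile` module reads the central directory when the archive is opened and then retrieves a single entry by name, compressed or not. The sketch below assumes the `zlib` module is available for `ZIP_DEFLATED`; the helper name `read_entry` is invented here.

```python
import io
import zipfile


def read_entry(archive, name):
    """Read one entry by name without extracting the other entries."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)


# Build a small compressed archive in memory with three stand-in "GLB" entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for i in range(3):
        zf.writestr(f"{i}.glb", bytes([i]) * 1024)

data = read_entry(buf, "2.glb")
print(len(data))
```

This is random access at the granularity of whole entries. Reading a byte range inside a deflated entry still generally means decompressing that entry from its start, which is where the stored-only restriction discussed earlier makes a difference.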
In addition to my proposal of a zipped version, I had the opportunity to tinker with OpenRaster and some other Zip based documents recently, so, I wanted to ask:
Would it be fine to open a new issue with a proposal for GLZ, or is it better to keep the discussion here?
The GLB file size is currently limited to `2^32` bytes (approximately `4 GB`), which could be limiting for some applications. For example, we would like to store a mesh sequence in the glTF format, which quickly exceeds the `2^32` byte limit.
It is understandable to limit the individual chunk size to `2^32`, but limiting the total file size to `2^32` seems to restrict future applications. Is there a technical limitation to only support datatypes with the size of an `unsigned int`? Or is there a workaround for this limitation that integrates everything into a single file?
Here is the current binary protocol definition of a GLB file (source: Binary glTF Layout).
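For readers without the layout diagram at hand, here is a minimal sketch of the 12-byte GLB header as described in the glTF 2.0 specification: `magic`, `version`, and `length`, each a little-endian `uint32`. The function names are made up for this example. Because `length` is the total file size in bytes, it cannot describe a file of `2^32` bytes or more.

```python
import struct

GLB_MAGIC = 0x46546C67  # the ASCII bytes "glTF" when written little-endian
GLB_VERSION = 2
UINT32_MAX = 2**32 - 1


def pack_glb_header(total_length):
    """Pack the 12-byte GLB header: magic, version, total file length."""
    if not 0 <= total_length <= UINT32_MAX:
        raise ValueError(f"GLB length {total_length} does not fit in a uint32")
    return struct.pack("<III", GLB_MAGIC, GLB_VERSION, total_length)


def unpack_glb_header(header):
    """Return (version, length) from the first 12 bytes of a GLB file."""
    magic, version, length = struct.unpack("<III", header[:12])
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    return version, length


header = pack_glb_header(1_000_000)
print(header[:4], unpack_glb_header(header))
```

Calling `pack_glb_header(2**32)` raises `ValueError`, which is the limit the question is about: the restriction comes from the header field width, not from the individual chunks alone.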