Open bend-n opened 1 year ago
there is a zipreader - why cant there by a tarreader? also making a gdextension for something so small seems like overkill.
Neither of these is a good argument to add something to the core, especially something relying on a 3rd party dependency. You can indeed use the OS execute flow, or implement a native module if you need more direct integration. There is no need for this in the engine to justify this, as far as I can see.
The OS execute flow requires that tar
be on the system, usable, and accepting the same arguments. It is not reliable.
As for the second part - theres nothing that i can say, except its much easier to use a core module than a native module.
The OS execute flow requires that
tar
be on the system, usable, and accepting the same arguments. It is not reliable.
That's even more reason not to include this in core: it's not guaranteed to be present in the system, so we must bundle it with the engine. If there is not reason for it in the engine, and there is no compelling argument for it in many projects made with the engine, then it's just bloat aimed to solve an issue in your specific project. That's where plugins, modules, extensions, and custom engine builds should be used.
As for the second part - theres nothing that i can say, except its much easier to use a core module than a native module.
The ZIP module is included because the engine itself uses it. If it wasn't the case, it would be unlikely to be included (though not impossible). We don't add stuff to the core just because it would be nice to have. That's strictly against the principles on which the project is improved.
Indeed, there are a lot of archive formats, it makes no sense to add them all to Godot, since Godot is not an archiver program. zip is one of the most popular and widespread formats, which is supported by most archivers, operating systems, and services.
tar
is preinstalled on most Linux distributions, macOS and on Windows 10 or later. You can OS.execute()
it with usually the same CLI arguments, at least for simple invocations (Linux uses GNU tar, macOS and Windows use BSD tar).
The tar-reader does not interact with the engine, so maybe just include a compiled tar-reader beside the exported executable.
If you need to use the reader at multiple times in the code, a wrapper script would abstract the underlaying implementation
# not tested, just as a concept
class_name TARReader
extends RefCounted # if we want to mirror the behavior of FileAccess
var path: String = ""
var temporary: bool = false
func open(file_path: String, out_dir: String = "") -> Error # maybe '-> int' is correct
var err := FileAccess.exist(file_path)
if err != OK:
return err
if out_dir.is_empty():
out_dir= "user://" + tar_path + "/" # I hope this is a valid directory
temporary = true
err = OS.execute(
"tar",
[
"xzf",
ProjectSettings.globalize_path(file_path),
"--directory",
ProjectSettings.globalize_path(out_dir),
],
output,
)
if err != OK:
return err
path = out_path
return OK
func get_files() -> PackedStringArray:
var dir_access = DirAccess.open(path)
return dir_access.get_files()
func _notification(what: int) -> void:
match what:
NOTIFICATION_PREDELETE:
if temporary:
DirAccess.remove_absolute(path)
To make this work in godot core, we'd need to find a library that implements tar and add it to godot. The library implementing tar should be small and MIT-license compatible.
TARReader cannot be an external executable to be mergable in core.
I haven't researched but not sure the difficult of the tar format to implement since it was made decades ago and it doesn't do compression. It probably is limited in code complexity.
Your team might be able to create a native C++ module, but the problem is that as far as I can tell you want this to work on plain Godot Engine releases which won't be compatible.
Using a c++ gdextension creates a chicken and egg problem where you need a loader to load the loader.
@ifire Im not sure what you mean about the loader to load the loader, does gdextension require a non builtin loader?
You'll need to fetch the gdextension via an asset library or via a zip http download. You can't store it as a .tar.gz
@fire you cant just include the gdextension in the plugin? I see. tis quite the conundrum then.
The problem with .tgz as a format is it's compressed entirely, not per-file, so you cannot random access files. I'm a little unsure how the API deals with the decompression part of TARReader
In case you're interested, I also have a pure-GDScript implementation here for .unitypackage import since those are .tar.gz internally https://github.com/V-Sekai/unidot_importer/blob/main/tarfile.gd
Unfortunately Godot does not currently have streamable decompression APIs, so I decompress to memory for small files, but large files are forced to use the tar
command. This code is here: https://github.com/V-Sekai/unidot_importer/blob/main/unitypackagefile.gd#L47
(Note the whole situation with Windows and needing a --force-local
due to archaic tar command behavior and Windows paths containing colons.)
@lyuma it would seem that more people than one would benefit from the TARReader
, then?
Quite nice though, will look into it.
Possibly, but it's very easy to write your own.
The question that I would need an answer to is how the implementation of TARReader would actually work.
for f in reader.get_files():
var contents = reader.open(f)
If you did this, it could be very inefficient for compressed tar files.
The only options I would see that make sense are
reader.open(f)
, it just gives all files one by one, in whatever order they are on disk.microtar has the open()
function...
Although it doesnt have a list files function...
I like your second idea though, so would it be like,
var tar = TARReader.new()
for f in tar.open("res://file.tar"):
print("contents of %s: %s" % f)
?
To be clear, this isn't a question of API, but a question of how the compressed format itself works.
The entire tar archive is compressed all as a single blob, so accessing a file at the end requires reading through and decompressing the whole file again, for every file you open.
microtar doesn't support compressed tar files, so they don't have to worry about this. This issue is specifically a problem with the compressed tar files (.tgz)
tgz
is a synonym for .tar.gz
, which is just a .tar
file compressed with GZip. Godot already contains native support for GZip decompression, through PackedByteArray.decompress
with compression_mode
set to COMPRESSION_GZIP
.
That means you only need to extract the uncompressed tar
file. This is actually trivial, since files inside a tar
are stored flat, without so much as being divided into packets, so it's simply a matter of reading the header to get the file's size (and, hence, location of the next file), then using PackedByteArray.slice
So, TL;DR it's possible to do this in a few lines of GDScript without using any external programs. There's absolutely no reason to add any functionality in the core for this.
It would be trivial to add a core TARReader
, then, if its so trivial to do in GDScript
, which would garner much more performance?
It would be trivial to add a core
TARReader
, then, if its so trivial to do inGDScript
, which would garner much more performance?
The decompression (where most CPU time is spent) is performed in C++, while the glue code reading the header isn't performance-critical.
@SlugFiller Could you send a working example of this gdscript tarreader?
@bend-n Godot 3 version:
func untargz(source_file: String) -> Array:
# Read file into a buffer
var file : File = File.new()
if file.open(source_file, File.READ) != OK:
return null
var buffer : PoolByteArray = file.get_buffer(file.get_len())
file.close()
# UnGZip
buffer = buffer.decompress_dynamic(-1, PoolByteArray.COMPRESSION_GZIP)
# Untar
var ret : Array
var offset : int
while offset+512 < buffer.size():
# Get filesize from header
var filesize : int = _octal_number_get(buffer.slice(offset+124))
if filesize < 0 || offset+512+filesize > buffer.size():
break
if filesize == 0:
# Check for two empty records marking the end of the archive
# This is a relatively expensive operation (for GDScript),
# but will generally happen only once in the whole archive
if _check_all_zeros(buffer.slice(offset, offset+512)):
if offset+1024 > buffer.size() || !_check_all_zeros(buffer.slice(offset+512, offset+1024)):
offset += 512
continue
break
var entry : FileEntry = FileEntry.new()
# Get filename from header
entry.name = _trim_null(buffer.slice(offset, offset+100)).get_string_from_utf8()
# File contents come immediately after the header
entry.contents = buffer.slice(offset+512, offset+512+filesize)
ret.push_back(entry)
# Round to the nearest 512 byte block
offset += 512+filesize+(511-((filesize+511) & 0x1FF))
return ret
func _octal_number_get(input: PoolByteArray) -> int:
var output : int = 0
for offset in input.size():
if input[offset] == 0:
return output
if input[offset] < 48 || input[offset] > 55:
return -1
output *= 8
output += input[offset]-48
return output
func _trim_null(input: PoolByteArray) -> PoolByteArray:
var end : int = input.size()
while end > 0 && input[end-1] == 0:
end -= 1
return input.slice(0, end)
func _check_all_zeros(input: PoolByteArray) -> bool:
for offset in input.size():
if input[offset] != 0:
return false
return true
class FileEntry:
var name : String
var contents : PoolByteArray
Godot 4 version:
func untargz(source_file: String) -> Array[FileEntry]:
# Read file into a buffer
var file : FileAccess = FileAccess.open(source_file, FileAccess.READ)
if file == null:
return null
var buffer : PackedByteArray = file.get_buffer(file.get_length())
file = null
# UnGZip
buffer = buffer.decompress_dynamic(-1, PackedByteArray.COMPRESSION_GZIP)
var ret : Array[FileEntry]
var offset : int
while offset+512 < buffer.size():
# Get filesize from header
var filesize : int = _octal_number_get(buffer.slice(offset+124))
if filesize < 0 || offset+512+filesize > buffer.size():
break
if filesize == 0:
# Check for two empty records marking the end of the archive
# This is a relatively expensive operation (for GDScript),
# but will generally happen only once in the whole archive
if _check_all_zeros(buffer.slice(offset, offset+512)):
if offset+1024 > buffer.size() || !_check_all_zeros(buffer.slice(offset+512, offset+1024)):
offset += 512
continue
break
var entry : FileEntry = FileEntry.new()
# Get filename from header
entry.name = _trim_null(buffer.slice(offset, offset+100)).get_string_from_utf8()
# File contents come immediately after the header
entry.contents = buffer.slice(offset+512, offset+512+filesize)
ret.push_back(entry)
# Round to the nearest 512 byte block
offset += 512+filesize+(511-((filesize+511) & 0x1FF))
return ret
func _octal_number_get(input: PackedByteArray) -> int:
var output : int = 0
for offset in input.size():
if input[offset] == 0:
return output
if input[offset] < 48 || input[offset] > 55:
return -1
output *= 8
output += input[offset]-48
return output
func _trim_null(input: PackedByteArray) -> PackedByteArray:
var end : int = input.size()
while end > 0 && input[end-1] == 0:
end -= 1
return input.slice(0, end)
func _check_all_zeros(input: PackedByteArray) -> bool:
for offset in input.size():
if input[offset] != 0:
return false
return true
class FileEntry:
var name : String
var contents : PackedByteArray
Describe the project you are working on
A package manager that works via npm
Describe the problem or limitation you are having in your project
Cant extract tarballs natively
Describe the feature / enhancement and how it helps to overcome the problem or limitation
TARReader
as a sibling ofZIPReader
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
If this enhancement will not be used often, can it be worked around with a few lines of script?
sort of
Is there a reason why this should be core and not an add-on in the asset library?
there is a zipreader - why cant there by a tarreader? also making a gdextension for something so small seems like overkill.