godotengine / godot


Open compressed files that were not made by godot #28999

Open xelivous opened 5 years ago

xelivous commented 5 years ago

Godot version: 33897d9b5844aa0147d55841845427ed599d069f

OS/device including version: Linux 4.14.113-1-MANJARO

Issue description:

I was trying to load a gzipped JSON file from an external web server, and found out that I can't seem to find a way to do so in Godot without making a GDNative plugin. File has the method open_compressed(), which supports gzip, but the format it expects for gzip wraps the data in a container with this bizarre GCPF magic header.

I kind of expected that it would at the very least open compressed files that weren't made using the Godot editor, assuming encryption wasn't used, considering at the end of the day it's just a bunch of strings or whatever other bytes in there. I'm not sure why the magic header exists in the compression format, but ideally it would fall back to opening the file as plain compressed data whenever the magic header is missing.

If I ever need to edit a gzipped file that Godot has made, I can't just make a quick change with any other archive tool. I have to open Godot, load the file, edit it, and then re-export it so that Godot can keep reading it, which is kind of annoying.

Steps to reproduce:

  1. download a gzip'd file from anywhere or make one yourself using the command line
  2. try and load it into godot using open_compressed or any other manner
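
Step 2 can be sketched roughly as below. This is a minimal sketch against the Godot 4 `FileAccess` API (the issue itself was filed against Godot 3's `File` class); the function name `try_open_external_gzip` is hypothetical, and the path is assumed to point at a file produced by a standard gzip tool rather than by Godot.

```gdscript
# Hypothetical helper: tries to open an externally created gzip file
# with open_compressed() and returns the resulting error code.
static func try_open_external_gzip(path: String) -> int:
    var f := FileAccess.open_compressed(
            path, FileAccess.READ, FileAccess.COMPRESSION_GZIP)
    if f == null:
        # The thread reports error 15 (ERR_FILE_UNRECOGNIZED) at this point,
        # because the file does not start with Godot's own magic header.
        var err := FileAccess.get_open_error()
        print("open_compressed failed, error ", err)
        return err
    print("opened, length ", f.get_length())
    return OK
```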

Minimal reproduction project:

test_compression.zip

slasktotten commented 5 years ago

Has there been any update on this?

Skaruts commented 5 years ago

I was trying to load RexPaint images (.xp files) and it seems I'm having the same problem. Error 15, file unrecognized.

I'm ignorant when it comes to compression formats, but according to RexPaint's manual:

Appendix B: .xp Format Specification (and Import Libraries)
-----------------------------------------------------------------
    (...)
    The .xp files are deflated with zlib (specifically they are gzipped files created 
    via gzofstream); once decompressed the format is binary (...)

So I tried File.COMPRESSION_GZIP first, but then I also tried all the others and none worked. If I manually extract the file with 7z then it works fine, but I'd rather not have to be doing that...

DrMoriarty commented 4 years ago

I also have the same problem. I'm trying to pack some text resources and use them in my game. I tried zstd and gzip, but I can't open the compressed files in GDScript because open_compressed returns error=15.

R-033 commented 4 years ago

To avoid error 15 you need to add the header yourself, sadly. As I understand it, it goes as follows:

    file.store_32(0x46504347)
    file.store_32(0x00000001) # compression method
    file.store_32(0x00001000)
    file.store_32(decompressedSize)
    file.store_32(compressedSize)

And at the end of the file, for safety:

    file.store_32(0x00000000)
    file.store_32(0x46504347)

I pack the compressed data + header into a temporary file with open() and then read that file with open_compressed().

Skaruts commented 4 years ago

@R-033 where does the compressedSize and decompressedSize come from?

R-033 commented 4 years ago

@Skaruts compressedSize is the size in bytes of the data between the header and footer (the original file size), and decompressedSize is the expected size of the data after it has been decompressed by the open_compressed() method.

Skaruts commented 4 years ago

@R-033 I'm confused, though. At that point I don't know the size of the decompressed data, or what to expect the size to be.

R-033 commented 4 years ago

@Skaruts usually this value is used for creating the output array internally, so maybe if you make it big enough there won't be a problem? I'm just guessing though, I didn't check the Godot sources.
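
For standard gzip files specifically, the decompressed size does not have to be guessed: the gzip trailer ends with a 4-byte little-endian field (ISIZE) holding the uncompressed size modulo 2^32. The sketch below reads it with the Godot 4 `PackedByteArray` API; the helper name `gzip_uncompressed_size` is hypothetical, and it assumes a single-member gzip file. It does not apply to zstd or other formats.

```gdscript
# Hypothetical helper: returns the uncompressed size stored in the gzip
# trailer, or -1 if the data does not look like a gzip stream.
static func gzip_uncompressed_size(data: PackedByteArray) -> int:
    # Smallest plausible gzip file: 10-byte header + 8-byte trailer.
    if data.size() < 18 or data[0] != 0x1f or data[1] != 0x8b:
        return -1
    # ISIZE is the last 4 bytes, little-endian (size modulo 2^32).
    return data.decode_u32(data.size() - 4)
```

Usage would look like `gzip_uncompressed_size(FileAccess.get_file_as_bytes(path))`. Since the value comes from the file itself, it should be treated as a hint rather than trusted for untrusted input.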

facespkz commented 3 years ago

Ran into this today. Especially annoying since the compression formats already seem to use magic numbers...

Calinou commented 3 years ago

I was trying to load a gzipped json file from an external web server

Note that for this particular use case, transparent gzip compression is now supported in HTTPRequest in the master branch: https://github.com/godotengine/godot/pull/38944

jamie-pate commented 2 years ago

~NOTE: this is also weird: every store_string() call will create a new gzip header! so you should pre-buffer your content before writing to a compressed file or you will bloat it with extra gzip block headers?~

It looks like there is a maximum block size for compressed data, and the FileAccessCompressed class will write multiple gzip blocks with magic etc inside your compressed file.

paulmiller commented 1 year ago

I wrote a little program to compress files outside of Godot, which can then be loaded normally inside Godot: https://gist.github.com/paulmiller/a5e593eda3a14e3ffa9acd8f0a4fac4e

It also gets way better compression ratios for some files, with the downside that Godot will decompress the entire file at once.

DanielKilgallon commented 9 months ago

I had to do some trial and error, but I was able to use the advice in this thread to successfully get Godot to read a file that was externally compressed. https://gist.github.com/DanielKilgallon/5936bd6b5020202ce5dc61c0295ee10f

Zylann commented 8 months ago

I tried opening NBT files (in my case Minecraft schematics) which are documented as being some binary data wrapped in GZIP. Unfortunately I hit this issue too, Godot can't recognize the file.

7zip can open the file just fine (it also indicates there is a 10-byte header), so there is surely a way to recognize its format.

So far I just see that FileAccess.open_compressed actually expects a custom, non-standard Godot format in any case, wrapping the actual format specified in compression_mode. So it looks like it would be quite annoying to make it support standard GZIP directly, but I'm not familiar with this code so I don't know if there is a trick to do that cleanly.

After fiddling around, in the end it looks like this worked:

    var compressed_data := FileAccess.get_file_as_bytes(fpath)
    var decompressed_data := compressed_data.decompress_dynamic(-1, FileAccess.COMPRESSION_GZIP)

Of course, if the file comes from malicious players you should specify max_size, because even if the file is tiny, the uncompressed size can be 4 GB.
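
A slightly more defensive version of the two lines above might look like the sketch below. The helper name `load_gzip_bytes` and the 16 MiB cap are hypothetical choices for illustration; it also assumes that an empty result means decompression failed, which would misreport a gzip file whose content is genuinely empty.

```gdscript
# Arbitrary upper bound on decompressed size; pick one that suits your data.
const MAX_DECOMPRESSED_BYTES := 16 * 1024 * 1024

# Hypothetical helper: loads a standard gzip file and returns its
# decompressed bytes, or an empty array on failure.
static func load_gzip_bytes(fpath: String, max_size: int = MAX_DECOMPRESSED_BYTES) -> PackedByteArray:
    var compressed := FileAccess.get_file_as_bytes(fpath)
    if compressed.is_empty():
        push_error("Could not read ", fpath, ", error ", FileAccess.get_open_error())
        return PackedByteArray()
    # max_size bounds the output so a tiny hostile file cannot expand without limit.
    var result := compressed.decompress_dynamic(max_size, FileAccess.COMPRESSION_GZIP)
    if result.is_empty():
        push_error("Could not decompress ", fpath)
    return result
```

For the gzipped JSON case from the original post, the returned bytes could then be passed through `get_string_from_utf8()` and `JSON.parse_string()`.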


The following is some fiddling around I did, before I noticed GZIP was just working (I was trying hard with DEFLATE instead but got nowhere).

I tried reading the GZIP format myself and handing the compressed data to PackedByteArray.decompress_dynamic. I decoded the 10 bytes of header (like I saw in 7zip) and decoded the uncompressed size in the footer (which matched what 7zip tells). The CRC also matches what 7zip tells (though 7zip shows it in hexadecimal, careful :p). But for some reason decompression keeps failing with the "incorrect header check" warning, I'm not sure why.

Here is the code I have so far:

static func open_gzip(fpath: String) -> PackedByteArray:
    var f := FileAccess.open(fpath, FileAccess.READ)
    if f == null:
        push_error("Could not open file ", fpath, ", error ", FileAccess.get_open_error())
        return PackedByteArray()

    # https://docs.fileformat.com/compression/gz/
    # Read 10-byte header
    var header := f.get_16() # 1f 8b
    var compression_method := f.get_8() # 08 for DEFLATE
    var file_flags := f.get_8()
    var timestamp := f.get_32()
    var compression_flags := f.get_8()
    var os_id := f.get_8()
    # Assuming no other extra header stuff, which my file doesn't have, 
    # but might need to be handled eventually

    var compressed_data_position := f.get_position()
    print("compressed_data_position ", compressed_data_position)
    var total_file_length := f.get_length()
    var footer_length := 8
    var compressed_data_size := total_file_length - compressed_data_position - footer_length
    var compressed_data := f.get_buffer(compressed_data_size)

    # Read footer
    var checksum_crc32 := f.get_32()
    var decompressed_data_size := f.get_32()
    print(f.get_position())
    print("decompressed_data_size ", decompressed_data_size)
    print("CRC32 ", checksum_crc32)

    f = null

    #compressed_data = FileAccess.get_file_as_bytes(fpath)

    # Decompress data ourselves. Usually the format is DEFLATE.
    # Godot allows to specify either GZIP or DEFLATE, but there is no difference in the 
    # implementation.
    var decompressed_data := compressed_data.decompress_dynamic(-1, FileAccess.COMPRESSION_DEFLATE)
    print("decompressed_data.size(): ", decompressed_data.size())

    return decompressed_data

And the file I'm testing with, just in case (inside zip) train_bridge_x2.zip
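
The code above assumes the gzip header is exactly 10 bytes. According to the gzip format (RFC 1952), the flags byte can announce optional fields after the fixed header: an extra field, a zero-terminated file name, a zero-terminated comment, and a 2-byte header CRC. The sketch below shows one way to skip them; the helper name `gzip_payload_offset` is hypothetical. Note that what follows the header is a raw DEFLATE stream. One possible explanation for the "incorrect header check" warning, which is an assumption not confirmed in this thread, is that Godot's DEFLATE mode expects a zlib-wrapped stream rather than raw DEFLATE, so locating the payload may not be enough by itself.

```gdscript
# Hypothetical helper: returns the byte offset where the DEFLATE payload
# starts in a gzip file, or -1 if the data does not look like valid gzip.
static func gzip_payload_offset(data: PackedByteArray) -> int:
    const FHCRC := 0x02
    const FEXTRA := 0x04
    const FNAME := 0x08
    const FCOMMENT := 0x10

    # Fixed header: magic (1f 8b), method (8 = DEFLATE), flags,
    # 4-byte timestamp, extra flags, OS id.
    if data.size() < 18 or data[0] != 0x1f or data[1] != 0x8b or data[2] != 8:
        return -1
    var flags: int = data[3]
    var pos := 10

    # Optional extra field: 2-byte little-endian length, then that many bytes.
    if (flags & FEXTRA) != 0:
        if pos + 2 > data.size():
            return -1
        pos += 2 + data.decode_u16(pos)

    # Optional file name and comment, each zero-terminated, in that order.
    for flag in [FNAME, FCOMMENT]:
        if (flags & flag) != 0:
            var terminator := data.find(0, pos)
            if terminator == -1:
                return -1
            pos = terminator + 1

    # Optional 2-byte CRC of the header.
    if (flags & FHCRC) != 0:
        pos += 2

    # The 8-byte trailer (CRC32 + ISIZE) must still fit after the payload.
    if pos > data.size() - 8:
        return -1
    return pos
```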