vn-tools / arc_unpacker

CLI tool for extracting images and sounds from visual novels.
GNU General Public License v3.0

nscripter/nsa: Unknown compression type (4) #152

Open NotKyon opened 6 years ago

NotKyon commented 6 years ago

Game info

VNDB link: https://vndb.org/r41280
Release date: 2016-07-08
Original title: Umineko no Naku Koro ni

Details

Running the following command fails to extract most files, reporting an "Unknown compression type" error for each of them.

./arc_unpacker --verbosity=3 --dec=nscripter/nsa --out=[...] [SteamLibrary]/steamapps/common/Umineko/arc.nsa

I modified src/dec/nscripter/nsa_archive_decoder.cc to also spew the compression type it's getting (as an integer). Here's a partial dump of the output.

(image: partial dump of the log output)

Here's a gist of the full log.

https://gist.github.com/NotKyon/f98a129f495ca4b8d21512c6766be1a4

This is the function I've modified in src/dec/nscripter/nsa_archive_decoder.cc:

std::unique_ptr<io::File> NsaArchiveDecoder::read_file_impl(
    const Logger &logger,
    io::File &input_file,
    const dec::ArchiveMeta &m,
    const dec::ArchiveEntry &e) const
{
    NsaEncryptedStream input_stream(input_file.stream, key);

    /* ... snip ... */

    if (entry->compression_type == CompressionType::Spb)
    {
        const auto decoder = SpbImageDecoder();
        const auto encoder = enc::png::PngImageEncoder();
        io::File spb_file("dummy.bmp", data);
        return encoder.encode(
            logger, decoder.decode(logger, spb_file), entry->path);
    }

    /* These lines were modified to also print the compression type. This is not production code, obviously. */
    char buf[256];
    snprintf(buf, sizeof(buf), "Unknown compression type: %i", static_cast<int>(entry->compression_type));

    throw err::NotSupportedError(buf);
}

NotKyon commented 6 years ago

I looked into the ponscripter source code (in sekaiproject/ponscripter-fork specifically). It seems that "compression type 4" is what they call NBZ: an integer giving the original length of the file, followed by a bzip2-compressed stream of data.

ponscripter/src/BaseReader.h shows the compression type listing.
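
For reference, here is a rough sketch of how that listing could be mirrored on the decoder side. Only the value 4 (NBZ) is confirmed in this thread; the other values are assumptions based on how the ONScripter family of readers is usually described, and `compression_name` is a hypothetical helper for logging, not an existing function in either project.

```cpp
#include <string>

// Illustrative mirror of the compression type listing.
// Only Nbz = 4 is confirmed here; the other values are assumptions.
enum class CompressionType : int
{
    None = 0,
    Spb  = 1,
    Lzss = 2,
    Nbz  = 4,
};

// Hypothetical helper: turn a raw compression type value into a
// readable name so unknown types show up clearly in error messages.
std::string compression_name(int raw)
{
    switch (static_cast<CompressionType>(raw))
    {
        case CompressionType::None: return "none";
        case CompressionType::Spb:  return "spb";
        case CompressionType::Lzss: return "lzss";
        case CompressionType::Nbz:  return "nbz";
    }
    // Any value outside the listing falls through to here.
    return "unknown (" + std::to_string(raw) + ")";
}
```

With something like this, the error above would read "unknown (4)" until an `Nbz` case is added.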

ponscripter/src/DirectReader.cpp, line 394, shows their implementation of the read operation.

Here's the relevant code within the function:

fseek(fp, offset, SEEK_SET);
original_length = count = readLong(fp);

bfp = BZ2_bzReadOpen(&err, fp, 0, 0, NULL, 0);
if (bfp == NULL || err != BZ_OK) return 0;

while (err == BZ_OK && count > 0) {
    if (count >= READ_LENGTH)
        len = BZ2_bzRead(&err, bfp, buf, READ_LENGTH);
    else
        len = BZ2_bzRead(&err, bfp, buf, count);

    count -= len;
    buf += len;
}

I haven't confirmed that this is actually the fix, but it matches what I'd expect the format to be.
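
A minimal sketch of the first half of that read, assuming the layout described above: a 4-byte length prefix followed by a bzip2 stream. `split_nbz` and `NbzEntry` are hypothetical names, not part of arc_unpacker or ponscripter, and the big-endian byte order of the prefix is an assumption about what `readLong` does that should be checked against real archive data.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Result of splitting an NBZ entry: declared size plus the raw bzip2 payload.
struct NbzEntry
{
    std::uint32_t original_size;
    std::vector<std::uint8_t> bzip2_stream;
};

// Hypothetical helper. Assumes a 4-byte big-endian length prefix.
NbzEntry split_nbz(const std::vector<std::uint8_t> &data)
{
    if (data.size() < 4)
        throw std::runtime_error("NBZ entry too short for length prefix");

    // Assemble the original (uncompressed) length from the first 4 bytes.
    const std::uint32_t size =
        (static_cast<std::uint32_t>(data[0]) << 24) |
        (static_cast<std::uint32_t>(data[1]) << 16) |
        (static_cast<std::uint32_t>(data[2]) << 8) |
        static_cast<std::uint32_t>(data[3]);

    // Every bzip2 stream starts with the magic bytes "BZh".
    if (data.size() < 7 || data[4] != 'B' || data[5] != 'Z' || data[6] != 'h')
        throw std::runtime_error("NBZ payload does not start with a bzip2 header");

    // Everything after the prefix is the compressed stream.
    return {size, std::vector<std::uint8_t>(data.begin() + 4, data.end())};
}
```

The returned payload could then be handed to libbz2, for example `BZ2_bzBuffToBuffDecompress` with a destination buffer of `original_size` bytes. Checking the "BZh" magic first gives a clearer error if the byte order or offset assumption turns out to be wrong.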