WolfireGames / overgrowth

Open Source codebase of the game Overgrowth by Wolfire Games LLC
Apache License 2.0
2.51k stars 260 forks source link

Error reading ZIP-file while building the code #78

Closed 4500zenja1 closed 1 year ago

4500zenja1 commented 2 years ago

I am trying to build the code for Overgrowth with Debug|64x configuration. Earlier, before main menu, I got the error about missing icon-online.png file, but that issue can be neglected.

When I'm trying to start the first level, "White Flags", or continue it, after minutes of getting data I receive an error saying the program couldn't read data from ZIP-file. The screenshot with the error and traceback are below:

image

I suppose something fishy is going in the function reading the ZIP-file. The ZIP is present in the folder. Are there any options to resolve the issue besides the manual unzipping or smth?

perry-blueberry commented 2 years ago

Getting the same failure as well as continuing from the missing icon. Did you manage to find a fix?

perry-blueberry commented 2 years ago

I switched to internal_testing on Steam and that particular issue was resolved. Now it's showing other similar issues, but also even just playing from Steam is not working. I take it internal_testing is actually not in a working state at the moment? Is there any way to get this to work on Windows with Steam?

ipro4911 commented 2 years ago

Just check your steam cache files

redruin1 commented 2 years ago

Similar scenario, working on Windows. More info that I've uncovered: Loading White Flags fails on both Debug/64 and Release/64, somewhere during or after converting textures in converttexture.cpp, regardless of whether or not the game was started with a blank cache:

[i][__]:converttexture.cpp:  50: Converting D:/Steam/steamapps/common/Overgrowth/Data/Textures/Thumbnails/og_story/smallest_12_canyon_ambush.jpg to C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/Textures/Thumbnails/og_story/smallest_12_canyon_ambush.jpg_converted.dds by way of C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/Temp/conversion0.dds
[i][__]:converttexture.cpp: 120: Conversion completed
[i][__]:converttexture.cpp:  50: Converting D:/Steam/steamapps/common/Overgrowth/Data/Textures/Thumbnails/og_story/smallest_13_rat_slavers.jpg to C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/Textures/Thumbnails/og_story/smallest_13_rat_slavers.jpg_converted.dds by way of C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/Temp/conversion0.dds
[i][__]:converttexture.cpp: 120: Conversion completed
[i][__]:   zip_util.cpp: 412: 0
[i][__]:   zip_util.cpp: 413: 20859790
[i][__]:      error.cpp: 182: Displaying message: Error, Error reading from UnZip file
[i][__]:      error.cpp: 183: No stacktrace available on this platform

Sandbox levels like Red Shards can be loaded on either debug or release, but only with a blank cache; attempting to restart and load the same level, or even just quitting to the main menu and reloading results in another failure, this time with the nav-mesh:

[i][__]:    navmesh.cpp: 678: Found:C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/LevelNavmeshes/Project60/4_red_hills.nav.xml
[i][__]:    navmesh.cpp: 685: Found:C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/LevelNavmeshes/Project60/4_red_hills.nav.obj
[i][__]:    navmesh.cpp: 692: Found Zip:C:/Users/tfsch/Documents/Wolfire/Overgrowth/Data/LevelNavmeshes/Project60/4_red_hills.nav.zip
[i][__]:    navmesh.cpp: 833: Navmesh: Adding mesh to sample...
[i][__]:   zip_util.cpp: 412: 0
[i][__]:   zip_util.cpp: 413: 959888
[i][__]:      error.cpp: 182: Displaying message: Error, Error reading from UnZip file
[i][__]:      error.cpp: 183: D:\SourceCode\repos\C++\overgrowth\Source\Utility\stacktrace.h(70)
D:\SourceCode\repos\C++\overgrowth\Source\Internal\error.cpp(183)
D:\SourceCode\repos\C++\overgrowth\Source\Internal\error.cpp(54)
D:\SourceCode\repos\C++\overgrowth\Source\Internal\zip_util.cpp(420)
D:\SourceCode\repos\C++\overgrowth\Source\Internal\zip_util.cpp(445)
D:\SourceCode\repos\C++\overgrowth\Source\AI\navmesh.cpp(843)
D:\SourceCode\repos\C++\overgrowth\Source\AI\navmesh.cpp(705)
D:\SourceCode\repos\C++\overgrowth\Source\Main\scenegraph.cpp(1220)
D:\SourceCode\repos\C++\overgrowth\Source\XML\level_loader.cpp(482)
D:\SourceCode\repos\C++\overgrowth\Source\Main\engine.cpp(5524)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\type_traits(1503)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\functional(684)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\functional(1962)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\functional(1999)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\type_traits(1479)
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\thread(56)
Couldn't get line numbers. Possibly an external lib
Couldn't get line numbers. Possibly an external lib
Couldn't get line numbers. Possibly an external lib

The zip_util: 412 and 413 lines are debug lines that I've added that print out the version and the uncompressed_size of the zip file, implying that the outer file itself is being loaded at least somewhat properly. Investigating the cache files themselves, they seem to be a zip file contained within another zip file; the outermost zip file can be opened to reveal the contents, but the internal zip file can't seem to be opened neither by Windows nor 7zip. Is this internal file supposed to be a zip file, or is it some other internal format that gets decoded by Overgrowth?

Based on the fact that you can't interpret the files with an external program, that seems to imply to me that there might be problems with the saving in conjunction with the loading of the zip files. Does minizip have any way to query save/load errors?

ipro4911 commented 2 years ago

Damn didnt know nice explanation,i hope there will be a fixed dropped soon so i can start testing internal testing

redruin1 commented 2 years ago

The more I look at this issue the stranger it becomes. It seems to actually be an issue with zlib, or at least the way that minizip interfaces with it. The faulty code is in zip_util.cpp under UnZipFile::ExtractAll, which when boiled down looks like this:

unzGoToFirstFile(uf);
do {
    int result;
    char filename_buf[256];

    // This succeeds (though technically it can also fail and should probably be checked as well)
    unz_file_info file_info;
    unzGetCurrentFileInfo(uf, &file_info, filename_buf, 256, NULL, 0, NULL, 0); 

    // bookkeeping omitted

    // This is what fails
    // Before, this function wasn't checked for success, which I've changed like so
    result = unzOpenCurrentFile(uf); 
    if (result != UNZ_OK) {
        FatalError("Error", "Unable to open current file (Error code: %d)", result); // returns -2
    }

    int bytes_read = 0;
    do {
        // Of course, this then fails since there is no open file to read from
        bytes_read = unzReadCurrentFile(uf, scoped_buf.ptr, size_buf); 
        if (bytes_read < 0) {
            FatalError("Error", "Error reading from UnZip file (Error code: %d)", bytes_read); // returns -102
        } else if (bytes_read > 0) {
            // Perform normal operations
        }
    } while (bytes_read > 0);
}

unzOpenCurrentFile returns -2, which is peculiar in it's own right, since it means that this isn't a minizip error but rather a zlib error, specifically a Z_STREAM_ERROR. Inspecting the body of unzOpenCurrentFile in unzip.c, the only place that a -2 error code can ever be returned is in this section:

    if ((s->cur_file_info.compression_method==Z_BZIP2ED) && (!raw))
    {
#ifdef HAVE_BZIP2 // This is disabled on my setup, removing for brevity
      // This area _could_ return a -2 error code as well, just not under my current build
#else
      pfile_in_zip_read_info->raw=1;
#endif
    }
    else if ((s->cur_file_info.compression_method==Z_DEFLATED) && (!raw))
    {
      pfile_in_zip_read_info->stream.zalloc = (alloc_func)0;
      pfile_in_zip_read_info->stream.zfree = (free_func)0;
      pfile_in_zip_read_info->stream.opaque = (voidpf)0;
      pfile_in_zip_read_info->stream.next_in = 0;
      pfile_in_zip_read_info->stream.avail_in = 0;

      err=inflateInit2(&pfile_in_zip_read_info->stream, -MAX_WBITS);
      if (err == Z_OK)
        pfile_in_zip_read_info->stream_initialised=Z_DEFLATED;
      else
      {
        //printf("%d\n", err); // including this here does confirm that this is the source of the -2 code
        TRYFREE(pfile_in_zip_read_info);
        return err;
      }
    }

So the actual culprit seems to be inflateInit2. The zlib manual suggests that this particular error is caused "if a parameter is invalid, such as a null pointer to the structure." I've checked both pfile_in_zip_read_info and pfile_in_zip_read_info->stream, and neither are null... changing MAX_WBITS doesn't seem helpful either. Was there any recent changes to zip_util.cpp, minizip or zlib that could explain this read failure?

I investigated the saving code of the navmesh to make sure I didn't see anything funky, and all seems to be in order; I was able to switch back to a non-beta version of Overgrowth and load the cache files generated by internal_testing, which shows that there's only really a problem with getting the data out from the zip files once their loaded. Maybe this is a Windows-specific error?

redruin1 commented 2 years ago

Maybe this is a Windows-specific error?

Tested on Ubuntu LTS 20.04 with my fork, and lo and behold the issue is magically resolved. I suspected that this was because the Ubuntu build seems to use the OS version of zlib (1.2.11) instead of the prebuilt Windows one in Libraries (1.2.7), so I tried building a set of updated Windows dll and lib files for version 1.2.11 to see if that fixes it.

The answer: No! Still returns a -2 error code on unzOpenCurrentFile... though it could be that I just messed up royally somewhere along the way updating the library. The only other avenue I have at this point is that gzip seems to include it's own 1.2.12 version of zlib in addition to the 1.2.7 one provided alongside; you can verify this by adding:

LOGI << ZLIB_VERSION << std::endl; // "1.2.7"
LOGI << zlibVersion() << std::endl; // "1.2.12"

Maybe there's some weird conflict here?

@autious @feliwir @Conan-Kudo Do any of you guys have some insight on why internal_testing fails while the non-beta branch has no trouble on Windows?

autious commented 2 years ago

No, this is a new bug to me. Thanks for doing a detailed investigation of it though, so far you've managed to find some really helpful info.

shinymerlyn commented 2 years ago

I talked to Max about it a bit internally. This is due to the upgrade to libfreetype. It uses zlib internally, and caused a ripple effect in minizip. This is what is causing the internal error that results in the error code when trying to load the navmesh zip file.

Specifically, it was this commit that I believe triggered the problem: https://github.com/WolfireGames/overgrowth/commit/da170b903949bac27ef0133557bac93254910044

Going to need to figure out how to configure these two libraries to disentangle how they use zlib. I gave it a shot myself, but only very briefly. I am not an expert at configuring these libraries, so I was unable to get them to do the right thing. Someone who is more familiar, or has more time to go root through config documentation may be able to resolve this.

redruin1 commented 2 years ago

I think I've finally figured it out, primarily to @shinymerlyn's insight. By default on Windows, Freetype cannot find zlib, so it defaults to it's internal version, which then conflicts with the binary distribution provided in the Libraries folder (zlib1.2.7). However, if you coerce Freetype to load that 1.2.7 binary (so that only one version of zlib is being used), as far as I can tell this entirely fixes the problem. The following is my cursed hack:

# Starting at line 270 of Overgrowth's CMakeLists.txt:
ADD_SUBDIRECTORY(../Libraries/zstd-1.5.0/build/cmake zstd)
IF(WIN32)
    # Manually specify the location of zlib
    GET_FILENAME_COMPONENT( ZLIB_INCLUDE "../Libraries/zlib1.2.7/include" ABSOLUTE )
    GET_FILENAME_COMPONENT( ZLIB_LIBRARY "../Libraries/zlib1.2.7/lib/x64/zlib.lib" ABSOLUTE )
ENDIF()
ADD_SUBDIRECTORY(../Libraries/freetype-2.12.1 freetype)

We have to use get_filename_component to get the absolute paths because using local paths will result in an error. It also might be prudent to have the option FT_REQUIRE_ZLIB always enabled on Windows contexts, so it will never default to the zlib shipped with FreeType and cause this conflict. It might be possible to use the FreeType zlib exclusively instead, but the config options for minizip are somewhat lacking in comparison.

Also note that I manually specify the x64 version of the library arbitrarily; at this point I'm still unsure if a 32-bit distribution is a valid Overgrowth target, but if that's the case this section might need a little more logic to ensure that the correct library version is linked.

After doing this, everything seems to just work™:

image

The only other hiccup I had was when loading some levels, I was seemingly getting another iterator loop error:

// engine.cpp : line 4116
for (auto obj : scenegraph->objects_) { // maybe regular for loop as well?
    if (!obj->parent) {
        obj->PreDrawCamera(predraw_time);
    }
}

I changed this to a regular i++ loop which seemed to fix the problem. Though, given how frequent and pernicious these iterator issues are, I feel it would be smart to discuss why/how these issues are happening and if it would be smart to either revert all these engine level loops to regular integer loops (as a quick fix), figure out some way to make the iterator stable when being modified (even possible?), or figure out some way to separate modification and iteration into 2 distinct steps (even feasable?). Should I create an issue/discussion about it?

Another question: I issued another pull request that fixed all the PVS warnings associated with Overgrowth in this issue. Should I bundle these changes into there and push the entire thing, or would you like them as separate PRs?

Conan-Kudo commented 2 years ago

Separate PRs please.

shinymerlyn commented 1 year ago

how these issues are happening

if the list contents change while iterating it can break iterators. I imagine the iterations that call any script update functions will need to be swapped out or crashes may happen (whether in the base game, or in mods). there's functions to queue deleting items to help with problems like this, but they might not always be in use.

probably a different issue/pr tho.