twogood / unshield

Tool and library to extract CAB files from InstallShield installers
MIT License
348 stars 75 forks source link

Windows builds do not handle 2+ GB .cab files properly #103

Open kpyrkosz opened 4 years ago

kpyrkosz commented 4 years ago

Hello. I've built the unshield on linux and extracted a 3GB big .cab without problems, on windows the unpacking process stops with error at about 70%. The problematic part is the size used by fseek and ftell is of type long, which on windows (both 32 and 64 bit) is defined as a 4 bit type, but reader->file_descriptor->data_offset is an unit64_t. I've tried patching both functions to _fseeki64 and _ftelli64 (these are windows-exclusive if I'm not mistaken) and then the big cabfile is properly extracted on windows.

twogood commented 4 years ago

Thanks for discovering this! The Windows build is a pain with all these build systems and if I make unshield build with one of them, it seems another one breaks. But I digress. It should be pretty easy to make a PR for this I hope.

kpyrkosz commented 4 years ago

I can try to refactor the CMake buildscripts, they look bit messy and non standard conformant. There are many helpful features that would make configuration and build cleaner and portable. I've got a couple of questions before i start: -why are the buildscripts enforcing static linkage of MSVC builds? There is a ton of IFs that control the static/shared switch in most of the project's CMakeLists.txt. I believe it is easly configurable in CMake -is there any benefit of building the libunshield as shared? -current minimum version of required cmake is 2.8.7, it is really outdated, version 3+ is available on all modern platforms. -path /var/tmp/unshield is hardcoded all over the place in the tests, it's not really portable and unusable on windows. CMake can generate a test target and manage all prefixes/directories for us

twogood commented 4 years ago

@kpyrkosz I'd really appreciate it! The CMake stuff is contributed to replace my own autofoo scripts, and then slightly improved and somewhat broken... :/ And the AppVeyor build is broken: https://ci.appveyor.com/project/twogood/unshield

  1. No idea about the static linking. Maybe because of zlib?
  2. In theory other applications should be able to link to libunshield but I guess it has never happened in practice
  3. I think it was because the build environment on Travis CI had a really old CMake but v3+ should work now
  4. I don't expect the tests to run on Windows or part of the build. They also use curl and unzip ;)

    See also unmerged PRs #57 and #91

twogood commented 4 years ago

zlib is the big pain on Windows. I'm open to include its source code with unshield to make the build self-contained.

kpyrkosz commented 4 years ago

The branch https://github.com/kpyrkosz/unshield/tree/wip/updating-cmake contains minimal modifications which make windows build the program and not break linux build. Tested with msvc 2019, clang 12 and mingw gcc 6.3. The current build does not handle the big cabfiles yet, first I'd like to achieve portability and write sensible CI scripts (I have never used appveyor nor travis but always wanted to learn them, so that my chance to kill two birds with one stone :) ), then do the fixing. So far I have: -removed -fPIC flag (it's not portable) -changed minimum cmake to 3.6 (from what I've read in the docs that's the lowest that handles imported targets for zlib and openssl) -dumped the static msvc part -refactored libunshield CMake script -removed typedef entries from unistd.h, they are already known by all the compilers (and cause an redefinition error in case of clang) and corrected ifdef to handle entire windows builds (WIN32) instead of only MINGW32

I am not issuing pull request yet, I'd like to have the CI up and running first.

twogood commented 4 years ago

@kpyrkosz I created pull request #105 to make it build on Travis and AppVeyor. Travis build it really fast on Linux and while I have no idea what the Mac OS X builds are spending time on. AppVeyor claims success but it actually failed.

twogood commented 4 years ago

@kpyrkosz Please disable Mac OS X builds in .travis.yml on your branch, I'm working on updating that in another PR

kpyrkosz commented 4 years ago

The problem with windows CI is the fact that zlib is not preinstalled on the travis machines and the built-in package manager... does not support the zlib. Personally i am using vcpkg on win and it works wonders, supporting multiple configurations, architectures with flawless CMake integration. It turns out that it is present by default on appveyor, is it a good idea to test for linux on travis exclusively and windows on appveyor? I've been pushing with travis for quite some time and I can't "cleanly" achieve what i planned to (I suppose it is doable but I'm just new to the tool). My vision is more or less: -linux gcc/clang debug/release CFLAGS (cmake on linux does not generate Release/Debug configurations, instead CMake honors well known env vars like CC, CXX, CFLAGS, LDFLAGS and so on) - 4 configurations in total -windows clang/msvc cmake release/debug config (on windows Visual Studio generator creates configurations that you specify like cmake --build . --config Release) - 4 configurations in total -windows mingw gcc, similarly to linux you need to specify CFLAGS env var for debug/release builds and call initial cmake with appropriate generator -G"MinGW Makefiles" - 2 configurations in total There is also this switch USE_OUR_OWN_MD5 which i would like to mix with every of the mentioned configurations (so from 10 we'd get 20, each with the define being 0 or 1), but i just can't see how to achieve that.

twogood commented 4 years ago

@kpyrkosz I guess that's why the AppVeyor configuration uses scoop to install zlib and not chocolatey like Travis. But if you can use vcpkg om both Travis and AppVeyor it would be nice! Or we only use Travis and not AppVeyor.

I like how you think about the combinations of builds.

twogood commented 4 years ago

But for Windows, maybe the most important thing is to be able to provide an unshield.exe to people who do not happen to have a dev environment setup :)

twogood commented 4 years ago

@kpyrkosz I probably caused a merge conflict with your .travis.yml now but I can sort that out later

kpyrkosz commented 4 years ago

I spent literally entire day pushing with appveyor and i got nowhere lol. I am so mad.

twogood commented 4 years ago

@kpyrkosz Sorry to hear that! Maybe we should try to get zlib into Travis Windows environment?

kpyrkosz commented 4 years ago

I've written appveyor script for windows builds but vcpkg has temporary issue with some zlib dependency... https://github.com/microsoft/vcpkg/issues/13217 The msys repos seem to be unreachable at the moment but I wouldn't like to write any hotfixes to third party tools. I mean i could but that's windows, there's no find nor sed.

kpyrkosz commented 4 years ago

Hey it's both fast and green :) Linux: https://travis-ci.com/github/kpyrkosz/unshield/builds/187972769 Win: https://ci.appveyor.com/project/kpyrkosz/unshield/builds/35551986

On Linux tests are enabled and passing in all configurations, on windows all 3 compilers are building 4 variants variants each but i wasn't running the test suite in the CI yet. I'm using git bash on my win machine, the test suite is runnable after tiny patches but for some reason md5sum prepends paths with a star (I didn't dig too deep why is that, but the tests fail because of diff returning nonzero code). Example: -596e93bd517f9d9056db083e1a445c63 ./Program_Executable_Files/CLIENT.EXE +596e93bd517f9d9056db083e1a445c63 *./Program_Executable_Files/CLIENT.EXE

twogood commented 4 years ago

Great job!!!!!!

The CVE-2015-1386 directory traversal test will fail if not executed on GNU libc, I don't have a good solution for other libc's yet.

The star is actually mentioned in the man page:

The default mode is to print a line with checksum, a space, a character indicating input mode ('*' for binary, ' ' for text or where binary is insignificant), and name for each FILE.

So I guess that -b should be used on all operating system and that the '*' is the more correct version.

twogood commented 4 years ago

Wow, the number of warnings from clang on Windows...

kpyrkosz commented 4 years ago

That's clang's -Weverything flag in debug mode. Should we add -pedantic on top of it?

According to man md5sum default behavior is text mode -t on linux, inside git bash it seems it is -b (I say "it seems" because there is no man inside windows git bash). The -b switch prepends the star on both OSes. Shouldn't -b be used in our scenario? I suspect problems may arise with CRLF and such. After all these files are extracted and saved as binary files. I've quickly locally patched tests invocation of md5sum to md5sum -t to imitate linux behavior and all passed with mingw build on windows!

twogood commented 4 years ago

Oh -pedantic<3

Yes, the correct would be to use -b. But I would be surprised if the MD5 sum changed. I assume it controls the b flag to fopen():

The mode string can also include the letter 'b' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with C89 and has no effect; the 'b' is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the 'b' may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-UNIX environments.)

kpyrkosz commented 4 years ago

How do we merge your xcode and my linux builds in travis? I can port linux build to my appveyor script, that should work too.

twogood commented 4 years ago

I think you can probably look at my xcode changes and incorporate them easily in .travis.yml in your branch, otherwise I can look at it in a few days

wdlkmpx commented 3 years ago

https://github.com/kpyrkosz/unshield/commit/7ea602d78b86f6041ffc58103229be47610a8154

The branch has a merge conflict in CMakeLists.txt ... does fixing this issue require altering the build system?

Does not compile on linux.

I guess something like this could solve the issue:

#ifdef _WIN32
#define ftell _ftelli64
#define fseek _fseeki64
long long unshield_fsize(FILE* file);
#else
long unshield_fsize(FILE* file);
#endif
wdlkmpx commented 3 years ago

OpenSSL is a huge lib, there is no reason to use it if only md5 is required, the internal md5.c is more than enough.

OpenSSL only makes sense if using advanced encryption stuff, if the bundled md5.c works as expected, all references to openssl should be removed.

There are simple libs compatible with zlib that would be easy to integrate into the project, so that all deps are compiled statically, and cmake/windows stuff is simplified.

wdlkmpx commented 3 years ago

The appveyor Windows build config must be somewhere else, running a test in my repo, it complains that the config is missing or something, so I had to disable appveyor to keep the green check.

So far I managed to make all Travis UNIX builds work... FreeBSD, macOS and 4 linux arch's including s390x (big endian) https://travis-ci.com/github/wdlkmpx/unshield/builds/226030031

twogood commented 3 years ago

Appveyor failed builds do not appear as failed in GitHub. Must be something with the build script I guess?

wdlkmpx commented 3 years ago

I guess the appVeyor config is outside the repo, I think it needs the .appveyor file for changes to take effect. I added and removed AppVeyor from my github actions.

Travis Windows builds

mingw GCC 8 https://travis-ci.com/github/wdlkmpx/unshield/jobs/505735405

MSVC (Visual Studio 2017) throws many warnings with -Wall https://travis-ci.com/github/wdlkmpx/unshield/jobs/505735406

GCC and Clang provide proper C support, MSVC is a C++ compiler that only supported C89 for a long time, with Visual Studio 2017, it still doesn't properly support C99, probably comparable to gcc in 2001. So basically using GCC in all platforms with POSIX header compatibility is the "supported" use case, everything else is "whatever"...

wdlkmpx commented 3 years ago

I agree that that the minimum CMake support must be bumped to 3.x, the main CMakeLists.txt has this line: cmake_minimum_required(VERSION 2.8.7)

but lib/CMakeLists.txt has this... so yeah that is the true minimum version ... cmake 3.6... Debian Stretch has cmake 3.7.2 (2017) cmake_minimum_required(VERSION 3.6)

It's quite obvious that only recent cmake versions can do what the old autotools did back in 2005, I re-implemented autotools support but I'm yet add a commit to my repo, so with both build systems, support for old and new OSs is guaranteed. But only autotools provides easy cross-compilations, only with configure switches.

twogood commented 3 years ago

I thought we were on CMake 3 already... :)

wdlkmpx commented 3 years ago

This is crazy, I managed to run the test suite on Windows (run-tests.sh),

MinGW on Windows produces a proper unshield: https://travis-ci.com/github/wdlkmpx/unshield/jobs/506268982

MSVC produces a mostly OK unshield, only test/v0/wireplay.sh fails https://travis-ci.com/github/wdlkmpx/unshield/jobs/506268983

This is without iconv() support.

wdlkmpx commented 3 years ago

My travis build triggers 14 jobs for 4 operating systems and 4 processor architectures. This repo triggers 20 or 18 jobs for 2 operating system and 1 processor architecture.

My builds process way faster, lower resource usage per job than this repo. It's not a good idea to stress the Travis CI servers or something.

And now travis has apparently suspended builds for my repo, so a PR is incoming Screenshot(8)

nikitalita commented 2 years ago

This is crazy, I managed to run the test suite on Windows (run-tests.sh),

MinGW on Windows produces a proper unshield: https://travis-ci.com/github/wdlkmpx/unshield/jobs/506268982

MSVC produces a mostly OK unshield, only test/v0/wireplay.sh fails https://travis-ci.com/github/wdlkmpx/unshield/jobs/506268983

This is without iconv() support.

can you please upload artifacts from this?