twogood / unshield

Tool and library to extract CAB files from InstallShield installers
MIT License

file_group_count exceeds MAX_FILE_GROUP_COUNT #25

Open rodarima opened 9 years ago

rodarima commented 9 years ago

I'm having trouble extracting a cabinet.

Version from header is 16:

[unshield_read_headers:226] Reading header from .hdr file 1.
[unshield_read_headers:303] Version 0x02000640 handled as major version 16
[unshield_get_cab_descriptor:81] Cabinet descriptor: 000d192a 002026f0 002026f0 00000168
[unshield_get_cab_descriptor:83] Directory count: 90
[unshield_get_cab_descriptor:84] File count: 20901

There are 20901 files.
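The "handled as major version 16" line in the log above is consistent with reading the low 16 bits of the version word as major*100: 0x0640 is 1600, and 1600 / 100 is 16. A minimal sketch of that decoding follows. The function name is hypothetical, and the branch conditions are an assumption about how libunshield.c treats the top byte, not a copy of it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the library's API): derive a major version
 * from the raw 32-bit header version word.
 * Assumption: when the top byte is 2 or 4, the low 16 bits hold
 * major*100; when it is 1, the major version sits in bits 12..15. */
static int demo_major_version(uint32_t raw)
{
    uint32_t top = raw >> 24;

    if (top == 1)
        return (int)((raw >> 12) & 0xf);
    if (top == 2 || top == 4)
        return (int)((raw & 0xffff) / 100);
    return -1; /* encoding not covered by this sketch */
}

/* Usage: the version word from the log decodes to 16. */
void demo_version_example(void)
{
    assert(demo_major_version(0x02000640u) == 16);
}
```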

Searching inside the setup.exe:

$ strings /mnt/1/setup.exe | grep -i shield
http://www.installshield.com/isetup/ProErrorCentral.asp?ErrorCode=%d : 0x%x&ErrorInfo=%s
InstallShieldPendingOperation
SOFTWARE\InstallShield\16.0\Professional
    name="InstallShield.Setup"
<description>InstallShield.Setup</description>

At lib/component.c on line 53, using gdb, self->file_group_count is 151, exceeding the MAX_FILE_GROUP_COUNT limit set as 71 in lib/cabfile.h at line 16.

52 self->file_group_count = READ_UINT16(p); p += 2;
53 if (self->file_group_count > MAX_FILE_GROUP_COUNT)
54     abort();

As seen in i6comp at line 33, the maximum number of file groups is 512.

33 #define CSFG_MAX 512
34 #define CSCP_MAX 512

Changing the MAX_FILE_GROUP_COUNT to 512 produces a segmentation fault:

0x00007ffff7bd59eb in get_unaligned_le32 (p=0x7ffffcd1eb10 <error: Cannot access memory at address 0x7ffffcd1eb10>) at ~/unshield/lib/internal.h:138
 138        return p[0] | p[1] << 8 | p[2] << 16 | p[3] << 24;
(gdb) bt
#0  0x00007ffff7bd59eb in get_unaligned_le32 (p=0x7ffffcd1eb10 <error: Cannot access memory at address 0x7ffffcd1eb10>) at ~/unshield/lib/internal.h:138
#1  0x00007ffff7bd605a in unshield_header_get_components (header=0x55555575a030) at ~/unshield/lib/libunshield.c:145
#2  0x00007ffff7bd66f6 in unshield_read_headers (unshield=0x555555759ff0, version=0xffffffff) at ~/unshield/lib/libunshield.c:317
#3  0x00007ffff7bd68e2 in unshield_open_force_version (filename=0x7fffffffe4a3 "/mnt/1/data1.cab", version=0xffffffff) at ~/unshield/lib/libunshield.c:369
#4  0x000055555555690c in main (argc=0x5, argv=0x7fffffffe138) at ~/unshield/src/unshield.c:578

However, applying the following "patch" to lib/component.c lets extraction continue with a large file_group_count, while leaving the original MAX_FILE_GROUP_COUNT at 71:

self->file_group_count = READ_UINT16(p); p += 2;
if (self->file_group_count > MAX_FILE_GROUP_COUNT)
{
    /* Warn instead of aborting; only give up above 512. */
    printf("WARNING: self->file_group_count = %d\n", self->file_group_count);
    if (self->file_group_count > 512)
        abort();
}

This patch lets me extract the cabinet without error, but I have no idea why it works.

Running under valgrind to check for a buffer overflow (since the count is larger than MAX_FILE_GROUP_COUNT) shows no errors:

$ valgrind src/unshield -d test x /mnt/1/data1.cab
...
          19911 files
==9926== 
==9926== HEAP SUMMARY:
==9926==     in use at exit: 0 bytes in 0 blocks
==9926==   total heap usage: 436,953 allocs, 436,953 frees, 8,169,910,049 bytes allocated
==9926== 
==9926== All heap blocks were freed -- no leaks are possible
==9926== 
==9926== For counts of detected and suppressed errors, rerun with: -v
==9926== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The number of files reported matches the number actually extracted, but it does not match the file count in the header (20901):

$ find test/ -type f  | wc -l
19911

Cabinets are large, in total ~1GB. Any ideas?

twogood commented 9 years ago

Interesting. Are you able to share the installation files with me?

Increasing MAX_FILE_GROUP_COUNT will make unshield read more "file group" entries, and that will push "component" entries forward and that will cause a crash. These are listed by the g and c commands, respectively.
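To illustrate why changing the constant shifts everything that follows, here is a toy sketch. It assumes a descriptor laid out as a fixed table of 71 file group offsets immediately followed by a table of component offsets. All names and the marker values 0x1000 and 0x2000 are made up for the demo, and the real CabDescriptor has more fields than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 71 /* assumed slot count of each table in this toy layout */

/* Read a little-endian 32-bit value from an unaligned byte pointer. */
static uint32_t demo_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit value in little-endian byte order. */
static void demo_write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Fill a toy descriptor: DEMO_SLOTS group offsets (0x1000 + i) followed
 * by DEMO_SLOTS component offsets (0x2000 + i).
 * buf must hold 2 * DEMO_SLOTS * 4 bytes. */
static void demo_fill_descriptor(uint8_t *buf)
{
    for (uint32_t i = 0; i < DEMO_SLOTS; i++) {
        demo_write_le32(buf + 4 * i, 0x1000u + i);
        demo_write_le32(buf + 4 * (DEMO_SLOTS + i), 0x2000u + i);
    }
}

/* Return what a parser would take as the first component offset if it
 * believed the group table had assumed_group_slots entries.
 * Returns UINT32_MAX when that position lies outside the buffer. */
static uint32_t demo_first_component_offset(const uint8_t *buf, size_t len,
                                            size_t assumed_group_slots)
{
    size_t pos = assumed_group_slots * 4;

    assert(buf != NULL);
    if (pos + 4 > len)
        return UINT32_MAX;
    return demo_read_le32(buf + pos);
}
```

With 71 assumed slots the parser lands on the first component entry (0x2000). With 72 it lands one entry too far (0x2001). With 512 the position is past the end of the toy buffer, which is the analogue of the invalid read in the backtrace.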

The reason fewer files are extracted than the file count can be found in the unshield_file_is_valid function. There are a number of cases where a file entry is not actually a file.

rodarima commented 9 years ago

I have sent the entire iso (by torrent) to your email.

Using the original MAX_FILE_GROUP_COUNT set to 71, I can list the groups and components:

$ src/unshield g /mnt/1/data1.cab
...
1648 file groups

$ src/unshield c /mnt/1/data1.cab
...
1445 components

But with the above warning:

WARNING: self->file_group_count = 151

The invalid files (reported by an improvised printf call in the unshield_file_is_valid function) account exactly for the difference:

$ src/unshield l /mnt/1/data1.cab | grep 'Invalid file' | wc -l
990

$ echo '19911 + 990' | bc -l
20901

The cases, numbered from 1 in the order the branches appear in the function:

$ src/unshield l /mnt/1/data1.cab | grep 'Reason' | sort | uniq -c
    384 Reason 3
    606 Reason 5

For reason 3, fd->flags has the FILE_INVALID flag set; for reason 5, fd->data_offset is zero.
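The numbering can be pictured as a chain of early-exit checks. In the sketch below, only reasons 3 and 5 come from the output above. The struct, the flag value, and the checks behind reasons 1, 2 and 4 are assumptions for illustration and may not match unshield_file_is_valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FILE_INVALID 0x8u /* assumed flag value, for illustration only */

/* Hypothetical, trimmed-down file descriptor. */
typedef struct {
    uint16_t flags;
    uint32_t name_offset;
    uint32_t data_offset;
} DemoFileDescriptor;

/* Return 0 when the entry looks like a real file, otherwise the number
 * of the first check that rejected it (counting from 1). */
static int demo_invalid_reason(int index, int file_count,
                               const DemoFileDescriptor *fd)
{
    if (index < 0 || index >= file_count)
        return 1; /* assumed: index out of range */
    if (fd == NULL)
        return 2; /* assumed: descriptor could not be read */
    if (fd->flags & DEMO_FILE_INVALID)
        return 3; /* from the thread: FILE_INVALID flag set */
    if (fd->name_offset == 0)
        return 4; /* assumed: entry has no name */
    if (fd->data_offset == 0)
        return 5; /* from the thread: entry has no data offset */
    return 0;
}

/* Usage: one entry per reason seen in the listing above. */
void demo_reason_example(void)
{
    DemoFileDescriptor flagged = { DEMO_FILE_INVALID, 0x10, 0x20 };
    DemoFileDescriptor no_data = { 0, 0x10, 0 };

    assert(demo_invalid_reason(0, 2, &flagged) == 3);
    assert(demo_invalid_reason(1, 2, &no_data) == 5);
}
```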

twogood commented 9 years ago

Got it, thanks! I'll see if I can get it working this weekend!

twogood commented 7 years ago

Looks like I didn't get it working that weekend... Due to personal time priorities/constraints I need a PR to fix this, if it's still an issue.

Demon000 commented 2 months ago

MAX_FILE_GROUP_COUNT seems to be a limitation of the CabDescriptor only, i.e. there's a maximum of 70 entry points for the file group offset list. I think that abort() call can be safely removed.
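If the abort() is dropped, one possible safeguard is to validate the count against the bytes actually available rather than against a fixed constant. The following is only a sketch of that idea, not a proposed patch. The function name is hypothetical, and it assumes the count is followed inline by that many 32-bit entries, which may not be how lib/component.c lays out the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 16-bit little-endian count at p and check
 * that count 32-bit entries fit between the count and end.
 * Returns the count, or -1 if the buffer is too short. */
static long demo_read_group_count(const uint8_t *p, const uint8_t *end)
{
    uint16_t count;

    assert(p != NULL && end != NULL && p <= end);

    if ((size_t)(end - p) < 2)
        return -1; /* not even room for the count itself */

    count = (uint16_t)(p[0] | p[1] << 8);
    p += 2;

    if ((size_t)(end - p) < (size_t)count * 4)
        return -1; /* entries would run past the end of the buffer */

    return count;
}
```

Under that assumption, a count of 151 is accepted as long as the buffer really holds 151 entries, and rejected when it is truncated.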

I think @rodarima and I might have been analyzing the same files, something related to Renault. :D