anestisb / vdexExtractor

Tool to decompile & extract Android Dex bytecode from Vdex files
Apache License 2.0

Vastly different file sizes 0.5.0->0.5.1 #27

Closed IgorEisberg closed 6 years ago

IgorEisberg commented 6 years ago

It doesn't look right, as classes3 is always the smallest of the three. Perhaps there's some data intersection?

Vdex file: http://www.mediafire.com/file/pn2747p8aaoq7lv/boot-framework.vdex/file

anestisb commented 6 years ago

This is expected with the current status of the CompactDex exports. The 0.5.0 release did not include all the necessary data sections, so I've adjusted the file exporter logic to include the required regions from the shared data section of the matching Vdex container.

I missed this bug because I was processing the CompactDex files from within the mmapped Vdex file, so the shared data section was already loaded with all the data sections. However, when I was experimenting yesterday with offline processing of the Cdex exports using a dexlayout binary I'm working on, I noticed the missing sections.

The explanation behind this cumulative size growth is related to how the deduplication logic works. The first Dex file in a Vdex container has no deduplicated items, since it holds the first occurrence of all data structs (Strings, CodeItems, etc.). Therefore its data size is the same as in the original Dex. The 2nd Dex file, however, has both new data items (its owned data section) and deduplicated items that live in the owned data section of the 1st Dex file. Therefore, to not miss any data structs, we need to copy the owned data sections of both Dex files. For the 3rd Dex file, we copy the previous 2 owned data sections plus its own, and so on.
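To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not the tool's exporter code: the item names and byte sizes are made up, and `export_sizes` is a hypothetical helper. Each Dex file "owns" only the items it introduces first, yet its exported data must span every owned section up to and including its own.

```python
def export_sizes(dex_items):
    """Toy model of CompactDex deduplication inside one Vdex container.

    dex_items: one dict per Dex file, mapping an item name to its size in bytes.
    Returns (owned_sizes, exported_sizes), one entry per Dex file.
    """
    seen = set()
    owned_sizes = []
    for items in dex_items:
        owned = 0
        for name, size in items.items():
            if name not in seen:  # first occurrence: this Dex file owns the item
                seen.add(name)
                owned += size
        owned_sizes.append(owned)

    # An exported file needs all previous owned sections plus its own,
    # so the exported data size is a running total.
    exported_sizes = []
    running = 0
    for owned in owned_sizes:
        running += owned
        exported_sizes.append(running)
    return owned_sizes, exported_sizes


# Hypothetical items and sizes, for illustration only.
dex_files = [
    {"Ljava/lang/String;": 40, "codeA": 100},
    {"Ljava/lang/String;": 40, "codeB": 60},
    {"codeA": 100, "codeC": 20},
]
owned, exported = export_sizes(dex_files)
print(owned)     # [140, 60, 20]
print(exported)  # [140, 200, 220]
```

In this toy case the last Dex file owns the least data but ends up with the largest exported data section, which mirrors the sizes reported in this issue.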

I've improved the debugging logic in recent commits, so you can easily verify this by observing the sizes of the data sections of each CompactDex file. For the Google Photos app, for example, the data size of the 1st Cdex is 0x602a48, the 2nd 0xaca1a8 and the 3rd 0xcacbc8 (the entire shared data container).
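As a quick sanity check on those numbers, the following Python sketch takes the `ownedDataBegin`, `ownedDataEnd` and `dataSize` values printed in the log of this comment and confirms that the owned sections are contiguous, that each `dataSize` equals its `ownedDataEnd`, and that the last one matches the `dex shared data size`. The `check_headers` helper is hypothetical and only illustrates the arithmetic. It is not part of vdexExtractor.

```python
# (ownedDataBegin, ownedDataEnd, dataSize) copied from the three Dex headers in the log.
HEADERS = [
    (0x8, 0x602a48, 0x602a48),
    (0x602a48, 0xaca1a8, 0xaca1a8),
    (0xaca1a8, 0xcacbc8, 0xcacbc8),
]
SHARED_DATA_SIZE = 0xcacbc8  # "dex shared data size" from the Vdex header


def check_headers(headers, shared_size):
    """Return the owned section size of each Dex file, raising on any inconsistency."""
    owned_sizes = []
    prev_end = None
    for begin, end, data_size in headers:
        if data_size != end:
            raise ValueError("dataSize does not match ownedDataEnd")
        if prev_end is not None and begin != prev_end:
            raise ValueError("owned data sections are not contiguous")
        owned_sizes.append(end - begin)
        prev_end = end
    if prev_end != shared_size:
        raise ValueError("last ownedDataEnd does not match the shared data size")
    return owned_sizes


owned_sizes = check_headers(HEADERS, SHARED_DATA_SIZE)
print(owned_sizes)  # [6302272, 5011296, 1976864]
```

So the 3rd Dex file owns only about 1.9 MB of data, even though its exported data section covers the full 13290440 bytes.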

Hopefully, the dexlayout IR will remove the dead items when converting the IR back to StandardDex. Otherwise, ROM developers will have a huge perf/size impact.

All the exported Cdex files can now be processed with the dexdump2 utility as compiled from AOSP sources (not the SDK one). I'm now more confident that the ART file parsers are happy with the decompiled Cdex files that the tool exports, so I can continue with the dexlayout extensions.

vdexExtractor -i ~/Desktop/vdex_019/Photos.vdex -o . -f -v 4
[INFO] Processing 1 file(s) from /Users/anestisb/Desktop/vdex_019/Photos.vdex
[DEBUG] [5595] 2018/09/01 11:36:08 (vdexExtractor.c:194 main) Processing '/Users/anestisb/Desktop/vdex_019/Photos.vdex'
[DEBUG] [5595] 2018/09/01 11:36:08 (vdex_api.c:46 vdexApi_initEnv) Initializing environment for Vdex version '019'
------ Vdex Header Info -------
magic header                  : vdex
verifier dependencies version : 019
dex section version           : 002
number of dex files           : 3 (3)
verifier dependencies size    : 1e6a8 (124584)
verifier dependencies offset  : 106df08 (17227528)
quickening info size          : 15fa90 (1440400)
quickening info offset        : 108c5b0 (17352112)
dex section header offset     : 20 (32)
dex size                      : 3c1314 (3937044)
dex shared data size          : cacbc8 (13290440)
dex files info                :
  [0] location checksum : 5a6fe63 (94830179)
  [1] location checksum : 9121f24f (2434921039)
  [2] location checksum : 91208c95 (2434829461)
---- EOF Vdex Header Info ----
[DEBUG] [5595] 2018/09/01 11:36:08 (vdex/vdex_019.c:196 vdex_019_GetNextDexFileData) Processing first Dex file at offset:0x30
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:600 dex_dumpHeaderInfo) ------ Dex Header Info ------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:604 dex_dumpHeaderInfo) magic        : cdex-001
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:607 dex_dumpHeaderInfo) checksum     : dbd6c884 (3688286340)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:608 dex_dumpHeaderInfo) signature    : fa53df460880c2392b5f942e5da88d33e683d900
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:610 dex_dumpHeaderInfo) fileSize     : 19c840 (1689664)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:612 dex_dumpHeaderInfo) headerSize   : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:614 dex_dumpHeaderInfo) endianTag    : 12345678 (305419896)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:616 dex_dumpHeaderInfo) linkSize     : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:618 dex_dumpHeaderInfo) linkOff      : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:620 dex_dumpHeaderInfo) mapOff       : 5d9a28 (6134312)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:622 dex_dumpHeaderInfo) stringIdsSize: 9c11 (39953)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:624 dex_dumpHeaderInfo) stringIdsOff : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:626 dex_dumpHeaderInfo) typeIdsSize  : 4597 (17815)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:628 dex_dumpHeaderInfo) typeIdsOff   : 270cc (159948)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:630 dex_dumpHeaderInfo) protoIdsSize : 3428 (13352)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:632 dex_dumpHeaderInfo) protoIdsOff  : 38728 (231208)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:634 dex_dumpHeaderInfo) fieldIdsSize : b19c (45468)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:636 dex_dumpHeaderInfo) fieldIdsOff  : 5f908 (391432)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:638 dex_dumpHeaderInfo) methodIdsSize: fff7 (65527)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:640 dex_dumpHeaderInfo) methodIdsOff : b85e8 (755176)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:642 dex_dumpHeaderInfo) classDefsSize: 3215 (12821)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:644 dex_dumpHeaderInfo) classDefsOff : 1385a0 (1279392)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:646 dex_dumpHeaderInfo) dataSize     : 602a48 (6302280)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:648 dex_dumpHeaderInfo) dataOff      : 3c1310 (3937040)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:652 dex_dumpHeaderInfo) featureFlags                : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:654 dex_dumpHeaderInfo) debuginfoOffsetsPos         : 5d9af8 (6134520)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:656 dex_dumpHeaderInfo) debugInfoOffsetsTableOffset : 24f4c (151372)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:658 dex_dumpHeaderInfo) debugInfoBase               : 42c2cc (4375244)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:660 dex_dumpHeaderInfo) ownedDataBegin              : 8 (8)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:662 dex_dumpHeaderInfo) ownedDataEnd                : 602a48 (6302280)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:665 dex_dumpHeaderInfo) -----------------------------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:375 dex_isValidCDex) CompactDex version '001' detected
[DEBUG] [5595] 2018/09/01 11:36:08 (vdex/vdex_019.c:225 vdex_019_GetNextDexFileData) Processing Dex file at offset:0x19c870
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:600 dex_dumpHeaderInfo) ------ Dex Header Info ------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:604 dex_dumpHeaderInfo) magic        : cdex-001
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:607 dex_dumpHeaderInfo) checksum     : 6fda48c0 (1876576448)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:608 dex_dumpHeaderInfo) signature    : 8cb6b33870e3456d326e3b1e5f1fcdc773fd6dae
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:610 dex_dumpHeaderInfo) fileSize     : 195000 (1658880)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:612 dex_dumpHeaderInfo) headerSize   : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:614 dex_dumpHeaderInfo) endianTag    : 12345678 (305419896)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:616 dex_dumpHeaderInfo) linkSize     : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:618 dex_dumpHeaderInfo) linkOff      : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:620 dex_dumpHeaderInfo) mapOff       : a9e174 (11133300)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:622 dex_dumpHeaderInfo) stringIdsSize: 90a1 (37025)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:624 dex_dumpHeaderInfo) stringIdsOff : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:626 dex_dumpHeaderInfo) typeIdsSize  : 4fde (20446)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:628 dex_dumpHeaderInfo) typeIdsOff   : 2430c (148236)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:630 dex_dumpHeaderInfo) protoIdsSize : 2dd9 (11737)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:632 dex_dumpHeaderInfo) protoIdsOff  : 38284 (230020)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:634 dex_dumpHeaderInfo) fieldIdsSize : a4ee (42222)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:636 dex_dumpHeaderInfo) fieldIdsOff  : 5a8b0 (370864)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:638 dex_dumpHeaderInfo) methodIdsSize: fff4 (65524)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:640 dex_dumpHeaderInfo) methodIdsOff : ad020 (708640)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:642 dex_dumpHeaderInfo) classDefsSize: 3402 (13314)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:644 dex_dumpHeaderInfo) classDefsOff : 12cfc0 (1232832)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:646 dex_dumpHeaderInfo) dataSize     : aca1a8 (11313576)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:648 dex_dumpHeaderInfo) dataOff      : 224acc (2247372)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:652 dex_dumpHeaderInfo) featureFlags                : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:654 dex_dumpHeaderInfo) debuginfoOffsetsPos         : a9e244 (11133508)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:656 dex_dumpHeaderInfo) debugInfoOffsetsTableOffset : 27f60 (163680)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:658 dex_dumpHeaderInfo) debugInfoBase               : 42c2cc (4375244)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:660 dex_dumpHeaderInfo) ownedDataBegin              : 602a48 (6302280)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:662 dex_dumpHeaderInfo) ownedDataEnd                : aca1a8 (11313576)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:665 dex_dumpHeaderInfo) -----------------------------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:375 dex_isValidCDex) CompactDex version '001' detected
[DEBUG] [5595] 2018/09/01 11:36:08 (vdex/vdex_019.c:223 vdex_019_GetNextDexFileData) Processing last Dex file at offset:0x331874
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:600 dex_dumpHeaderInfo) ------ Dex Header Info ------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:604 dex_dumpHeaderInfo) magic        : cdex-001
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:607 dex_dumpHeaderInfo) checksum     : bf3670e4 (3208016100)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:608 dex_dumpHeaderInfo) signature    : 6ddd4131cdccbc7d5674e2552dd2360740699d73
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:610 dex_dumpHeaderInfo) fileSize     : 8fac8 (588488)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:612 dex_dumpHeaderInfo) headerSize   : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:614 dex_dumpHeaderInfo) endianTag    : 12345678 (305419896)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:616 dex_dumpHeaderInfo) linkSize     : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:618 dex_dumpHeaderInfo) linkOff      : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:620 dex_dumpHeaderInfo) mapOff       : c9e088 (13230216)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:622 dex_dumpHeaderInfo) stringIdsSize: 3b7e (15230)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:624 dex_dumpHeaderInfo) stringIdsOff : 88 (136)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:626 dex_dumpHeaderInfo) typeIdsSize  : 1a75 (6773)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:628 dex_dumpHeaderInfo) typeIdsOff   : ee80 (61056)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:630 dex_dumpHeaderInfo) protoIdsSize : 1578 (5496)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:632 dex_dumpHeaderInfo) protoIdsOff  : 15854 (88148)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:634 dex_dumpHeaderInfo) fieldIdsSize : 386c (14444)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:636 dex_dumpHeaderInfo) fieldIdsOff  : 259f4 (154100)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:638 dex_dumpHeaderInfo) methodIdsSize: 59f6 (23030)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:640 dex_dumpHeaderInfo) methodIdsOff : 41d54 (269652)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:642 dex_dumpHeaderInfo) classDefsSize: 106e (4206)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:644 dex_dumpHeaderInfo) classDefsOff : 6ed04 (453892)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:646 dex_dumpHeaderInfo) dataSize     : cacbc8 (13290440)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:648 dex_dumpHeaderInfo) dataOff      : 8fac8 (588488)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:652 dex_dumpHeaderInfo) featureFlags                : 0 (0)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:654 dex_dumpHeaderInfo) debuginfoOffsetsPos         : c9e158 (13230424)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:656 dex_dumpHeaderInfo) debugInfoOffsetsTableOffset : d3ec (54252)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:658 dex_dumpHeaderInfo) debugInfoBase               : 42c2d5 (4375253)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:660 dex_dumpHeaderInfo) ownedDataBegin              : aca1a8 (11313576)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:662 dex_dumpHeaderInfo) ownedDataEnd                : cacbc8 (13290440)
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:665 dex_dumpHeaderInfo) -----------------------------
[DEBUG] [5595] 2018/09/01 11:36:08 (dex.c:375 dex_isValidCDex) CompactDex version '001' detected
[DEBUG] [5595] 2018/09/01 11:36:08 (vdex/vdex_019.c:277 vdex_019_process) Took 349 ms to process Vdex file
[INFO] 1 out of 1 Vdex files have been processed
[INFO] 3 Dex files have been extracted in total
[INFO] Extracted Dex files are available in '.'

IgorEisberg commented 6 years ago

Thanks for the detailed explanation, now it makes more sense.

anestisb commented 6 years ago

As described in https://github.com/anestisb/vdexExtractor/issues/23#issuecomment-417914815 the recent changes are working with the libdexlayout IR. So closing this issue, since there is nothing to track.