lief-project / LIEF

LIEF - Library to Instrument Executable Formats (C++, Python, Rust)
https://lief.re
Apache License 2.0

DEX parser returns classes that are not defined #302

Open packmad opened 5 years ago

packmad commented 5 years ago

Describe the bug: The classes field of the lief.DEX.File object returned by lief.DEX.parse contains not only the classes defined in the DEX file, but also classes that are merely referenced in parameters, fields, etc.

To Reproduce: Analyze the attached file bug.zip with the following code:

import lief

# dex_file is the path to the DEX file under analysis
dex = lief.DEX.parse(dex_file)
for c in dex.classes:
    print(c.fullname)

For example, this code prints the class Landroid/service/autofill/SaveRequest;, but that class is not defined in the DEX file (confirmed with dexdump and radare).

Expected behavior: There should be a way to distinguish between classes defined in the DEX file and classes without an implementation.

Environment:

romainthomas commented 5 years ago

Hello @packmad, you are right: the current version doesn't distinguish external classes from internal ones. One way to work around this issue is to check whether at least one of the class's methods has a code size > 0 (and the class is not abstract).

packmad commented 5 years ago

It looks like this code does what I wanted (I'm also interested in abstract classes):

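from typing import Set

def get_internal_classes(dex) -> Set[str]:
    """Return the names of classes that have at least one method with bytecode."""
    ret = set()
    for c in dex.classes:
        for m in c.methods:
            # A non-empty bytecode means the method is implemented in this DEX file
            if len(m.bytecode) > 0:
                ret.add(c.fullname)
                break  # one implemented method is enough
    return ret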
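
For anyone who also wants the complementary set, here is a minimal sketch of the same heuristic that returns both groups at once. FakeMethod, FakeClass, FakeDex, and the class name Lcom/example/Main; are hypothetical stand-ins so the snippet runs without LIEF. They mimic only the attributes used above (classes, methods, bytecode, fullname). With a real file, the object returned by lief.DEX.parse should be usable in their place.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


# Hypothetical stand-ins that mimic only the attributes used in this thread
# (classes, methods, bytecode, fullname); they are not LIEF types.
@dataclass
class FakeMethod:
    bytecode: List[int] = field(default_factory=list)


@dataclass
class FakeClass:
    fullname: str
    methods: List[FakeMethod] = field(default_factory=list)


@dataclass
class FakeDex:
    classes: List[FakeClass] = field(default_factory=list)


def split_classes(dex) -> Tuple[Set[str], Set[str]]:
    """Return (internal, external) class names using the bytecode heuristic."""
    internal: Set[str] = set()
    external: Set[str] = set()
    for c in dex.classes:
        # A class counts as internal if any of its methods carries bytecode.
        if any(len(m.bytecode) > 0 for m in c.methods):
            internal.add(c.fullname)
        else:
            external.add(c.fullname)
    return internal, external


# Demo data: one class with a method body, one class that is only referenced.
demo = FakeDex(classes=[
    FakeClass("Lcom/example/Main;", [FakeMethod([0x0E])]),  # placeholder byte
    FakeClass("Landroid/service/autofill/SaveRequest;", [FakeMethod()]),
])
internal, external = split_classes(demo)
```

One caveat with this heuristic: a class defined in the DEX file whose methods all lack bytecode (an interface, for instance) would end up in the external set.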