Runtime simulation
The ability to simulate running code in a sandbox, so that unknown classes and methods can be invoked without running potentially unsafe code (like deleting files). For primitives, strings, and other safe types, this would allow users to step through the execution and watch the changes applied to classes (fields) and the method stack.
Use Cases
Understanding the behavior of unknown logic without executing potentially malicious code
Fetching values from simple methods
In combination with #151, could be used to automatically deobfuscate strings and other obfuscator patterns
Implementation Outline:
VirtualObj
    T value
VirtualClass
    Map<Identifier, VirtualField> fields
    Map<Identifier, VirtualMethod> methods
VirtualMember
    identity()
        Abstract, implemented by children. Used as the lookup key for VirtualField and VirtualMethod.
VirtualField extends VirtualMember
    VirtualClass value
VirtualMethod extends VirtualMember
    Simulation getSimulation(VirtualObj... args)
        Generates a Simulation object, does not run immediately
Simulation
    List<VirtualInsn> instructions
    List<VirtualObj> vars
    Stack<VirtualObj> stack
VirtualInsn models behavior from AbstractInsnNode
    apply(Simulation)
        Applies modifications to vars/stack as needed
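A rough sketch of how the Simulation / VirtualInsn part of the outline might fit together. This is only an illustration under assumptions: VirtualObj is made generic to match "T value", a Deque stands in for the Stack, and the program counter, LoadInsn, AddInsn, and SimulationSketch are hypothetical names that are not part of the outline.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Wrapper for a value tracked by the simulation (generic form is an assumption).
final class VirtualObj<T> {
    final T value;
    VirtualObj(T value) { this.value = value; }
}

// One simulated instruction; it mutates the simulation's vars/stack.
interface VirtualInsn {
    void apply(Simulation sim);
}

// Holds the state from the outline: instructions, local variables, operand stack.
final class Simulation {
    final List<VirtualInsn> instructions = new ArrayList<>();
    final List<VirtualObj<?>> vars = new ArrayList<>();
    final Deque<VirtualObj<?>> stack = new ArrayDeque<>();
    private int pc = 0; // hypothetical program counter, not in the outline

    boolean hasNext() { return pc < instructions.size(); }

    // Executes exactly one instruction, so a UI could step through execution.
    void step() { instructions.get(pc++).apply(this); }
}

// Hypothetical instruction model: pushes a local variable onto the stack.
final class LoadInsn implements VirtualInsn {
    private final int index;
    LoadInsn(int index) { this.index = index; }
    public void apply(Simulation sim) { sim.stack.push(sim.vars.get(index)); }
}

// Hypothetical instruction model: pops two ints and pushes their sum.
final class AddInsn implements VirtualInsn {
    public void apply(Simulation sim) {
        int right = (Integer) sim.stack.pop().value;
        int left = (Integer) sim.stack.pop().value;
        sim.stack.push(new VirtualObj<>(left + right));
    }
}

final class SimulationSketch {
    // Simulates "a + b" on virtual state only; no real bytecode is executed.
    static int add(int a, int b) {
        Simulation sim = new Simulation();
        sim.vars.add(new VirtualObj<>(a));
        sim.vars.add(new VirtualObj<>(b));
        sim.instructions.add(new LoadInsn(0));
        sim.instructions.add(new LoadInsn(1));
        sim.instructions.add(new AddInsn());
        while (sim.hasNext()) sim.step();
        return (Integer) sim.stack.pop().value;
    }
}
```

Because step() runs a single instruction, the vars and stack can be inspected between steps, which is what the "watch / step through" use case would need.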
Current thoughts on how the simulation API will be handled: there will probably be a lot of VirtualClass implementations for common core Java classes. At some point, enough base classes should be implemented that classes referencing them can work through auto-generated logic. For example, once String is implemented, something like StringUtils should be able to work with an auto-generated implementation, since all of its outbound reference classes have a VirtualClass implementation.
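The eligibility rule described here could be expressed as a simple containment check. This is a minimal sketch under assumptions: classes are identified by plain name strings, and AutoGenCheck and canAutoGenerate are hypothetical names.

```java
import java.util.Set;

final class AutoGenCheck {
    // True when every class referenced by the candidate already has a
    // VirtualClass implementation, so its logic could be auto-generated.
    static boolean canAutoGenerate(Set<String> outboundRefs, Set<String> implemented) {
        return implemented.containsAll(outboundRefs);
    }
}
```

For instance, a StringUtils-like class whose only outbound reference is java/lang/String would qualify as soon as String has an implementation.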
With this lookup strategy, loading loops would have to be accounted for somehow. Perhaps a combination of lazy-loading and keeping a global cache for library classes (anything not residing in the primary Recaf input) would be worthwhile.
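One way the lazy-loading plus global cache idea might look, as a sketch only. Assumptions: classes are keyed by name, outbound references are known up front as a name-to-names map, and VirtualClassLoader, resolve, and builtCount are hypothetical. The VirtualClass here is a stripped-down stand-in for the one in the outline.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for VirtualClass: references are stored as names and resolved on demand.
final class VirtualClass {
    final String name;
    final List<String> outboundRefs;
    VirtualClass(String name, List<String> outboundRefs) {
        this.name = name;
        this.outboundRefs = outboundRefs;
    }
}

final class VirtualClassLoader {
    // Hypothetical global cache shared by all loaders, for library classes only.
    private static final Map<String, VirtualClass> LIBRARY_CACHE = new HashMap<>();

    private final Map<String, List<String>> classpath; // class name -> outbound reference names
    private final Set<String> primaryInput;            // names of classes in the primary input
    private final Map<String, VirtualClass> primaryCache = new HashMap<>();
    private int built = 0;

    VirtualClassLoader(Map<String, List<String>> classpath, Set<String> primaryInput) {
        this.classpath = classpath;
        this.primaryInput = primaryInput;
    }

    // Builds a class at most once per cache; its references are NOT followed here.
    VirtualClass load(String name) {
        Map<String, VirtualClass> cache = primaryInput.contains(name) ? primaryCache : LIBRARY_CACHE;
        VirtualClass vc = cache.get(name);
        if (vc == null) {
            vc = new VirtualClass(name, classpath.getOrDefault(name, List.of()));
            cache.put(name, vc);
            built++;
        }
        return vc;
    }

    // Follows a single outbound reference only when the simulation needs it.
    VirtualClass resolve(VirtualClass from, String refName) {
        if (!from.outboundRefs.contains(refName)) {
            throw new IllegalArgumentException(from.name + " does not reference " + refName);
        }
        return load(refName);
    }

    // Number of classes this loader actually had to build.
    int builtCount() { return built; }
}
```

Since references are only followed on demand and every lookup goes through a cache first, two classes that reference each other resolve to the already-built instances rather than loading forever.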
Also, to reduce the complexity of certain simulations, being able to assign dummy values to things like field getters and method invocations may be a useful feature.
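A possible shape for the dummy value idea, sketched under assumptions: members are keyed by an identity string (the owner.name + descriptor format is made up for this example), and DummyValues and InvokeSketch are hypothetical names. The real simulation of a call is represented by a Supplier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry of user-assigned dummy values, keyed by member identity.
final class DummyValues {
    private final Map<String, Object> values = new HashMap<>();

    // Identity format is an assumption for this sketch.
    static String identity(String owner, String name, String desc) {
        return owner + "." + name + desc;
    }

    void assign(String identity, Object value) { values.put(identity, value); }

    boolean has(String identity) { return values.containsKey(identity); }

    Object get(String identity) { return values.get(identity); }
}

final class InvokeSketch {
    // Returns the assigned dummy value if there is one; otherwise falls back
    // to simulating the member, which is never touched when a dummy exists.
    static Object invoke(DummyValues dummies, String identity, Supplier<Object> simulate) {
        return dummies.has(identity) ? dummies.get(identity) : simulate.get();
    }
}
```

Assigning a fixed value to something like System.currentTimeMillis() would let a simulation skip the call entirely, while members without a dummy still go through the normal path.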