Closed nneonneo closed 6 months ago
By the way, if backwards-compatibility will be an issue, it should be possible to restore most of the old APIs. However, the resulting design may be less clean.
The current breaking changes look good to me, at least. I haven't been responding to this PR because I was sick recently. I will do a thorough check in a week or so.
OK, at this point I'm fairly happy with the code. It's decently well-tested, the samples match the C code (and the output matches too), and the performance is good. It's ready to be reviewed.
Thanks for your work! I will review it over the next few days, as I finally feel a bit better. By the way, could you describe your use case?
I am planning to integrate Unicorn into some Ghidra-based analysis tools. As Ghidra is written entirely in Java, it is much more efficient to use Java bindings than trying to go through Jython-ctypes.
That's cool! At the moment I don't have much to say about this PR because I haven't written Java for a few months. I need to give it a try before reviewing further, but thanks again for your brilliant work!
Any news on this?
Rushing on a conference deadline ;(
Will have a look once getting it done.
All the breaking changes look good to me, as the existing Java binding is too old and buggy.
I have no big questions on this PR, and thanks for this brilliant work!
I will be traveling soon, so please expect me to be absent for another few days. I will get back ASAP. Thanks for your patience. ;)
No worries. I'll aim to get something working with Maven this week, and also look at integrating the (newly merged) reg2 API :)
OK, the bindings are migrated to use Maven. The last thing on my TODO list is to switch to the reg2 API, which should be pretty straightforward.
OK, that last commit (763d041) adds the reg2 API, which I think is the last issue that needed to be resolved. Should be good for a final review and hopefully merge :)
Some last things which probably need your help:
- `javah` seems outdated; could you replace it with `javac -h`? I'm not 100% sure on this, so I'd be glad if you could help.
- [...] `Makefile` totally so that we don't rely on `make`, which should enable it to compile on Windows. I see the exec plugin allows us to execute any command, so we can just compile the unicorn libraries in `pom.xml` like other bindings did. This actually relates to my point 1, I believe.

At this moment, I'm experimenting with 1 and 3 to build dynamic libraries within `pom.xml`, bundle them into the jar, and load them in the unicorn class. Could you have a look at 2?
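For readers unfamiliar with the "bundle it into the jar and load it" idea mentioned above, here is a minimal sketch of the usual technique: copy the native library out of the classpath into a temporary file, then hand that file to `System.load`. The class name `BundledNativeLoader` and any resource path passed to it are hypothetical; this is not the loader used by the bindings in this PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; not part of the Unicorn bindings.
final class BundledNativeLoader {
    // Copies a classpath resource (e.g. a shared library packed into the jar)
    // to a temporary file and returns the path of that file.
    static Path extract(String resource) {
        try (InputStream in = BundledNativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("not bundled in jar: " + resource);
            }
            Path tmp = Files.createTempFile("unicorn-native", ".tmp");
            tmp.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // System.load requires an absolute file-system path, hence the extraction step.
    static void load(String resource) {
        System.load(extract(resource).toAbsolutePath().toString());
    }
}
```

A caller would invoke `BundledNativeLoader.load(...)` once, for example from a static initializer, with whatever resource path the build actually places the library under.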
@wtdcode OK, it was easy to switch to `javac -h`, so I have done so. I also made a fix to `const_generator` to only generate new data if there are changes, which prevents `mvn` from having to rebuild everything every time.
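The "only generate new data if there are changes" fix can be pictured as a write-if-changed check: compare the freshly generated content with what is already on disk and skip the write when they match, so the file's modification time stays put and the build tool sees nothing to recompile. The sketch below shows that idea in Java; the class `WriteIfChanged` is hypothetical and is not the actual `const_generator` code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper illustrating the idea; not the real generator.
final class WriteIfChanged {
    // Writes content to target only if it differs from the bytes on disk.
    // Returns true when the file was (re)written, false when left untouched.
    static boolean write(Path target, String content) {
        byte[] fresh = content.getBytes(StandardCharsets.UTF_8);
        try {
            if (Files.isRegularFile(target)
                    && Arrays.equals(Files.readAllBytes(target), fresh)) {
                return false; // identical: keep the old file and its timestamp
            }
            Files.write(target, fresh);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```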
Tests are failing, but I'm not sure it's my fault? The new test `test_x86_0xff_lcall` fails due to an invalid instruction, which makes sense since the patch marks certain instructions as being invalid. Should the test itself be fixed?
Ah sorry you are correct, the test should be fixed. xd
I have mostly addressed the problem of building the binaries and now I'm investigating how to package everything together.
Meanwhile, I sent a request to Maven Central here:
@wtdcode it's been a while! Just checking: is there any plan to get this merged? Maybe we can merge it first and then worry about the build issues? It sounded like the build system itself was basically sorted out and we just wanted to get the package on Maven (which I fully support!).
Merry Christmas and a happy new year!
Sure, I think that's a good start.
Regarding the build system, the reason why I failed to get it to work is that I lost all my progress (including some other work related to other issues) when I migrated my data, sorry.
This is a complete rewrite of the Java bindings for Unicorn, focused on speed, correctness and feature parity.
The existing bindings have some significant shortcomings. An incomplete list:
- `reg_read` and `reg_write` are very slow for the common use case (i.e. reading long-sized registers or smaller), as they have to box and unbox `Long`s.
- `reg_read` returns stack garbage in the high bits because the read value isn't zeroed.
- The entire JVM crashes if you try to read a vector register.

This PR contributes a ground-up rewrite of the bindings. It brings the Java API up to par with Python feature-wise and substantially simplifies the hook implementation, enabling proper bounds-checked hooks.
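To make the first two shortcomings above concrete, here is a small self-contained model. It is only an illustration: `RegReadSketch` and all of its methods are made up for this sketch and are not the Unicorn API. It contrasts a boxed `Object` return with a primitive `long` return, and shows how reading a 32-bit value into an 8-byte slot that was not zeroed first leaves leftover bytes in the high half.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Illustrative model only: none of these names are part of the Unicorn bindings.
final class RegReadSketch {
    private final long[] regs = new long[4];

    void write(int id, long value) { regs[id] = value; }

    // Old-style signature: the value is boxed into a Long on every read,
    // and the caller has to cast and unbox it again.
    Object readBoxed(int id) { return Long.valueOf(regs[id]); }

    // New-style signature: a primitive long, with no boxing on the hot path.
    long read(int id) { return regs[id]; }

    // Models a native read of a 32-bit register into an 8-byte scratch slot:
    // only the low 4 bytes are written, so whatever the high 4 bytes held
    // beforehand ends up in the returned long.
    static long read32Into(byte[] scratch, int value) {
        ByteBuffer buf = ByteBuffer.wrap(scratch).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, value);
        return buf.getLong(0);
    }

    static long read32Unzeroed(int value) {
        byte[] scratch = new byte[8];
        Arrays.fill(scratch, (byte) 0xCC); // stand-in for leftover stack contents
        return read32Into(scratch, value);
    }

    static long read32Zeroed(int value) {
        return read32Into(new byte[8], value); // fresh Java arrays are zero-filled
    }
}
```

With the slot pre-filled with `0xCC`, `read32Unzeroed(0x12345678)` yields `0xCCCCCCCC12345678L`, whereas `read32Zeroed(0x12345678)` yields `0x12345678L`.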
The new implementation is much faster than the old one. As a point of comparison, the following code hook, executed 10,000,000 times on an x86 `loop` instruction, takes 9.9s with the old bindings, but just 1.6s in the new bindings - a 6x performance improvement. A similar hook takes 38s to run with the Python bindings.

The rewrite strives for compatibility with the previous API, but there are some breaking changes. It is possible to push closer to full backwards compatibility if required, at the cost of reintroducing some of the suboptimal designs. Here are the main points of breakage:
A lot of bugs are fixed with this implementation:
Several features are now enabled in the Java implementation:
Detailed list of backwards incompatible changes
- `EventMemHook`: added `type` parameter
- Removed `ReadHook` and `WriteHook` and replaced both with `MemHook`. `type` added to both hooks, `value` added to read hook.
- `Unicorn`:
  - `public long eng`
  - `query(int)` returns `long` now, instead of `int`
  - `reg_read(int)` returns `long` now, instead of `Object` (`Object` return replaced by `reg_read(int, Object)`)
  - `hook_add(ReadHook, long, long, Object)` removed
  - `hook_add(WriteHook, long, long, Object)` removed
  - `hook_add(MemHook, long, long, Object)` takes a new `int type` param
  - `hook_add` methods now return a `long` instead of `void`
  - `mem_map_ptr` changed to take `long, Buffer, int` instead of `long, long, int, byte[]`