octopus-platform / joern-tools

Python utilities for joern
GNU General Public License v3.0

joern-apiembedder and joern-knn #4

Closed: hahnakane closed this issue 10 years ago

hahnakane commented 10 years ago

When finding the nearest neighbor in the VLC tutorial, I execute the following example:

joern-list-funcs -p VLCEyeTVPluginInitialize | awk -F "\t" '{print $2}' | joern-knn

I get the following error from _svmlight_format.c, line 2505: "ValueError: Feature indices in SVMlight/LibSVM data file should be sorted and unique."

I've identified that when the SVM file is created by joern-apiembedder, the key-value pairs are in descending order by key; therefore, when the keys are parsed, the current index will be less than the previous index. Thus, the error is triggered.

For example, a line from the SVM file is:

3382 1:1.00000 0:1.00000 #3382

The current key, 0, is less than the previous key, 1, so the exception is thrown.
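To illustrate the ordering problem, here is a minimal sketch of a workaround that rewrites one line of the SVM file so its keys are in ascending order. The helper name `sort_svmlight_line` is hypothetical and is not part of joern-tools. It assumes every pair has the form `index:value` with an integer index (no `qid:` pairs), and it does not remove duplicate indices, which the parser also rejects.

```python
def sort_svmlight_line(line):
    """Return the line with its index:value pairs sorted by ascending index.

    Hypothetical helper for illustration only; not part of joern-tools.
    """
    # Split off a trailing comment such as "#3382", if there is one.
    body, sep, comment = line.partition("#")
    tokens = body.split()
    if not tokens:
        return line
    # The first token is the label; the rest are index:value pairs.
    label, pairs = tokens[0], tokens[1:]
    # Sort numerically by the index (assumes integer indices).
    pairs.sort(key=lambda pair: int(pair.split(":", 1)[0]))
    fixed = " ".join([label] + pairs)
    if sep:
        fixed += " #" + comment.strip()
    return fixed


print(sort_svmlight_line("3382 1:1.00000 0:1.00000 #3382"))
# prints: 3382 0:1.00000 1:1.00000 #3382
```

Applying this to every line of the file before passing it to joern-knn should avoid the "sorted" half of the error, under the assumptions above.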

I get this error regardless of the source code repository I'm analyzing.

Any thoughts on how to correct this, so that I can use joern-apiembedder and joern-knn together?

fabsx00 commented 10 years ago

Hm... I can't reproduce this here: when I use the embedder, the dimensions are sorted correctly. Does this commit fix the problem for you? https://github.com/fabsx00/joern-tools/commit/c49d7436cfacba09b44c76d57147e2a0d094aa80

hahnakane commented 10 years ago

Awesome! It works with this fix.

I appreciate the quick response!