CheckPointSW / Karta

Karta - source code assisted fast binary matching plugin for IDA
MIT License
thumbs_up_ELF crashing on ARM binary #25

Closed MrPeck closed 5 years ago

MrPeck commented 5 years ago

When I run thumbs_up_ELF on a 32-bit ARM binary, I get the following exception:

C:\Users\pedro.peck\Desktop\Karta\src\thumbs_up\thumbs_up_ELF.py: Expected 2D array, got 1D array instead: array=[]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Traceback (most recent call last):
  File "C:\Program Files\IDA 7.2\python\ida_idaapi.py", line 572, in IDAPython_ExecScript
    execfile(script, g)
  File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 186, in <module>
    main()
  File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 178, in main
    result = analysisStart(analyzer, code_segments, data_segments)
  File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 43, in analysisStart
    if not gatherIntel(analyzer, scs, sds):
  File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up\analyzer_utils.py", line 20, in gatherIntel
    if not analyzer.func_classifier.calibrateFunctionClassifier(scs):
  File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up\utils\function.py", line 217, in calibrateFunctionClassifier
    clf.fit(X_train, Y_train)
  File "C:\Python27\lib\site-packages\sklearn\ensemble\forest.py", line 250, in fit
    X = check_array(X, accept_sparse="csc", dtype=DTYPE)
  File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead: array=[]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Let me know if anything is unclear! :)

chkp-eyalit commented 5 years ago

Hi, sorry for the delay; I was out of office.

From the exception, it looks like you have a single sample (a single function to train on). Could you add some debug prints and check whether this is indeed the case? How many functions are sent to the classifier for calibration?
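One way to run the check asked for here is to print the size of the training set right before the `clf.fit(X_train, Y_train)` call shown in the traceback. The snippet below is only a sketch: `summarize_training_set` is a hypothetical helper, not part of Karta, and it assumes `X_train` is a plain list of per-function feature lists. Note that the message prints `array=[]`, so the list may turn out to be empty rather than to hold one sample.

```python
def summarize_training_set(X_train, Y_train):
    """Hypothetical debug helper: report how many samples reach the classifier.

    Assumes X_train is a list of feature lists (one per function) and
    Y_train is the matching list of labels.
    """
    n_samples = len(X_train)
    # With zero samples there is no row to measure, so report 0 features.
    n_features = len(X_train[0]) if n_samples else 0
    print("calibration set: %d samples, %d features, %d labels"
          % (n_samples, n_features, len(Y_train)))
    return n_samples, n_features


# Example: an empty set versus a set with two functions of three features each.
summarize_training_set([], [])
summarize_training_set([[1, 2, 3], [4, 5, 6]], [0, 1])
```

If the first number printed is 0 or 1, the classifier has nothing meaningful to train on, which would match the exception above.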

While I probably need to update the code to handle this case better, I doubt that any meaningful training could be done on a set consisting of a single sample...

chkp-eyalit commented 5 years ago

Added better error handling.

If the issue persists even with a large number of functions, please feel free to re-open the ticket.
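For readers hitting the same exception, a guard of the following kind avoids calling `fit` on a training set that is too small. This is a minimal sketch and not the actual change made in Karta: `fit_if_enough_samples`, `MIN_SAMPLES`, and the stand-in classifier are all hypothetical names introduced for illustration, and the threshold of 2 is an arbitrary assumption.

```python
MIN_SAMPLES = 2  # hypothetical threshold; pick whatever suits the binary


def fit_if_enough_samples(clf, X_train, Y_train, min_samples=MIN_SAMPLES):
    """Call clf.fit only when there are enough samples; return True on success."""
    if len(X_train) < min_samples:
        print("Not enough functions to calibrate the classifier: got %d, need %d"
              % (len(X_train), min_samples))
        return False
    clf.fit(X_train, Y_train)
    return True


class RecordingClassifier(object):
    """Stand-in for a real classifier; it only records the sample count."""

    def __init__(self):
        self.fitted_on = None

    def fit(self, X, Y):
        self.fitted_on = len(X)


clf = RecordingClassifier()
fit_if_enough_samples(clf, [], [])                       # skipped, fit is never called
fit_if_enough_samples(clf, [[1, 2], [3, 4]], [0, 1])     # fit is called on 2 samples
```

Returning a boolean lets the caller abort the analysis with a readable message instead of surfacing the `ValueError` from inside scikit-learn.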