albertan017 / LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models

cannot reproduce the results, is there anything wrong? #18

Open QiuJYWX opened 1 month ago

QiuJYWX commented 1 month ago

#!/bin/bash

CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0

Optimization O0: Compile Rate: 0.9268, Run Rate: 0.5488
Optimization O1: Compile Rate: 0.9268, Run Rate: 0.3598
Optimization O2: Compile Rate: 0.8902, Run Rate: 0.3537
Optimization O3: Compile Rate: 0.8902, Run Rate: 0.3171
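
For context on the two metrics: compile rate is the fraction of regenerated C sources that compile, and run rate is the fraction that also pass the original test assertions when executed. A rough single-sample sanity check, outside the repo's actual harness, could look like the following (decompiled_func.c and test_driver.c are hypothetical file names):

# compile rate ~ does the regenerated C compile at all
gcc -O0 decompiled_func.c test_driver.c -o check && echo "compiles"
# run rate ~ does it also pass the original assertions when run
./check && echo "assertions pass"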

albertan017 commented 1 month ago

Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables (linked binaries), which differs from our previous setting (object files, not linked).
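
For anyone comparing the two settings, the difference at the compiler level is roughly the following (func.c is a placeholder source name):

# previous setting: object file, compiled but not linked
gcc -O0 -c func.c -o func.o
# current setting: fully linked executable, as assumed by
# decompile-eval-executable-gcc-obj.json and the released models
gcc -O0 func.c -o func_exe

Pointing --testset_path at decompile-eval-executable-gcc-obj.json keeps the evaluation consistent with how the models were trained.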


QiuJYWX commented 1 month ago

Thx for the reply, will try again.

QiuJYWX commented 2 weeks ago

Hi @albertan017 ,

Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should achieve better decompilation results. If you have enough GPU budget, it would be great if you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.

albertan017 commented 2 weeks ago

Yes, deepseek-coder-v2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg. 15% on HumanEval-Decompile). Our efforts are ongoing on llm4decompile-ref, which achieves much better results than direct decompilation. We are not working with the 236B version, though, as it is far beyond our budget.

Cheliosoops commented 2 days ago

Hi @albertan017 ,

Can you tell me whether deepseek-coder-v2 was evaluated across the different compiler optimization levels? That should be the shining point of this work. Thx.