QiuJYWX opened this issue 1 month ago
Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables, which differs from our previous setting (object files, not linked). The source code is compiled into executable binaries and disassembled into assembly instructions.

Thanks for the reply, will try again.
Hi @albertan017 ,
Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it is expected to achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.
Yes, deepseek-coder-v2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg 15% on HumanEval-Decompile). Our efforts are ongoing for llm4decompile-ref (which achieves much better results than direct decompilation). However, we are not working with the 236B version, as it is far beyond our budget.
Can you tell me whether deepseek-coder-v2 was evaluated across the different compiler optimization levels? This should be the shining point of this work. Thanks.
#!/bin/bash
CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0
Optimization O0: Compile Rate: 0.9268, Run Rate: 0.5488
Optimization O1: Compile Rate: 0.9268, Run Rate: 0.3598
Optimization O2: Compile Rate: 0.8902, Run Rate: 0.3537
Optimization O3: Compile Rate: 0.8902, Run Rate: 0.3171