Hello, could you add a feature to this plugin that formats the pseudocode?

ver007 commented 7 months ago

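Those of us who mostly write Python and rarely read C++ find this kind of pseudocode uncomfortable to read. It would be great to have a feature that formats the code directly. I wrote a formatter, but I couldn't find a way to update the IDA pseudocode with its output. I'd be very grateful if you could add this.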

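import re


def clean_c_style(code):
    """
    Reformat pseudocode: double the leading indentation (2 spaces -> 4 spaces)
    and pull opening braces and else / else if up onto the preceding line.
    :param code: the code to format
    :return: the formatted code
    """
    index, codes, out, up_line = 0, code.split("\n"), "", ""
    while index < len(codes):
        line = codes[index]
        next_line = codes[index + 1] if index + 1 < len(codes) else ""

        # Double the leading indentation.
        if line.startswith("  "):
            sp = re.search(r'^\s+', line).group(0)
            line = sp.replace("  ", "    ") + line[len(sp):]

        line = line.rstrip()

        tmp = re.sub(r'\s+', '', next_line)
        # Parenthesized on purpose: the original `a or b and c` parsed as `a or (b and c)`.
        after_else = up_line == "else" or up_line.startswith("else if")
        if tmp == "{":
            # Pull the opening brace on the next line up onto this line.
            line += " {"
            index += 1
            up_line = ""
        elif after_else and line.strip() == "{":
            # A brace after a merged else / else if joins the previous output line.
            line = " {"
            out = out.rstrip()
            up_line = ""
        elif tmp == "else" or tmp.startswith("elseif"):
            # Pull else / else if up onto this line.
            up_line = next_line.strip()
            line += f" {up_line}"
            index += 1
        else:
            up_line = ""

        out += line + "\n"
        index += 1
    return out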

DearVa commented 7 months ago

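I haven't found an API for modifying the pseudocode either... I had planned to add automatic renaming of pseudocode variables to this plugin, but I couldn't find an API for that either.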

ver007 commented 7 months ago

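For modifying variables and code, you can refer to the links below. I've never written an IDA plugin, so I'm unfamiliar with many of the APIs. https://github.com/JusticeRage/Gepetto https://github.com/idapython/src/blob/6142005952c489e684ad3aa8870b55b73baac90a/examples/hexrays/vds3.py#L87
I saw the result in your screenshot (and was thrilled), so I tried adapting the plugin by replacing the ChatGPT call with my own API, but it failed. What I see now is that the BaseTool subclasses never receive the content that should be sent to GPT. I can't tell whether that's because I'm not familiar enough with IDA plugins or for some other reason, and I haven't been able to find the cause.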

This is the code for the BaseTool class I created:

import abc
import json
import logging
import traceback

import aiohttp

logger = logging.getLogger(__name__)


class BaseTool(abc.ABC):
    @abc.abstractmethod
    def _run(self, query: str, run_manager=None):
        pass

    async def get_gpt_desc(self, prompt):
        # default_prompt is a module-level template string defined elsewhere.
        logger.info("QUERY: {0}".format(prompt))
        headers = {'Content-Type': 'application/json'}
        query = default_prompt.format(binary_description=prompt)
        msg = json.dumps({"msg": query}, ensure_ascii=False).encode('utf-8')
        timeout = aiohttp.ClientTimeout(total=30)
        async with aiohttp.ClientSession(timeout=timeout) as session:
            try:
                async with session.post("http://10.X.X.X/api/chat", headers=headers, data=msg) as resp:
                    text = await resp.text()
                    # Return the plain-text GPT result from the async call
                    return json.loads(text)["text"]
            except Exception as e:
                logger.error("Error session.post : {0} {1} ".format(msg, e))
                raise  # re-raise with the original type and traceback

    async def run(self, prompt: str, run_manager=None):
        try:
            logger.info(f'Starting Copilot -> {self} -> run')
            logger.info("Query: " + prompt)
            # query = await self.get_gpt_desc(query)
            return self._run(prompt, run_manager)
        except Exception as e:
            traceback.print_exc()
            return f"Error: {str(e)}"

This is the Copilot class code:

import asyncio

import idaapi


class Copilot:
    # Call each tool in turn to query GPT
    async def agent_run(self, agents, prompt):
        for agent in agents:
            # Call the async method from within the async context;
            # each tool's result becomes the next tool's prompt
            prompt = await agent.run(prompt, None)

    def run(self):
        ea = idaapi.get_screen_ea()
        func_name = idaapi.get_func_name(ea)
        prompt = func_name or f"0x{ea:x}"

        agents = [
            self.__GetAddressInfoTool(),
            self.__GetDefinitionTool(),
            self.__GetPseudocodeTool(),
            self.__SetFunctionCommentTool(),
            self.__SetFunctionDefinitionTool(),
            self.__SetFunctionNameTool(),
            self.__GetIsMyWorkDoneTool(ea)
        ]

        # Run the tasks asynchronously
        asyncio.run(self.agent_run(agents, prompt))

    class __GetAddressInfoTool(BaseTool):
        ...  # body unchanged

    class __GetDefinitionTool(BaseTool):
        ...  # body unchanged

    class __GetPseudocodeTool(BaseTool):
        ...  # body unchanged

    class __SetFunctionCommentTool(BaseTool):
        ...  # body unchanged

    class __SetFunctionDefinitionTool(BaseTool):
        ...  # body unchanged

    class __SetFunctionNameTool(BaseTool):
        ...  # body unchanged

    class __GetIsMyWorkDoneTool(BaseTool):
        ...  # body unchanged

I'd appreciate any help clearing this up.

DearVa commented 7 months ago

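BaseTool should not contain code that calls GPT, because BaseTool is what GPT uses. I use the LangChain framework. It works by telling GPT, on each call, which BaseTools are currently available; GPT then picks a suitable BaseTool to use, and this repeats in a loop. This is called an Agent. If you want to change how GPT is called, you need to change the code in ida_copilot/copilot.py: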

class Copilot:
    def run(self, temperature=0.2, model='gpt-3.5-turbo-0613'):
        # ...

        agent = initialize_agent(
            agent_type=AgentType.OPENAI_MULTI_FUNCTIONS,
            llm=ChatOpenAI(temperature=temperature, model=model),  # <----- HERE
            tools=tools,
            # callback_manager=BaseCallbackManager(handlers=[
            #     CopilotPanelCallbackManager()]),
            verbose=True,
        )

        # ...

The official tutorial is here.

DearVa commented 7 months ago

I'll look into the code-modification feature when I have time. Thanks for the pointers.

ver007 commented 7 months ago

Oh, I see. BaseTool effectively opens a calling interface for GPT: whatever GPT needs, it fetches through that interface. Is that right?

DearVa commented 7 months ago

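Your understanding is correct. BaseTool specifies a tool's description. When GPT is called, that description is formatted into the prompt and passed to GPT, so GPT can understand what the tool is for and how to call it (for example, its parameter list), and it outputs its choice as JSON. LangChain then parses the JSON and calls the corresponding tool.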
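
The loop described above can be sketched without LangChain. This is only an illustration under stated assumptions, not LangChain's actual implementation: `Tool`, `run_agent`, `scripted_model`, the tool name `get_pseudocode`, and the `{"tool": ..., "args": ...}` / `{"final": ...}` JSON shapes are all hypothetical names made up for this example, and a scripted function stands in for the GPT call.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """Hypothetical stand-in for a BaseTool: a name, a description, a callable."""
    name: str
    description: str
    func: Callable[..., str]


def run_agent(model: Callable[[List[str]], str], tools: Dict[str, Tool],
              task: str, max_steps: int = 5) -> str:
    """Ask the model, run the tool it requests, feed the result back, repeat."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        reply = json.loads(model(history))    # the model must answer in JSON
        if "final" in reply:                  # the model reports it is done
            return reply["final"]
        tool = tools[reply["tool"]]           # look up the requested tool by name
        result = tool.func(**reply["args"])   # call it with the model's arguments
        history.append(f"{tool.name} -> {result}")
    return "stopped: step limit reached"


def scripted_model(history: List[str]) -> str:
    """Fake model: requests one tool call, then finishes. A real LLM goes here."""
    if len(history) == 1:
        return json.dumps({"tool": "get_pseudocode", "args": {"name": "sub_401000"}})
    return json.dumps({"final": "done after: " + history[-1]})


tools = {
    "get_pseudocode": Tool(
        name="get_pseudocode",
        description="Return the pseudocode of a function given its name.",
        func=lambda name: f"int {name}() {{ return 0; }}",
    )
}

answer = run_agent(scripted_model, tools, "describe sub_401000")
print(answer)
```

The tools never call the model themselves; the loop calls the model and the model's JSON decides which tool runs next. That is why putting the GPT request inside BaseTool inverts the design.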

ver007 commented 7 months ago

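Thanks for clearing that up! I'm wondering whether this could call a locally running Chinese-Llama-2-7b or Mistral-7B-Instruct-v0.2 instead. The results would be somewhat worse, but it saves money 😄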

DearVa commented 7 months ago

LangChain should support a variety of large models (including local ones); you can find tutorials on the official website.

ver007 commented 7 months ago

Adding a diagram of my understanding 😄

DearVa commented 7 months ago

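default_prompt also has to be constructed. Its template is roughly as follows:

[Base description] [Tool descriptions] [Output description]

Concretely, in this project it might be:

You are an experienced reverse engineer using IDA to analyze...

You can use the following tools:
Tool 1: name, description, parameters
Tool 2: name, description, parameters
Tool 3: name, description, parameters

When producing output, if you want to use a tool, output the tool name and parameters as JSON. Make sure your output can be parsed by json.loads.

This process is called prompt engineering.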
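
As a rough illustration of the template above, the three parts can be assembled with ordinary string formatting. The function name `build_prompt`, the two tool entries, and the exact wording are assumptions made for this sketch; LangChain builds its own prompt internally and its real text differs.

```python
import json


def build_prompt(base: str, tools: list, output_rule: str) -> str:
    """Join [base description], [tool descriptions], [output description]."""
    tool_lines = [
        f"- {t['name']}: {t['description']} Parameters: {json.dumps(t['parameters'])}"
        for t in tools
    ]
    tool_section = "You can use the following tools:\n" + "\n".join(tool_lines)
    return "\n\n".join([base, tool_section, output_rule])


# Hypothetical tool descriptions, for illustration only.
tools = [
    {"name": "get_pseudocode",
     "description": "Return the pseudocode of a function.",
     "parameters": {"name": "str"}},
    {"name": "set_function_name",
     "description": "Rename a function.",
     "parameters": {"old_name": "str", "new_name": "str"}},
]

prompt = build_prompt(
    base="You are an experienced reverse engineer analyzing a binary with IDA.",
    tools=tools,
    output_rule="To use a tool, output JSON with the tool name and its parameters. "
                "Make sure your output can be parsed by json.loads.",
)
print(prompt)
```

Keeping the tool section generated from data means adding a tool only requires a new entry in the list, not a rewrite of the prompt text.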

ver007 commented 7 months ago

Thanks for clearing that up 😄

ver007 commented 5 months ago

Is this plugin still being updated? I'd like to use it 😄

ver007 commented 1 month ago

https://blog.csdn.net/qq_46106285/article/details/137430941 The approach in this post can use a local ollama. Is there any way to integrate it?

DearVa commented 1 month ago

I feel the whole plugin needs a complete rewrite, with RAG and other features added.

ver007 commented 1 month ago

https://dev.to/mohsin_rashid_13537f11a91/rag-with-ollama-1049 Hope this helps 😄

Antelcat / ida_copilot

Hello, could you add a feature to this plugin that formats the pseudocode? #2
