LaphaeL12304 / LaphaeL-aicmd

Enable ChatGPT/Gemini to execute commands on linux in multi-steps, allows using natural language to operate linux.
GNU Affero General Public License v3.0
50 stars 7 forks source link

[Problem] The method to judge if ai is ready #10

Open DataEraserC opened 5 months ago

DataEraserC commented 5 months ago

image https://github.com/LaphaeL12304/LaphaeL-aicmd/blob/8d8a501fc79fe4b4fc9b6932a1737a7a45fdb5d4/src/interact_AI.py#L158

maybe "OK" or "好的" would be better? or maybe we can using another method to judge?

LaphaeL12304 commented 5 months ago

I think maybe we should change the method of interacting with AI. I wrote this to interact once when program starts was because interacting with Gemini was done by using the .chat method in google.generativeai:

        # 初始化Gemini AI - Initialize Gemini AI
        genai.configure(api_key=self.api_key)
        self.genai_model = genai.GenerativeModel(self.model)

Which will automatically append the chat history, so the program must send the instruction prompt at first. But this may not be the best method, maybe we should change to something like this:

model = genai.GenerativeModel('gemini-pro')
messages = [
    {'role':'user',
     'parts': ["Briefly explain how a computer works to a young child."]}
]
response = model.generate_content(messages)
messages.append({'role':'model',
                 'parts':[response.text]})
messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})
response = model.generate_content(messages)
print(response.text)

This will allow me to directly set the instruction prompt.

In ChatGPT, the method of interacting is already similar to the second one.

By doing so, we can use a single sentence to judge the connection, e.g., "Please reply 'ready'"; or even remove the confirmation, but to run exception codes when not connection failed.


By the way, I'm now adding a feature that if some configs are empty,(such as API key or selected model), then the program will ask the user to enter such config in the CLI.

In addition, I'm considering if we can put AI interaction, command execution and main program into different threads, so that the user input can be detected while command is executing; and AI can monitor the execution every set period of time, to judge to stop execution or not, rather than using fixed timeout?