mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
https://appagent-official.github.io/
MIT License

Code Pauses at cv2.waitKey(0): Waiting for Keyboard Input #5

Open doyer3112 opened 9 months ago

doyer3112 commented 9 months ago

When my code reaches the line cv2.waitKey(0), how am I supposed to provide input? I have pressed various keys on my computer keyboard and also performed the corresponding actions on my phone, but the code doesn't proceed. What is the correct execution logic?

mnotgod96 commented 9 months ago

Switch to the window that shows the labeled smartphone screenshot, then press any key on your keyboard. The key press is only registered while that window has focus.

You can also change cv2.waitKey(0) to cv2.waitKey([time in milliseconds]) with a positive value, so the screenshot window stays up for that long and the code then continues without a key press.
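As a rough illustration of the two options above, here is a minimal sketch. The helper names `decode_key` and `show_and_wait` and the window title are made up for this example and are not part of AppAgent. It relies only on the documented behavior of `cv2.waitKey`: a delay of 0 or less blocks until a key is pressed, and a positive delay returns -1 if no key arrives in time.

```python
def decode_key(code: int) -> str:
    """Turn a cv2.waitKey return value into a readable label (hypothetical helper)."""
    if code < 0:
        # -1 means the timeout elapsed without any key press
        return "timeout"
    # Mask to the low byte, which holds the character code for ordinary keys
    return chr(code & 0xFF)


def show_and_wait(image, delay_ms: int = 0, title: str = "labeled screenshot") -> str:
    """Show `image`, then block until a key is pressed or `delay_ms` elapses.

    delay_ms <= 0 waits indefinitely, which is what cv2.waitKey(0) does.
    """
    import cv2  # imported here so decode_key can be used without OpenCV or a display

    cv2.imshow(title, image)
    # Key presses are only seen while an OpenCV window has keyboard focus
    code = cv2.waitKey(delay_ms)
    cv2.destroyAllWindows()
    return decode_key(code)
```

Calling `show_and_wait(img)` reproduces the blocking behavior from the question, while `show_and_wait(img, delay_ms=2000)` should close the window after about two seconds if no key is pressed.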

doyer3112 commented 9 months ago

Thanks, your suggestion worked. I just need to press any key on the computer keyboard while the screenshot window is in focus to move to the next step.