Genymobile / scrcpy

Display and control your Android device
Apache License 2.0

Programmatically Send Keypress Through OTG #4134

Open whinee opened 1 year ago

whinee commented 1 year ago

Good day! Firstly, I would like to apologize for this lengthy feature request.

Is your feature request related to a problem? Please describe.

Hello! I got myself a "new" phone, which is a hand-me-down from my mom. I have been experimenting with it for about two days, trying to debloat the whole thing with as little friction as possible, meaning I don't need to root the device and the like. And so far, it works! Except that my script actually removes some vital components of the Android system, so I need to be able to reformat the phone to revert it to its initial state, and to remove the part of the script that removes the important component. But doing that is tedious! First I have to reformat the phone, then go through the phone's initial setup, which basically means signing in to Google and jumping through hoops of EULAs and the shitty "install these apps" prompts.

Now, I have been using this program to control my phone without using my sweaty hands. For the past 3 years, it has worked flawlessly! And lately, I discovered OTG mode, which is super useful for devices that do not have USB debugging enabled and whose screens I can't use because they are broken. It has actually saved me multiple times now!

And now we're back to the present day and the predicament at hand. I have combed the internet for answers on "how to use my linux machine as a keyboard for a device" or some variant of the question ("linux emulate usb slave" and the like). As a dumb person, I thought the answer would be out there, as I am clearly not the first person to think of this. But as you may have guessed, I'm just a lowly Python programmer who happens to really like using Arch Linux as a form of sadomasochism.

Yes, I've found a solution, but I do not know how to implement it. It's like looking for an apple pie recipe and instead being pointed to the lengthy autobiography of a wheat farmer that takes 176 pages to tell you how wheat is turned into flour. But that's not what I want!

And thus, we have come to this. Can I request a feature where I can programmatically send keypresses through OTG?

Describe the solution you'd like

Basically, I want an API to which I can send either key codes or ASCII characters and have them sent to the device through OTG. Another option would be a CLI command that takes a key code or ASCII character and does the same thing.
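For readers wondering what "send an ASCII character" means at the HID level, here is a minimal sketch of the translation step only. It is not scrcpy code. It assumes a US keyboard layout and the standard 8-byte boot-keyboard report (modifier byte, reserved byte, up to six key usage IDs); the function names `char_to_key`, `key_reports` and `text_reports` are made up for illustration.

```python
"""Sketch: map ASCII characters to 8-byte USB HID keyboard reports (US layout assumed)."""

MOD_LEFT_SHIFT = 0x02  # bit 1 of the modifier byte

# Usage IDs from the HID Keyboard/Keypad usage page.
_PLAIN = {"\n": 0x28, "\t": 0x2B, " ": 0x2C, "-": 0x2D, "=": 0x2E, "[": 0x2F,
          "]": 0x30, "\\": 0x31, ";": 0x33, "'": 0x34, "`": 0x35, ",": 0x36,
          ".": 0x37, "/": 0x38}
_SHIFTED = {"!": 0x1E, "@": 0x1F, "#": 0x20, "$": 0x21, "%": 0x22, "^": 0x23,
            "&": 0x24, "*": 0x25, "(": 0x26, ")": 0x27, "_": 0x2D, "+": 0x2E,
            "{": 0x2F, "}": 0x30, "|": 0x31, ":": 0x33, '"': 0x34, "~": 0x35,
            "<": 0x36, ">": 0x37, "?": 0x38}


def char_to_key(ch):
    """Return (modifier, usage_id) for a single ASCII character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    if "a" <= ch <= "z":
        return 0, 0x04 + ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return MOD_LEFT_SHIFT, 0x04 + ord(ch) - ord("A")
    if "1" <= ch <= "9":
        return 0, 0x1E + ord(ch) - ord("1")
    if ch == "0":
        return 0, 0x27
    if ch in _PLAIN:
        return 0, _PLAIN[ch]
    if ch in _SHIFTED:
        return MOD_LEFT_SHIFT, _SHIFTED[ch]
    raise ValueError(f"no mapping for {ch!r}")


def key_reports(ch):
    """Return the (press, release) report pair for one character."""
    modifier, usage = char_to_key(ch)
    press = bytes([modifier, 0, usage, 0, 0, 0, 0, 0])
    release = bytes(8)  # all zeros: no modifier, no keys held
    return press, release


def text_reports(text):
    """Flatten a string into the sequence of reports that would type it."""
    reports = []
    for ch in text:
        reports.extend(key_reports(ch))
    return reports
```

Something like `text_reports("Hi\n")` would give six reports (press and release for each character). How those reports actually reach the phone is a separate problem, which is what the rest of this thread is about.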

Describe alternatives you've considered

I have thought of using a Raspberry Pi Zero or Raspberry Pi Pico, but I steered away from that because of the complexity it would add to my already complicated workflow. But now it seems that either I have to do it that way or have this feature request implemented. Nay, I have one more option: to learn C. But this was supposed to be a quick and dirty 7-day project. Alas, as my future self would find out, it would not be!

nnnpa31 commented 1 year ago

It looks like you need development advice on OTG mode, not feature requests. For reference, have you seen #279 yet?

nnnpa31 commented 1 year ago

Actually, I have already done all the work related to OTG mode using both C and Go languages separately (about a few days ago). However, when I tried to port it to Python, I hit a roadblock with the "flexible" API of pyusb. Eventually, I gave up and I'm sorry I couldn't help you with that. But I can still provide some development tips for you (not starting with planting wheat), at a later time :)

whinee commented 1 year ago

@nnnpa31

> It looks like you need development advice on OTG mode, not feature requests.

I think I actually do, yeah! But, at the same time, I still want to see this feature implemented on scrcpy.

> For reference, have you seen https://github.com/Genymobile/scrcpy/issues/279 yet?

Ooh, not yet. Let me skim through it sometime.

> Actually, I have already done all the work related to OTG mode using both C and Go languages separately (about a few days ago).

Ooh, nice! Can you publish that code, or is it for personal use only?

> However, when I tried to port it to Python, I hit a roadblock with the "flexible" API of pyusb.

Okay, so, for context, I've actually seen rom1v/aoa-hid-bug. I asked ChatGPT to explain stuff to me, and I modified the code according to my needs. And it actually worked! It was able to send a single character to my phone. But when I tried it a second time, it spat out a pipe error, which I think comes from improperly closing the device or whatevs (you have to remember that I still do not know anything about this, alright). But as a PoC for what I am trying to do, it works!

I asked ChatGPT to fix the error, and it didn't help. Either I was doing something wrong, or something was definitely wrong, and in this case the former is the more probable cause. In any case, I sent my ramblings to a Discord server, and an old friend of mine said to check out Python HID libraries, one of which is pyusb. And indeed, skimming it gave me an idea of how flexible it is. I also do not like how there is no one right way to do stuff.

So yeah, understandable.

> Eventually, I gave up and I'm sorry I couldn't help you with that.

Oh, that's fine. Even nudging me in the right direction helped me!

> But I can still provide some development tips for you (not starting with planting wheat), at a later time :)

Ooh, that is much appreciated. Thank you! In any case, can I get your contact details? Here are my contacts, where we could discuss this further:

nnnpa31 commented 1 year ago

https://github.com/Tryanks/go-aoahid This is my work on programmable OTG control :)
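For Python readers, a rough sketch of the kind of control-transfer sequence involved, based on the HID requests in the Android Open Accessory 2.0 protocol (register, set report descriptor, send event, unregister). This is not a port of go-aoahid or of scrcpy's OTG code. It assumes `dev` is any object with a pyusb-style `ctrl_transfer(bmRequestType, bRequest, wValue, wIndex, data_or_wLength)` method; the helper name `aoa_keyboard` and the HID id `1` are arbitrary choices for illustration.

```python
"""Sketch: AOA 2.0 HID keyboard session over a pyusb-style device object."""
from contextlib import contextmanager

HOST_TO_DEVICE_VENDOR = 0x40  # bmRequestType: OUT | vendor | device

# AOA 2.0 HID request codes
AOA_REGISTER_HID = 54
AOA_UNREGISTER_HID = 55
AOA_SET_HID_REPORT_DESC = 56
AOA_SEND_HID_EVENT = 57

# Plain boot-keyboard report descriptor (no LED output report).
KEYBOARD_REPORT_DESC = bytes([
    0x05, 0x01,  # Usage Page (Generic Desktop)
    0x09, 0x06,  # Usage (Keyboard)
    0xA1, 0x01,  # Collection (Application)
    0x05, 0x07,  #   Usage Page (Keyboard/Keypad)
    0x19, 0xE0,  #   Usage Minimum (Left Control)
    0x29, 0xE7,  #   Usage Maximum (Right GUI)
    0x15, 0x00,  #   Logical Minimum (0)
    0x25, 0x01,  #   Logical Maximum (1)
    0x75, 0x01,  #   Report Size (1)
    0x95, 0x08,  #   Report Count (8)
    0x81, 0x02,  #   Input (Data, Variable, Absolute): modifier bits
    0x95, 0x01,  #   Report Count (1)
    0x75, 0x08,  #   Report Size (8)
    0x81, 0x01,  #   Input (Constant): reserved byte
    0x95, 0x06,  #   Report Count (6)
    0x75, 0x08,  #   Report Size (8)
    0x15, 0x00,  #   Logical Minimum (0)
    0x25, 0x65,  #   Logical Maximum (101)
    0x05, 0x07,  #   Usage Page (Keyboard/Keypad)
    0x19, 0x00,  #   Usage Minimum (0)
    0x29, 0x65,  #   Usage Maximum (101)
    0x81, 0x00,  #   Input (Data, Array): up to six pressed keys
    0xC0,        # End Collection
])


@contextmanager
def aoa_keyboard(dev, hid_id=1):
    """Register a HID keyboard, yield a send(report) function, always unregister."""
    # wValue = our chosen HID id, wIndex = total descriptor length
    dev.ctrl_transfer(HOST_TO_DEVICE_VENDOR, AOA_REGISTER_HID,
                      hid_id, len(KEYBOARD_REPORT_DESC))
    try:
        # wIndex = offset of this chunk within the descriptor (sent in one piece here)
        dev.ctrl_transfer(HOST_TO_DEVICE_VENDOR, AOA_SET_HID_REPORT_DESC,
                          hid_id, 0, KEYBOARD_REPORT_DESC)

        def send(report):
            dev.ctrl_transfer(HOST_TO_DEVICE_VENDOR, AOA_SEND_HID_EVENT,
                              hid_id, 0, report)

        yield send
    finally:
        dev.ctrl_transfer(HOST_TO_DEVICE_VENDOR, AOA_UNREGISTER_HID, hid_id, 0)
```

With pyusb, `dev` would be whatever `usb.core.find(...)` returns for the phone, and typing "a" would look like `with aoa_keyboard(dev) as send: send(bytes([0, 0, 0x04, 0, 0, 0, 0, 0])); send(bytes(8))`. Unregistering in `finally` means a crashed run does not leave the HID id registered. Whether that has anything to do with the pipe error mentioned earlier in this thread is only a guess.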

filipef101 commented 1 year ago

@nnnpa31 since you are familiar with it, do you think it is possible to OTG control an iPhone? https://github.com/Genymobile/scrcpy/issues/4341 https://github.com/Genymobile/scrcpy/issues/279#issuecomment-1278055308

Tryanks commented 1 year ago

> @nnnpa31 since you are familiar with it, do you think it is possible to OTG control an iPhone? #4341 #279 (comment)

To clarify, this topic is not related to the scrcpy project, so I'll just share some pointers based on what I know. I used a Lightning hub to connect to an iOS device, and a CH9329 chip on the computer side to emulate a real keyboard and mouse via serial. (This is not perfect, because it limits us to the mouse's "relative coordinates", so we can't tell where the mouse pointer is on the screen.) Another possible solution for iOS is XCTest: use go-ios to do whatever you want with the device; use WebDriverAgent to send "helper"-like interaction events to the device (clicking, swiping, dragging, etc. on UI elements); use quicktime_video_hack to transfer screenshots from iOS devices.
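To give an idea of what the serial side of the CH9329 approach can look like, here is a small sketch that only builds the byte frames. It rests on my reading of the CH9329 serial protocol, so please check it against the datasheet: a frame is the header `0x57 0xAB`, an address byte, a command byte, a length byte, the payload, and a one-byte checksum equal to the sum of all preceding bytes modulo 256, with command `0x02` for a keyboard report and `0x05` for a relative mouse move. The function names are made up for illustration, and nothing here is taken from the setup described above.

```python
"""Sketch: build CH9329 serial frames (protocol details assumed from the datasheet)."""

FRAME_HEADER = bytes([0x57, 0xAB])
CMD_KEYBOARD = 0x02        # general keyboard report (assumed command code)
CMD_MOUSE_RELATIVE = 0x05  # relative mouse report (assumed command code)


def ch9329_frame(cmd, payload, addr=0x00):
    """Wrap a payload as header + addr + cmd + len + payload + checksum."""
    payload = bytes(payload)
    body = FRAME_HEADER + bytes([addr, cmd, len(payload)]) + payload
    checksum = sum(body) & 0xFF
    return body + bytes([checksum])


def keyboard_frame(modifier=0, keys=()):
    """Frame for an 8-byte keyboard report; no keys means 'release all'."""
    keys = list(keys)[:6]
    report = [modifier, 0] + keys + [0] * (6 - len(keys))
    return ch9329_frame(CMD_KEYBOARD, report)


def mouse_relative_frame(dx, dy, buttons=0, wheel=0):
    """Frame for a relative mouse move; dx/dy/wheel are signed, roughly -127..127."""
    payload = [0x01, buttons & 0xFF, dx & 0xFF, dy & 0xFF, wheel & 0xFF]
    return ch9329_frame(CMD_MOUSE_RELATIVE, payload)
```

The frames would then be written to the serial port the chip is attached to, for example with pyserial's `Serial.write`. The relative-only mouse frame is what makes the pointer position hard to track, as noted above.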

serbyxp commented 10 months ago

> > @nnnpa31 since you are familiar with it, do you think it is possible to OTG control an iPhone? #4341 #279 (comment)
>
> To clarify, the topic here is not related to the scrcpy project, so I'll just provide some clues as to what I know: I used a Lightning hub to connect to an ios device, and a CH9329 chip on the computer side to emulate a real keyboard and mouse device via serial. (This is not perfect, because it leads us to use only the "relative coordinates" of the mouse, and we can't locate where the mouse pointer is on the screen.) Another possible solution for ios is XCTests: use go-ios to do whatever you want with the device; use WebdriverAgent to emit "helper" like interaction events to the device (clicking, swiping, dragging, etc. on UI elements); use quicktime_video_hack to transfer screen shots from ios devices.

@Tryanks

Is it possible to run qvh (access its TCP endpoint) and do HID at the same time? I'm trying to figure this out too, and here is where I'm stuck: if qvh needs the USB port to be in host mode, then to act as an HID device you would have to be in client or OTG mode (I think). But from what I have found, the iOS device is just like a "server"/controller that opens TCP endpoints that the host can then connect to. If that is the case, then when connecting an HID device to iOS, does the connected device (the PC) not need to be in OTG or client mode? Can it stay in host mode and just ask the iOS device to open up the HID API? (This is where I get confused.) I know that the chip in the cable or adapter tells the iOS Lightning chip what type of device it is and does whatever security handshake with the coprocessor before the endpoint is opened. But if usbmuxd, or whatever is being used to connect to the iOS Lightning port over USB, can already tell it to open specific API endpoints and connect to them, then wouldn't sending the USB payload specific to USB HID open up that endpoint too? Then a USB host device (the PC) could connect to it and write/emulate a "client" or OTG payload to that endpoint, which would allow qvh (the API endpoint for the video stream) to be active as well. Assuming there is enough bandwidth on the bus (the USB cable), could usbmuxd send emulated HID and receive the qvh video stream?

From what I understand, if you connect it to a Lightning-to-USB hub adapter you can get HDMI and send HID, so is it the chip inside the adapter that's muxing that? I assume it's a USB hub chip of some sort. But if the hub chip can take HID in and USB (or whatever, Lightning to USB/HDMI) out and combine it all into one Lightning connection, it isn't as straightforward as a normal USB hub, because it still needs to negotiate with the iOS device to open the API endpoints, right? If that's the case, then the USB side is always a host and doesn't have to be OTG or client for HID..? 🤷‍♂️ It just needs to be able to tell the iOS device what it wants (send the proper payload, security handshake, etc.) and then send and route any payload it wants, as a host, to the TCP/API endpoint port the iOS device opened for it, which includes payloads for HID inputs as per the USB spec.

So my confusion, and where I'm stuck before attempting to write a script for this, is knowing whether all communication with an iOS device is controlled that way. If it is, then it's just a matter of routing the payloads it expects to the proper endpoints, which wouldn't need a hub (since we are emulating keystrokes or taps).

That's a lot, but any info on this would help me understand this usbmuxd thing better and even know where to start.

thanks