dryark / stf_ios_support

Central repo to connect and document components/repos needed for IOS stf support

Stability on long run #85

Open mbilbiesi opened 3 years ago

mbilbiesi commented 3 years ago

Hello :)

After successfully running this app with two devices simultaneously, I have a stability issue on long runs. The two iOS devices get disconnected and move to the preparing state, as you can see in the screenshot below:

[Screenshot: 2020-11-12_15h09_16]

I am attaching the coordinator logs from after starting the app below:

time="2020-11-12T12:18:51+03:00" level=info msg="Device Owner Start" owner=administrator@fakedomain.com proc=stf_device_ios type=wda_owner_start uuid="***802E"
time="2020-11-12T12:19:12+03:00" level=info msg="Device Owner Stop" owner=administrator@fakedomain.com proc=stf_device_ios type=wda_owner_stop uuid="***802E"
time="2020-11-12T12:31:40+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T12:31:40+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T12:33:20+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T12:33:20+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T12:38:11+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T12:38:11+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T13:16:42+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T13:16:42+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T14:13:02+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T14:13:02+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T15:25:54+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T15:25:54+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T15:52:18+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T15:52:18+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T16:18:09+03:00" level=warning msg="Process end - wdaproxy" proc=wdaproxy type=proc_end uuid="***802E"
time="2020-11-12T16:18:09+03:00" level=info msg="Process start - wdaproxy" binary=../wdaproxy iosVersion=13.3 proc=wdaproxy type=proc_start uuid="***802E" wdaPort=8100
time="2020-11-12T16:18:10+03:00" level=warning msg="Process end - wdaproxy" proc=wdaproxy type=proc_end uuid="***002E"
time="2020-11-12T16:18:10+03:00" level=info msg="Process start - wdaproxy" binary=../wdaproxy iosVersion=12.2 proc=wdaproxy type=proc_start uuid="***002E" wdaPort=8101
Status response: {"value":{"build":{"productBundleIdentifier":"com.facebook.WebDriverAgentRunner","time":"Nov 12 2020 02:24:04"},"device":{"name":"iPhone 11 Pro Max","udid":"00500910-0MM587512132802E"},"ios":{"ip":"172.31.104.153"},"message":"WebDriverAgent is ready to accept commands","os":{"name":"iOS","sdkVersion":"13.4","testmanagerdVersion":28,"version":"13.3"},"ready":"true","state":"success"},"status":0}
time="2020-11-12T16:18:19+03:00" level=info msg="Fetched WDA session" id=7122B3CA-6C5D-4EBC-A2EE-D8F272921661 type=wda_session uuid="***802E"
window size response: {
  "value" : {
    "width" : 414,
    "height" : 896
  },
  "sessionId" : "7122B3CA-6C5D-4EBC-A2EE-D8F272921661"
}
time="2020-11-12T16:18:19+03:00" level=info msg="Fetched device screen dimensions" height=896 type=device_dimensions uuid="***802E" width=414
Status response: {"value":{"build":{"productBundleIdentifier":"com.facebook.WebDriverAgentRunner","time":"Nov 12 2020 02:24:04"},"device":{"name":"iPhone XS Max","udid":"00070120-001CFM221349002E"},"ios":{"ip":"172.31.104.133"},"message":"WebDriverAgent is ready to accept commands","os":{"name":"iOS","sdkVersion":"13.4","testmanagerdVersion":26,"version":"12.2"},"ready":"true","state":"success"},"status":0}
time="2020-11-12T16:18:22+03:00" level=info msg="Fetched WDA session" id=A9C4A276-7BBB-4E23-AB3B-5644F9FE46C6 type=wda_session uuid="***002E"
window size response: {
time="2020-11-12T16:18:22+03:00" level=info msg="Fetched device screen dimensions" height=896 type=device_dimensions uuid="***002E" width=414
  "value" : {
    "width" : 414,
    "height" : 896
  },
  "sessionId" : "A9C4A276-7BBB-4E23-AB3B-5644F9FE46C6"
}
time="2020-11-12T16:28:27+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T16:28:27+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start
time="2020-11-12T17:06:21+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end
time="2020-11-12T17:06:21+03:00" level=info msg="Process start - stf_ios_provider" binary=/usr/local/opt/node@12/bin/node client_hostname=mac-mini.local client_ip=172.31.111.201 location=macmini/mac-mini.local proc=stf_ios_provider server_hostname=172.31.105.119 server_ip=172.31.105.119 type=proc_start

Based on the log, I see that stf_ios_provider is restarted alongside wdaproxy, and whenever this happens the process restarts. I believe there is an issue somewhere when the process is restarted.

I am trying to understand why the process keeps restarting. I will keep looking into this behavior and post any findings, but if you have any suggestions on where I should focus while debugging, they would be appreciated!
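
One way to narrow down which process restarts and how often is to tally the proc_end events in the coordinator log. A minimal sketch in Go, assuming the log lines keep the key=value layout shown above; parseFields and countProcEnds are hypothetical helper names, not part of stf_ios_support:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fieldRe matches key=value pairs where the value is either quoted or bare.
var fieldRe = regexp.MustCompile(`(\w+)=("[^"]*"|\S+)`)

// parseFields extracts the key=value pairs of one coordinator log line.
func parseFields(line string) map[string]string {
	fields := map[string]string{}
	for _, m := range fieldRe.FindAllStringSubmatch(line, -1) {
		fields[m[1]] = strings.Trim(m[2], `"`)
	}
	return fields
}

// countProcEnds tallies type=proc_end events per process name.
func countProcEnds(lines []string) map[string]int {
	counts := map[string]int{}
	for _, line := range lines {
		f := parseFields(line)
		if f["type"] == "proc_end" {
			counts[f["proc"]]++
		}
	}
	return counts
}

func main() {
	// Two lines copied from the log above; a real run would read the log file.
	lines := []string{
		`time="2020-11-12T12:31:40+03:00" level=warning msg="Process end - stf_ios_provider" proc=stf_ios_provider type=proc_end`,
		`time="2020-11-12T16:18:09+03:00" level=warning msg="Process end - wdaproxy" proc=wdaproxy type=proc_end uuid="***802E"`,
	}
	fmt.Println(countProcEnds(lines))
}
```

Comparing the timestamps of the stf_ios_provider proc_end lines with those of wdaproxy would then show whether the provider restarts are tied to the WDA restart or happen independently.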

nanoscopic commented 3 years ago

WDA is designed to automatically restart every X amount of time. It is configured to restart every 4 hours by default. See https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/config.go#L208

stf_ios_provider shouldn't also restart at that time, though. If it is also restarting, that is a bug triggered by the automatic WDA restart.

The code that handles the automatic restart is here: https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/periodic.go . As can be seen in the code, if you configure the restart value to be 0 it will never restart. This causes problems though, as WDA internally has memory leaks and it grows in size until it crushes your machine with its memory use. That is why there is an automatic restart in the first place.

Although stf_ios_provider should not restart by itself, I don't see that it should hurt anything as long as it only happens when the automatic restart of WDA occurs.

The problem though as you say is when it gets stuck in "preparing" mode. I've seen this happen because of the dependency on video streaming starting up correctly. The video is just too finicky and sometimes will not start up. There are various ways to "unstick" the video and get it working again but they don't happen automatically.

Ultimately the fix for this is to use the new app based "upload broadcast extension" video streaming I have created. I have not yet released it though onto the app store as I've been working through how to set it up as a service...

You might look into using the previous video mode based on quicktime_video_hack. Daniel Paulus says that he has fixed the issues with that, and it is more reliable than the default video method using AVFoundation. I haven't updated that method but it is still available as an option. If you set 'videoMethod' ( see here: https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/config.go#L162 ) to "ivp" instead of "avfoundation" the current code should still function the old way. The bug I was seeing though was that some phones simply don't work with the "ivp" method. If your phones do work with it, that method is more reliable ( for the most part ) than "avfoundation" is.

Finally, the whole "stuck in preparing" is a "feature" of how STF itself works. It is somewhat garbage because it doesn't explain what is going on or why it is stuck. This is one of the various things I will fix entirely in my new closed source rewrite.

mbilbiesi commented 3 years ago

Awesome! Before you replied, I tried increasing the WDA restart interval to 24 hours, and it has worked for me so far, although I didn't know how this related to the issue.

stf_ios_provider should though also restart at that time. If it is restarting also that is a bug triggered by the automatic WDA restart.

Do you mean that when WDA restarts, we also need to make sure stf_ios_provider restarts as well?

Ultimately the fix for this is to use the new app based "upload broadcast extension" video streaming I have created. I have not yet released it though onto the app store as I've been working through how to set it up as a service...

Waiting for your solution, which will reduce the pain of the current solution's issues!

I will also try the ivp method and keep posting my findings. Thanks for your hints!

nanoscopic commented 3 years ago

That was supposed to say "stf_ios_provider shouldn't restart at that time". I've updated the post above to correct this. stf_ios_provider should stay running and continue to work.

Will keep you posted on the app. I have it working currently, I just have two issues I need to resolve before I can release it:

  1. I want to profit from it, so my intent is to make the app free, but tied to a service I will create. The service ultimately will become a full solution that replaces STF. I want others to be able to just run it on-prem themselves also, but my plan is to charge for it. As a result it is somewhat in conflict with the open source nature of STF, so I am trying to figure out a path to let the components I build be used at some cost by users together with the open source STF. I still want to support the open source portion and make it usable without any closed source components or paid bits... so I am trying to figure out how to balance it all for fairness.

  2. The app needs to have a UI to go onto the Apple store. Without a UI I don't think it would be approved. It works without a UI though... Just haven't gotten around to doing this yet. I considered releasing the app binary only temporarily just by distributing the ipa, but I think that would make things confusing for the future so I'm avoiding it.

Besides the app I do still intend to support the other video methods in my spare time.

For anyone who wants a fully open source app based video streaming, I'll also just describe here how to do it. If some enterprising individual is willing to write it and open source it I'm happy to add it in as an option to stf_ios_support. It really wasn't that hard to make. I'm just not willing to release my implementation open source... because... reasons...

Here is what one has to do to make the app:

  1. Follow replaykit2 basic instructions from Apple for creating a demo "upload broadcast extension" app. Replaykit2 requires at least IOS 13, so this method will not work with IOS 12 and below. If you want those you'll need to make a replaykit version also which is more hassle. ( and pointless imo )

  2. In the place where frames are received and action can be taken on them, convert those frames to jpeg and then send them via nanomsg to ios_video_stream.

  3. Do frame skipping because full frame rate is too much and also not needed.

  4. Do the jpeg conversion optimally using hardware based conversion. It doesn't take a lot of code to do this, but how to do it is not immediately obvious and requires some research.

  5. Downscale some; and do that via hardware acceleration also, to save bandwidth.

  6. Detect duplicate frames to avoid sending all frames.

  7. ( bonus hint ) Multithread your solution and use queues. Be careful not to use too much memory as it will crash. Upload broadcast extension mechanism is very finicky and doesn't provide any easy ways to debug that I've found.

  8. ( bonus hint ) Use the "file sharing" feature of IOS to make the app configurable via a JSON config file.
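
Steps 3 and 6 do not depend on the iOS APIs and can be illustrated in isolation. The sketch below is written in Go only to match the rest of this repo (an actual broadcast extension would be written in Swift or Objective-C). frameFilter, keepEvery and the use of a SHA-256 digest to spot duplicates are all assumptions made for illustration, not a description of the app mentioned above:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// frameFilter decides which encoded frames are worth sending.
type frameFilter struct {
	keepEvery int               // forward at most one frame out of every keepEvery
	seen      int               // frames received so far
	lastSum   [sha256.Size]byte // digest of the last frame that was sent
	sentAny   bool              // whether lastSum holds a valid digest
}

// shouldSend reports whether frame should be forwarded.
func (f *frameFilter) shouldSend(frame []byte) bool {
	f.seen++
	if f.keepEvery > 1 && (f.seen-1)%f.keepEvery != 0 {
		return false // frame skipping
	}
	sum := sha256.Sum256(frame)
	if f.sentAny && sum == f.lastSum {
		return false // byte-identical to the last frame that was sent
	}
	f.lastSum, f.sentAny = sum, true
	return true
}

func main() {
	f := &frameFilter{keepEvery: 2}
	frames := [][]byte{[]byte("A"), []byte("A"), []byte("A"), []byte("B"), []byte("B")}
	for i, fr := range frames {
		fmt.Println(i, f.shouldSend(fr))
	}
}
```

A digest only catches byte-identical frames, so in practice the comparison would probably have to happen before JPEG encoding or use a tolerance; that choice is left open here.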

So... someone else please write the above as open source to assist the community. Or don't and buy my rendition of it when I release it. 🤷

mbilbiesi commented 3 years ago

After updating WDA to restart every 24 hours and watching the behavior, here are my findings:

What I observed: although stf_ios_provider is not expected to restart, restarting the provider by itself does not hang the app. The problem occurs when the provider starts after the new WDA process; that is what causes the issue.

I will keep trying to trace and debug this. This is just a heads-up.

mbilbiesi commented 3 years ago

While it was hung (stuck in preparing), I ran pkill -f node to kill the node processes, without killing stf_ios_support. I noticed that stf_ios_provider restarted, and it works: the devices are in the USE state again instead of PREPARING.

nanoscopic commented 3 years ago

When it is stuck in the "preparing" state that is due to the device unit being "stuck" more so than the stf_ios_provider.

Killing node processes will kill off both STF provider unit and STF IOS device unit. ( and they then in turn will be automatically started again by the coordinator )

You can also restart the device unit through the web interface on port 8027: https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/http_server.go

The web interface provides a button to restart it.

As a workaround / hack, it is possible to detect via API calls to STF when a device is stuck in the preparing state and then programmatically call the web restart REST call for the device unit. I considered doing this already but did not because I was hoping to discover what causes it to get stuck in the preparing state to begin with.

mbilbiesi commented 3 years ago

You are correct. I actually found a workaround for this: restart the iOS device unit after a new WDA session is generated, by adding the following block of code in the coordinator after line 746 https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/coordinator.go#L746

 // Restart the iOS device unit if its process entry exists
 if devd.process["stf_device_ios"] != nil {
     restart_device_unit(devd)
 }

And it works! I still don't know the root cause, though; it seems the iOS device unit somehow stays connected to the stale WDA session.

uwtechexpert commented 3 years ago

@mbilbiesi Are you able to connect 2 iOS devices and stream them at the same time?

mbilbiesi commented 3 years ago

@uwtechexpert Yes. With the workaround I mentioned in the comment above https://github.com/DeviceFarmer/stf_ios_support/issues/85#issuecomment-728629969 it now works perfectly for me, and I have had no issues for 1 month :)

illpe commented 3 years ago

@mbilbiesi Hi, can I ask you for a bit of help? Can you provide a mini guide on how you run iOS and Android devices on the same machine? If you have some time, please contact me at razdvaip@gmail.com

nanoscopic commented 3 years ago

@illpe Asking for help on learning how to use STF isn't really what this thread or this repo is for.

I'd appreciate if you didn't "at" commenters on issues here who you think may be able to solve your problems for you.

In regard to your question of "how do you run ios and android devices on stf", the answer is:

  1. Follow upstream STF documentation on how to run STF with Android devices
  2. Use this repo to connect your IOS devices to that STF instance.

The one main gotcha is that you must use https for your STF instance. I don't support using plain http with STF as the upstream guides tend to recommend.

You can use my repo "stf-android-provider" to start an android provider to point at the STF instance that can be created using the "server/" folder in this repo.

Please create a specific issue for your ask. I don't mind if you create issues that don't belong here. What I mainly mind is hijacking random issues and/or pestering other users of the repo.

boynuristyana commented 2 years ago

You are correct. I actually found a workaround for this: restart the iOS device unit after a new WDA session is generated, by adding the following block of code in the coordinator after line 746 https://github.com/DeviceFarmer/stf_ios_support/blob/master/coordinator/coordinator.go#L746

 if devd.process["stf_device_ios"] != nil {
     restart_device_unit(devd)
 }

And it works! I still don't know the root cause, though; it seems the iOS device unit somehow stays connected to the stale WDA session.

@mbilbiesi could you please update the correct line for adding your block of code above? I tried adding your block of code at the line you mentioned in coordinator.go, but stf_ios_provider still restarts right after the WDA restart, which puts the device into preparing status in the STF dashboard. I also tried adding your code at several lines after WDA gets the new session, and the result is the same.

Updated! I restart the iOS device unit by adding the block of code from @mbilbiesi above in Periodic, after the line that does the restart, here: https://github.com/DeviceFarmer/stf_ios_support/blob/12aabb5d3ac0b9b6fce56de4a4c7368bff695f27/coordinator/periodic.go#L35

Thank you.