Closed: lawm closed this issue 2 years ago
@lawm Your test branch works ok for me:
Thanks for trying it.
I did some more tests on new emulator instances, a real device, and different URLs.
It happens with this combination: macOS with the LuLu firewall enabled, an Android emulator, and a URL that points at any device that is up on the local network but has no web server running.
If I use a real Android 6.0 device, the issue doesn't happen. If I disable the LuLu firewall, the issue doesn't happen. On new emulator instances, I may need to click JPG/MP4 a few times (not fast, just switching back and forth). Sometimes the placeholder is shown, but the JPG/MP4 click ripple animation freezes for ~2 seconds. The issue also happens with a URL that points at a Linux laptop's IP address.
The emulator and firewall combination causes something (the app or some emulator resource) to hang.
I tried to use the Android Studio profiler.
One type of trace shows android.os.MessageQueue.nativePollOnce taking 6 seconds. DefaultDispatcher and OkHttp threads also take ~6 seconds in the same timeframe.
OkHttp Dispatcher Flame Chart:
Another type of trace just shows recomposing and drawing:
I want to try inserting a Thread.sleep() somewhere in the above OkHttp call path to simulate this without the firewall. But, even with that, I would still need to figure out how the OkHttp thread blocks the UI.
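A different way to get a stalled request without the firewall, and without patching a Thread.sleep() into OkHttp, is to point the image URLs at a server that accepts TCP connections and then never answers. The following is a minimal sketch using only the Java standard library. `SilentServer` and its method names are hypothetical and not part of Coil or OkHttp. It stalls the request after the connection is established, so it only approximates the firewall case, where the stall may happen during the connect itself.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical test helper: accepts TCP connections and then never replies. */
final class SilentServer implements AutoCloseable {
    private final ServerSocket serverSocket;
    // Accepted sockets are kept here so they stay open until close() is called.
    private final List<Socket> held = new CopyOnWriteArrayList<>();
    private final Thread acceptThread;

    SilentServer(int port) throws IOException {
        serverSocket = new ServerSocket(port); // port 0 lets the OS pick a free port
        acceptThread = new Thread(this::acceptLoop, "silent-server-accept");
        acceptThread.setDaemon(true);
        acceptThread.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                // Accept the connection but never read from or write to it.
                held.add(serverSocket.accept());
            } catch (IOException e) {
                return; // the server socket was closed
            }
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();
        for (Socket socket : held) {
            socket.close();
        }
    }

    public static void main(String[] args) throws Exception {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        try (SilentServer server = new SilentServer(port)) {
            System.out.println("Silent server listening on port " + server.port());
            System.in.read(); // keep running until Enter or end of input
        }
    }
}
```

Run on the development machine, this should be reachable from the Android emulator at 10.0.2.2 (the emulator's alias for the host loopback interface). A client that connects will then wait until its own read timeout expires, which gives a repeatable multi-second stall to profile.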
I can't figure it out, and it may be a problem in the emulator rather than in Coil. We can close this issue for now.
Description
Coil is very useful. Thanks for creating and maintaining it.
I edited the sample app to use URLs like this: https://192.168.10.94:8000/nothing.jpg
192.168.10.94 is a valid host, but the web server is not running, meaning it is not accepting incoming connections to port 8000.
Here is what happens in the video:

| Timestamp | Description |
| --- | --- |
| 0 sec | I start the app. |
| 4 sec | Dark blue placeholder colors are shown. This is expected. |
| 10 sec | Red error color is shown, as configured. This is expected. |
| 12 sec | I click "JPG" to switch to the MP4 list. |
| 14 sec | I click "MP4" to switch back to the "JPG" list. The problem is here. The UI thread seems to be blocked and nothing new is drawn. The placeholder color is not shown, and the "JPG" list isn't drawn. |
| 20 sec | It shows the JPG list with red error Drawables. |
If I use an invalid host, like 192.168.1.123, which doesn't exist on my network, the issue does not happen. I think it's because the network failure is returned faster in this case compared to when the host is valid and not accepting connections or returning packets.
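One way to check this timing hypothesis outside the app is to time a raw TCP connect to each address. The following is a minimal sketch using only the Java standard library. `ConnectProbe`, `Result`, and `probe` are hypothetical names and not part of Coil or OkHttp. It reports whether the connect succeeded, timed out, or failed with another error, and how long that took. It measures from the machine it runs on, so results on the host may differ from what the emulator sees.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical helper: times a single raw TCP connect attempt. */
final class ConnectProbe {

    /** Outcome of one connect attempt and how long it took. */
    static final class Result {
        final String outcome; // "connected", "timeout", or "failed: ..."
        final long elapsedMs;

        Result(String outcome, long elapsedMs) {
            this.outcome = outcome;
            this.elapsedMs = elapsedMs;
        }

        @Override
        public String toString() {
            return outcome + " after " + elapsedMs + " ms";
        }
    }

    static Result probe(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        String outcome;
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            outcome = "connected";
        } catch (SocketTimeoutException e) {
            // No answer at all within timeoutMs (for example, packets silently dropped).
            outcome = "timeout";
        } catch (IOException e) {
            // Any other failure, such as "Connection refused" or "No route to host".
            outcome = "failed: " + e;
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new Result(outcome, elapsedMs);
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: ConnectProbe <host> <port> [timeoutMs]");
            return;
        }
        int timeoutMs = args.length > 2 ? Integer.parseInt(args[2]) : 10_000;
        System.out.println(probe(args[0], Integer.parseInt(args[1]), timeoutMs));
    }
}
```

For example, running it once against 192.168.10.94 port 8000 and once against 192.168.1.123 port 8000 would show which of the two failures is reported faster, and whether either one is a timeout rather than an immediate error.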
Video:
https://user-images.githubusercontent.com/3174101/192216374-5a53fca2-a506-40ce-abd5-a1b48a71ef99.mov
Steps To Reproduce
1. Check out my branch or cherry-pick my change: https://github.com/lawm/coil/commits/bug-repro-hang-on-network-request
2. Edit coil-sample-common/src/main/assets/jpgs.json. Change the URLs to your PC's IP address, and don't run any web server.
3. Start the app and wait for the red error color.
4. Click "JPG".
5. Click "MP4".
6. Watch the app hang for 4 seconds until the red error color is drawn.
Log
```
---------------------------- PROCESS STARTED (23814) for package coil.sample ----------------------------
2022-09-25 23:12:16.866 23814-23838 OpenGLRenderer coil.sample D HWUI GL Pipeline
2022-09-25 23:12:17.132 23814-23819 zygote coil.sample I Do partial code cache collection, code=29KB, data=20KB
2022-09-25 23:12:17.134 23814-23819 zygote coil.sample I After code cache collection, code=29KB, data=20KB
2022-09-25 23:12:17.134 23814-23819 zygote coil.sample I Increasing code cache capacity to 128KB
2022-09-25 23:12:17.284 23814-23819 zygote coil.sample I Do partial code cache collection, code=59KB, data=45KB
2022-09-25 23:12:17.287 23814-23819 zygote coil.sample I After code cache collection, code=59KB, data=45KB
2022-09-25 23:12:17.287 23814-23819 zygote coil.sample I Increasing code cache capacity to 256KB
2022-09-25 23:12:17.376 23814-23838
```
Version
Latest coil main branch as of today.