Scirra / Construct-bugs

Public bug report submissions for Construct 3 and Construct Animate. Please read the guidelines then click the 'Issues' tab to get started.
https://www.construct.net

ANR rate at 4%, Slow cold start at 96% after Construct update #6166

Closed KekGames closed 1 year ago

KekGames commented 1 year ago

Problem description

Since I switched from Stable r302 to Beta r307 (and onwards), the ANR rate in my game went up from 0% to 4% and Slow cold start went from 8% to 96%. These are huge issues that affect the game's visibility on Google Play, where the bad behaviour threshold for ANR is just 0.47%. The most frequent ANR is "Native method - android.os.MessageQueue.nativePollOnce", which may be related to the Mobile Advert plugin. I assume this is related to changing the Android target API level to 31 and, maybe, a broken update for the Mobile Advert plugin if there's been one? Perhaps Slow cold start is related to the changes to the Splash screen?

Here's the topic on the forum. Here are screenshots highlighting the issue for me and another user (screenshots: Untitled-2, Untitled-5).

Reverting to r302 solves the issue.

Attach a .c3p

https://drive.google.com/file/d/1iouUXiFMqkWpRZa4i7vHA4nJUQOc30kV/view?usp=sharing

Steps to reproduce

  1. Add publisher id for Mobile Advert
  2. Build with r302 and test
  3. Build with r307 (or newer) and test

I'm not sure this will be helpful, as it may depend on a number of other factors and be reproducible only on some devices.

Observed result

ANR rate went up significantly after updating from r302 to r307

Expected result

ANR rate at the same level as before update. Maybe you could revert some changes until you fix this? There's still some time until November when target API level 31 is required for new updates. Besides, as igortyhon said on forum "Now I'm back to version r302 again and I'm building it myself through Cordova and setting the target version of sdk 31", which means that target version can still be 31 without causing the ANRs as long as the game was built with r302.

More details

Affected browsers/platforms: Android

First affected release: worked in r302 but broke in r307 (or maybe r303, but I didn't try that)

igortyhon commented 1 year ago

I can confirm this.
- r302 works fine. I build via Cordova and set the target SDK version to 31.
- r308: I do the same, but I get a lot of ANR errors and a slow cold start.

AshleyScirra commented 1 year ago

The only recent change we have made to the advert plugin was updating the underlying Cordova plugin admob-plus-cordova from v1.25.0 to v1.28.0 in r306. This was done because apparently without the update, it crashes on all Android 12 devices (#5996). StatCounter shows about 28% of Android devices are already on Android 12, and that's going up steadily. So it seems that downgrading to the older version would cause an even worse problem.

As for any issues in v1.28.0, unfortunately this is difficult as we don't develop that plugin. The changelog for the plugin does not appear to indicate any major changes other than updating Google's libraries. I don't see anything that looks related in their issues list, but reporting this to them would probably be more likely to get results as they are the ones to develop the Cordova plugin.

The other possibility is it's a bug in Google's libraries, or a bug in the AdMob service itself. Both are definitely possible. We've already had to work around bugs like AdMob seeming to randomly fail to initialise, which caused us to implement a timeout on initialisation to try to work around it. I think that is currently set to 5 seconds; if AdMob doesn't respond in that time the app carries on with ads disabled on the assumption AdMob failed to load due to an AdMob bug; perhaps that contributes to the slow start time. Either way, if that is the problem, you would need to contact Google or AdMob for support. @DiegoScirra - any further thoughts?
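The timeout workaround described above can be sketched roughly as follows. This is only an illustration of the general pattern in plain Java, not Construct's or AdMob's actual code: the class `AdInitTimeout`, the method `initAdsWithTimeout` and the `initialiser` parameter are hypothetical names, and the 5 second figure is the value mentioned in the comment.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of Construct or AdMob.
final class AdInitTimeout {
    /**
     * Starts the given initialiser on a background thread. The returned future
     * completes with true if initialisation reported success in time, and with
     * false if it threw an error or did not respond within timeoutMillis
     * (in which case the app would carry on with ads disabled).
     */
    static CompletableFuture<Boolean> initAdsWithTimeout(
            Supplier<Boolean> initialiser, long timeoutMillis) {
        return CompletableFuture
                .supplyAsync(initialiser)
                // If nothing has arrived by the deadline, resolve to "ads disabled".
                .completeOnTimeout(false, timeoutMillis, TimeUnit.MILLISECONDS)
                // Treat an initialisation error the same as a timeout.
                .exceptionally(error -> false);
    }
}
```

A caller would pass something like `5_000` for the timeout and continue start-up when the future completes. Note that a timeout of this kind only stops the app waiting forever; it does not make a slow initialisation any faster.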

KekGames commented 1 year ago

Strange, I didn't notice any crash reports or feedback from players that would indicate that it crashes on all Android 12 devices. You've mentioned 5 second initialization work-around, did you add this recently (between r302 and r307)? Because the slow start issue appeared when I switched to r307. Besides, why would it affect almost 100% of users? Here's the report on Google's issue tracker for this ANR. Maybe this comment is relevant (but it isn't about ads, as I understand?):

After examining our code carefully we noticed that this anr is caused mainly when by mistake the app tries to show a dialog but the call is not from the main thread. On most cases and on most devices, the dialog is just not showing up. BUT... on some devices and especially Huawei and Xiaomi, this results on blocking the onbackpressed Dispatcher. In our case those ANRs happened almost completely when the user pressed the back button. There is no error defining the bad call from out of the UI thread, or any other log to help to address the problem. So our conclusion is to check if all calls to the UI are made from within the UI thread. We have published an update with this fix and until now no ANRs are logged.

fredriksthlm commented 1 year ago

Google is aware of the ANRs; they have been discussed in hundreds of messages on the AdMob forum. They released this: https://developers.google.com/admob/android/optimize-initialization , which seems to work for me. So either build with Android Studio and add those lines, or wait until ratson updates his plugin to target v21+ and hopefully adds them by default. In due time Google will turn this on by default if/when it passes the beta phase.

KekGames commented 1 year ago

@fredriksthlm It didn't help in my case, unfortunately, ANRs persist. Unless I did it wrong. (screenshot: Untitled-6)

fredriksthlm commented 1 year ago

Which version of the GMA SDK do you use? You must use at least version 21.
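One way to check this, assuming the Google Mobile Ads SDK shows up in the exported Android Studio project as a Gradle dependency of the form `com.google.android.gms:play-services-ads:<version>`, is to look at the major version of that coordinate. The helper below is a hypothetical sketch of that check; the class name and the example version numbers in the usage note are made up for illustration.

```java
// Hypothetical helper for illustration; the coordinate format is an assumption
// about how the GMA SDK dependency appears in the Gradle files.
final class GmaSdkVersionCheck {
    static final int MINIMUM_MAJOR = 21;

    /** Extracts the major version from a "group:artifact:version" coordinate. */
    static int majorVersion(String gradleCoordinate) {
        String[] parts = gradleCoordinate.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected group:artifact:version, got: " + gradleCoordinate);
        }
        // Take the text before the first dot of the version, e.g. "21" from "21.0.0".
        String major = parts[2].split("\\.")[0];
        return Integer.parseInt(major);
    }

    /** True if the dependency is at least the minimum major version named above. */
    static boolean meetsMinimum(String gradleCoordinate) {
        return majorVersion(gradleCoordinate) >= MINIMUM_MAJOR;
    }
}
```

For example, `meetsMinimum("com.google.android.gms:play-services-ads:20.6.0")` would be false, while the same coordinate with `21.0.0` would be true.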

KekGames commented 1 year ago

@fredriksthlm Hmm, I just export Android Studio project in C3 with Mobile Advert plugin, so whatever version that is. Are there extra steps required?

AshleyScirra commented 1 year ago

> You've mentioned 5 second initialization work-around, did you add this recently (between r302 and r307)?

No, this has been in place for several months, and is definitely in r302. It was originally added after #5211 where it appears that due to an AdMob bug, it sometimes seems to never finish starting up.

> Maybe this comment is relevant (but it isn't about ads, as I understand?):

I don't know, but even if that comment is relevant, it would apply to either the Cordova plugin or Google's libraries, not Construct.

My best guess from looking at all this information is AdMob itself is sometimes very slow to start or hangs completely. I think the only resolution is for Google to publish an update for AdMob that fixes these problems, then admob-plus-cordova is updated, then we update Construct to use that. I don't think there is anything else we can do from Construct's side.

I'll leave this open in case @DiegoScirra has any more comments but this issue will probably be closed as there is no action we can take, other than updating when possible.

KekGames commented 1 year ago

Hi @AshleyScirra , I tried applying the fix suggested by fredriksthlm, but it didn't help. I also tried going back to r303, and it actually started to crash on start-up with the message from https://github.com/Scirra/Construct-3-bugs/issues/5996 . I fixed the crash by adding implementation 'androidx.work:work-runtime-ktx:2.7.0-alpha05' to build.gradle in Android Studio. But using r303 still didn't solve my issues. Since you're saying that you updated the ad plugin in r306, maybe it isn't what causes these problems? Maybe it was something in r303? After all, igortyhon went back to r302, and it solved the issue with ANRs and Slow cold starts for him.

Which brings me to my next question: is it possible to convert my project from r303 to r302? I was able to convert from r312 to r303, but not to r302. When I try with r302 I get the usual error:

Check it is a valid Construct 3 folder project

Error report

Error report information
Type: unhandled rejection
Reason: Error: invalid export format @ Error: invalid export format
    at d.Qsa (https://editor.construct.net/r302/projectResources.js:704:517)
    at d.Ga (https://editor.construct.net/r302/projectResources.js:720:6)
    at d.Qia (https://editor.construct.net/r302/projectResources.js:721:6)
    at d.Iwc (https://editor.construct.net/r302/projectResources.js:696:159)
    at d.Pia (https://editor.construct.net/r302/projectResources.js:679:493)
    at d.Nwc (https://editor.construct.net/r302/projectResources.js:649:80)
    at async Promise.all (index 0)
    at async d.BRb (https://editor.construct.net/r302/projectResources.js:653:97)
    at async Promise.all (index 7)
Stack: Error: invalid export format
    at d.Qsa (https://editor.construct.net/r302/projectResources.js:704:517)
    at d.Ga (https://editor.construct.net/r302/projectResources.js:720:6)
    at d.Qia (https://editor.construct.net/r302/projectResources.js:721:6)
    at d.Iwc (https://editor.construct.net/r302/projectResources.js:696:159)
    at d.Pia (https://editor.construct.net/r302/projectResources.js:679:493)
    at d.Nwc (https://editor.construct.net/r302/projectResources.js:649:80)
    at async Promise.all (index 0)
    at async d.BRb (https://editor.construct.net/r302/projectResources.js:653:97)
    at async Promise.all (index 7)
Construct version: r302
URL: https://editor.construct.net/r302/
Date: Sun Oct 09 2022 21:05:06 GMT+0300 (Eastern European Summer Time)
Uptime: 27.8 s

Platform information
Product: Construct 3 r302 (stable)
Browser: Chrome 105.0.5195.127
Browser engine: Chromium
Context: browser
Operating system: Windows 10
Device type: desktop
Device pixel ratio: 1
Logical CPU cores: 8
Approx. device memory: 8 GB
User agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36
Language setting: en-US

WebGL information
Version string: WebGL 2.0 (OpenGL ES 3.0 Chromium)
Numeric version: 2
Supports NPOT textures: yes
Supports GPU profiling: yes
Supports highp precision: yes
Vendor: Google Inc. (NVIDIA)
Renderer: ANGLE (NVIDIA, NVIDIA GeForce GTX 1070 Ti Direct3D11 vs_5_0 ps_5_0, D3D11)
Major performance caveat: no
Maximum texture size: 16384
Point size range: 1 to 1024
Extensions: EXT_color_buffer_float, EXT_color_buffer_half_float, EXT_disjoint_timer_query_webgl2, EXT_float_blend, EXT_texture_compression_bptc, EXT_texture_compression_rgtc, EXT_texture_filter_anisotropic, EXT_texture_norm16, KHR_parallel_shader_compile, OES_draw_buffers_indexed, OES_texture_float_linear, WEBGL_compressed_texture_s3tc, WEBGL_compressed_texture_s3tc_srgb, WEBGL_debug_renderer_info, WEBGL_debug_shaders, WEBGL_lose_context, WEBGL_multi_draw, OVR_multiview2

AshleyScirra commented 1 year ago

It sounds like this issue only affects people using Mobile Advert. Has anyone had this issue who is not using Mobile Advert? If not I think it's likely it's specific to Mobile Advert.

The next Mobile Advert change prior to r306 was in r291, and went out in a stable release in May. So the implementation of Mobile Advert has been identical from r291-r305, and the only change in r306 was updating the version of the underlying libraries we use. So at the moment I'm sceptical it has any other cause.

KekGames commented 1 year ago

@AshleyScirra There probably aren't many people who publish on GP and don't use Mobile Advert. Most likely the plugin is what ultimately causes the issue. But I don't get why it would affect r303 and every later release but not r302. Maybe it's not because of the plugin update, but rather a combination of changes in r303 with the ads (regardless of plugin version)? Anyway, if you could help me convert my game from r303 to r302, I'd really appreciate that, and I would check whether this solves the ANRs and slow cold starts. If you can do this, please give me an email address to send my project to, or let me know if you can't.

igortyhon commented 1 year ago

@KekGames Hello. Try not changing the Construct version: stay on r303, but when building, set the billing SDK to 3.0.3 instead of 4.0.0. This version can still be used until November. Send me your Telegram contact and I will explain in more detail. I see that you are releasing a lot of updates for your stickmen game and are probably not happy with the result.

KekGames commented 1 year ago

@igortyhon thanks for the suggestion, I'll write you an email using your contact from Google Play :) Update: r303 uses billing library 3.0.3 and ANRs still occur, so that's not the cause. @AshleyScirra btw, igortyhon is using ad mediation, not the Mobile Advert plugin. So when he switched to r308 and encountered this issue, the ad plugin version didn't change for him. And when he went back to r302, the plugin didn't change either, but ANRs and Slow cold starts went down.

KekGames commented 1 year ago

@AshleyScirra I found the cause of the ANRs and Slow cold starts. It's the new splash screen. I commented out this line in CordovaActivity.java:

    cordovaInterface.pluginManager.postMessage("setupSplashScreen", splashScreen);

And this fixed the issue. The new splash screen isn't a strict requirement from Google, just a recommendation. Can you go back to the old splash screen in the next update? (screenshot: Untitled-2)

AshleyScirra commented 1 year ago

That code you referred to is part of Cordova, not Construct. It's something Cordova would change. I believe Cordova use an official Android support library for splash screens, so it may in fact be something Google themselves need to fix.

I believe the new splash screen is mandatory with the latest version of Cordova, and downgrading would mean dropping support for Android 12 entirely, which is not feasible given that it is now a requirement for publishing to the Google Play store.

I'd advise you to report the issue directly to Cordova here: https://github.com/apache/cordova-android/issues They are in the best position to decide what to do about this. As with our issues please be sure to include as much detail as possible to explain the problem to the Cordova developers. If they release an update which fixes the problem, we can update Construct accordingly.

I think this pretty much proves the issue is not related to Construct itself, and so fixing it is outside of our control. As we can't progress this issue directly ourselves, closing - let us know any issues you file with Cordova and we will follow those to monitor any changes from Cordova's side and update Construct if need be.

KekGames commented 1 year ago

@AshleyScirra I think igortyhon mentioned that he made a Cordova build in r302 and manually changed the target API. Isn't this something Construct 3's build service could do: use an older Cordova version but target a newer API? Or you could just go back to an older Cordova version for one beta release, since Google allows sub API 31 updates until November. This would be a temporary workaround for me and other affected users, since it's impossible to convert a project from r303 to r302. Anyway, I reported this to Cordova: https://github.com/apache/cordova-android/issues/1510

AshleyScirra commented 1 year ago

Older versions of Cordova don't support the latest API level. Maybe it works for some specific cases but there were lots of build errors and incompatibilities we had to fix for Android 12 this time around.

The Google Play store requirements already require Android 12 support for new app submissions - the November date is for app updates only. Going back a version would cause more damage than it fixes as anyone with a new app to publish will be blocked from publishing, which is tantamount to not supporting Android at all for those users. I think the only realistic way forward is to get Cordova (or Google) to fix this and update Construct when they publish a fixed version.

fredriksthlm commented 1 year ago

I can add a few details. Cordova now uses the official Google Android library for the splash screen: Core Splashscreen. In Android Studio you can verify whether you use the latest version; it should be 1.0.0 (stable). Initially Cordova used the first alpha version (maybe beta). I can say, though, that there have not been many changes, so I don't think this should make any difference. But it should be verified.

Also, there is actually an issue tracker entry about cold starts with the splash screen, where a fix for how to implement it correctly is discussed: Issue tracker 230236390. How this is handled should also be for Cordova to investigate.

But the most interesting (and odd) thing is that I only have issues with long cold starts on one release, which was released in July. I made two other releases two weeks ago (other apps) where there is no problem with cold starts at all. For these I used the latest C3, latest Cordova, latest AdMob, latest Splashscreen, built with the latest Android Studio, and I see no issues at all...

For the two latest releases I do not have Google Billing library, which I do for the release made in July.

KekGames commented 1 year ago

@fredriksthlm Surely the slow cold start also depends on the size of the game. Which could explain why it went up to around 100% for my game, "only" around 80% for igortyhon and is not an issue for some of your games. My guess is that the new splash screen just adds to the loading time, but not enough to cause the slow start for every game.

AshleyScirra commented 1 year ago

Has anyone seen any complaints from real users about long load times or hangs? It occurred to me that it's possible that the app measurements for cold starts or ANRs could be incorrectly counting the time spent displaying the new splash screen, in which case these cold start/ANR measurements could be false positives.

fredriksthlm commented 1 year ago

I have not seen any complaints, and the reviews coming in are at the same level as before. So in my case your assumption seems very plausible. (That Google's own library, if correctly integrated, would give false positives in Google's own app analytics framework is something they really need to look into, though, if so.)

Comment from one user on the issue tracker:

> We were able to confirm, through Firebase Performance, that the startup time was reducing (image: Faster startup time verified in Firebase). However, we noticed that the cold and warm start metrics from Vitals were increasing a lot (image: Slower startup time for Cold starts verified in Vitals). We ended up removing the integration of the lib and the metrics from Vitals returned to normal.

igortyhon commented 1 year ago

I also did not receive complaints from users about the long loading time.

KekGames commented 1 year ago

@AshleyScirra no such feedback for my game. Although when, for example, rewarded ads didn't work for 100% of users - few reported it anyway. So I'm not sure if the 4% of users who encountered an ANR once would care to report it. But what you're saying makes sense in this context: the ANRs went from 0% to 4%, and those 4% are not all the same report. The "Native method - android.os.MessageQueue.nativePollOnce" makes up only 60% of them. Maybe Google "catches" an ANR after X seconds of game loading in the background while splash screen appears, and reports whatever the game is trying to do at that moment as an "ANR". Which is also why, it appears, the bigger the game - the higher the ANR rate and slow cold start rate.

igortyhon commented 1 year ago

@fredriksthlm I use Firebase Analytics, but unfortunately I don't have Crashlytics built in, so I can't check whether the Vitals data is correct.

fredriksthlm commented 1 year ago

If the splash is active for longer than 5 seconds it is reported as a "slow start"; if the splash screen is visible for less than 5 seconds there is no report of a "slow start". So it depends on how heavy the game is, like Kek thinks. When the first draw call of the Cordova activity is done, the start is verified in Vitals. This is what I believe at the moment.. :)
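If that belief is right, the "Slow cold start" percentage is just the share of launches whose time to first draw reaches the 5 second mark. A minimal sketch of that calculation is below; the class and method names are hypothetical, the sample durations in the usage note are invented, and treating a start of exactly 5 seconds as slow is an assumption.

```java
import java.util.List;

// Hypothetical model of the metric as described in the comment above;
// not Google's actual implementation.
final class ColdStartStats {
    static final long SLOW_COLD_START_MILLIS = 5_000;

    /** A start counts as slow once the time to first draw reaches the threshold. */
    static boolean isSlow(long millisToFirstDraw) {
        // Assumption: exactly 5 s is counted as slow.
        return millisToFirstDraw >= SLOW_COLD_START_MILLIS;
    }

    /** Percentage of recorded cold starts that count as slow. */
    static double slowStartPercent(List<Long> startDurationsMillis) {
        if (startDurationsMillis.isEmpty()) {
            return 0.0;
        }
        long slow = startDurationsMillis.stream()
                .filter(ColdStartStats::isSlow)
                .count();
        return 100.0 * slow / startDurationsMillis.size();
    }
}
```

With made-up durations of 2 s, 8 s, 9 s and 4 s, `slowStartPercent` gives 50%. A heavier game that keeps the splash up for 8 seconds or more on most devices would approach 100%, which matches the pattern reported in this thread.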

KekGames commented 1 year ago

Hey @AshleyScirra will you guys add the potential fix suggested here https://github.com/apache/cordova-android/issues/1510 ?

AshleyScirra commented 1 year ago

What potential fix? I don't see anything relevant to Construct there.

FWIW, so long as your loader style is not "none", Construct hides the splash screen as soon as it can show a progress bar for loading. So it's not even loading the entire project, only enough of a few core files to be able to show the loading screen. I have no idea how that could take 5+ seconds. Unless you have a huge project using loader style "none"? In which case changing the loader style should work around the problem.

The root cause still appears to be Google's own splash screen library incorrectly counting the splash screen as the app not responding, so I think the real fix for this would be for Google to fix the splash screen library.

KekGames commented 1 year ago

The potential fix/workaround you mentioned yourself:

> What should we do about it? Hard code the splash screen to hide after 1 second and just show some other kind of intermediate loading screen?

Setting the AutoHideSplashScreen to True reduces the ANR rate by 90%. It's not perfect, as it still gets above the allowed 0.47% on some days, but it's much better than the original 4%. Right now, if I set the splash to only show for 1s, there are still a few seconds of black screen before Construct's progress bar appears (and it appears VERY briefly). Does this black screen still count as a "draw" to prevent the ANR? Could showing an intermediate loading screen, as you suggested, right after the 1s splash screen change anything? You said:

> I guess we could work around it, but it seems weird - are we supposed to reimplement the splash screen in the app code so we can keep showing the same thing without triggering an ANR?

and

> It seems weird that we have to essentially reimplement the splash screen just to fix some metrics...

The metrics are vital and Google Play won't promote games that perform worse than the bad behaviour threshold which is 0.47% for ANRs. Of course, ideally, Google would need to fix this on their side. But realistically it takes forever for them to fix even more common issues and I didn't find this issue mentioned outside Construct community so far. Also, in this case they are likely to say that this is not a bug and that the splash screen should not be used for loading and shouldn't appear for any longer than 1s. P.S. my game's download size on Google Play is 19MB
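The figures quoted in this comment can be checked with a little arithmetic: a 90% reduction of a 4% ANR rate leaves about 0.4%, which is under the 0.47% threshold by only about 0.07 percentage points. That narrow margin is consistent with the rate still exceeding the threshold on some days. The sketch below uses hypothetical names and only the numbers from the comment.

```java
// Worked arithmetic for the numbers quoted in the comment; names are hypothetical.
final class AnrRateMath {
    static final double BAD_BEHAVIOUR_THRESHOLD_PERCENT = 0.47;

    /** Rate remaining after reducing ratePercent by the given fraction (0.9 = 90%). */
    static double afterReduction(double ratePercent, double reductionFraction) {
        return ratePercent * (1.0 - reductionFraction);
    }

    /** True if the rate is above the bad behaviour threshold. */
    static boolean exceedsThreshold(double ratePercent) {
        return ratePercent > BAD_BEHAVIOUR_THRESHOLD_PERCENT;
    }
}
```

Here `afterReduction(4.0, 0.9)` is approximately 0.4, so `exceedsThreshold` is false for the average, while the original 4.0 is well above the threshold.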

AshleyScirra commented 1 year ago

I think from the user's point of view, showing a splash is better than several seconds of a blank screen. They could well close the app assuming it's broken if there's just a blank screen for too long. So I'd rather not make that a built-in part of Construct. I also don't want to have to reimplement a splash screen, which will be difficult to get exactly right, just to work around a bug in Google's code. Google should fix the root cause issue. I also have no idea how it could take several seconds to even display the built-in progress bar, that seems weird.

In short I don't think there's any action worth taking on Construct's side. But you can always configure your app manually in Android Studio if you want to do something like force the splash screen to close after 1 second (and possibly let the user see a blank screen for a while).

KekGames commented 1 year ago

Showing a black screen instead of the splash is definitely not ideal, and I don't suggest this as a general solution for Construct. There is an 8s+ delay between opening the game and showing the progress bar, and then the empty progress bar flashes for 0.1s before the game starts. It seems like most of the game is loaded even before the progress bar appears, which defeats its purpose. Can it be related to the data.json file size, which is 19MB in my case?
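To put rough numbers on the trade-off being discussed: if the progress bar only becomes available about 8 seconds after launch and the splash screen is forced to close after 1 second, the user would see roughly 7 seconds of blank screen. The sketch below is a simple model of that timeline with hypothetical names; it assumes the 8 second and 1 second figures apply to the same launch, which the thread does not confirm.

```java
// Simple timeline model for illustration; names and the pairing of the
// 8 s and 1 s figures are assumptions.
final class SplashTimeline {
    /** How long the splash is visible when it is capped at maxSplashMillis. */
    static long splashVisibleMillis(long millisUntilLoaderReady, long maxSplashMillis) {
        return Math.min(millisUntilLoaderReady, maxSplashMillis);
    }

    /** Time between the splash closing and the loader appearing (blank screen). */
    static long blankScreenMillis(long millisUntilLoaderReady, long maxSplashMillis) {
        return Math.max(0, millisUntilLoaderReady - maxSplashMillis);
    }
}
```

For example, `blankScreenMillis(8_000, 1_000)` is 7000, whereas a launch where the loader is ready within the cap has no blank period at all.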

AshleyScirra commented 1 year ago

I've no idea how it could take 8 seconds to even load the progress bar. 19mb is a lot for data.json, but any modern device should be able to read 19mb of data in <1 second. Maybe the device is extremely slow?!

KekGames commented 1 year ago

@AshleyScirra it's 2021 Galaxy A52. An empty loading bar only flashes for 1 or a few frames before transitioning to game, so I don't think it does any loading. I could remove third-party plugins and send you my project via email for testing if you are interested. P.S. It's also available on Google Play, so if you'd like, you could quickly check if it loads the same way on other devices