status-im / status-mobile

a free (libre) open source, mobile OS for Ethereum
https://status.app
Mozilla Public License 2.0
3.9k stars 986 forks source link

[Android] The app responses with delay after returning from the background. #17282

Closed VolodLytvynenko closed 1 year ago

VolodLytvynenko commented 1 year ago

Steps:

  1. Send the app to the background.
  2. Wait for at least 2-3 minutes.
  3. Return from the background.
  4. Attempt to interact with any UI element.

Actual result:

The UI components are not responding or respond with the dealy

https://github.com/status-im/status-mobile/assets/52490791/9971dbec-b957-4d9d-869b-1b01640aa682

Expected result:

The components respond immediately

Device:

Additional info:

This issue is not reproducible for all devices. @churik and @qoqobolo also tried it with Pixel 7a, Android 13 and do not experience this problem, while @pavloburykh encounter wit it.

ENV:

Release build: [13 Sep 2023, 11:20]

Logs:

Status.log geth.log android_logcat3.txt

churik commented 1 year ago

Yes, I'm able to reproduce it as well.

App is hardly responsible after backgroud for 5-30 sec (depending on device)

https://github.com/status-im/status-mobile/assets/4557972/c80356ac-3002-4678-9f7e-2b795d25998d

churik commented 1 year ago

The steps are: create new account and send the app to the backgroud for several minutes.

Reproduced in 1.25.0 RC2 with https://github.com/status-im/status-mobile/commit/f6ac99647e567894c52bf00260201ac7332df126 commit

Note: also reproducible with RC1

qoqobolo commented 1 year ago

This issue is not reproducible for all devices. @churik and @qoqobolo also tried it with Pixel 7a, Android 13 and do not experience this problem, while @pavloburykh encounter wit it.

It turned out that it can be only reproducible for me on newly created profiles (Pixel, Huawei). UI becomes slow immediately after creation, but not so critically; but after putting the app to background, I can see pretty noticeable performance degradation.

flexsurfer commented 1 year ago

Andrea: it's a problem most likely we re-rendering of contract communities, I see a lot of:

09-13 18:31:16.949 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:16.973 15234 15320 D StatusModule: Signal event
09-13 18:31:20.681 15234 15329 D StatusModule: Signal event
09-13 18:31:21.443 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:24.148 15234 15329 D StatusModule: Signal event
09-13 18:31:25.864 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:27.544 15234 15326 D StatusModule: Signal event
09-13 18:31:30.337 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:31.016 15234 15329 D StatusModule: Signal event
09-13 18:31:34.709 15234 15330 D StatusModule: Signal event
09-13 18:31:34.819 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:38.028 15234 15329 D StatusModule: Signal event
09-13 18:31:39.300 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:41.558 15234 15329 D StatusModule: Signal event
09-13 18:31:43.787 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
09-13 18:31:45.143 15234 15330 D StatusModule: Signal event
09-13 18:31:46.881 15234 15328 D StatusModule: Signal event
09-13 18:31:48.314 15234 15337 D ReactNativeJS: DEBUG [utils.re-frame:21] - Handling re-frame event:  :fetched-contract-communities
0

It's calling that too many times

flexsurfer commented 1 year ago

"curated.communities.update" (rf/dispatch [:fetched-contract-communities it seems like we have lot of curated.communities.update signals in the background and they are accumulated?

flexsurfer commented 1 year ago

hm but if its replicated for new account, seems like communities not the reason

flexsurfer commented 1 year ago

I'm unable to replicate on latest develop, could someone from QA confirm that it's fixed in latest nighlty?

pavloburykh commented 1 year ago

I'm unable to replicate on latest develop, could someone from QA confirm that it's fixed in latest nighlty?

@flexsurfer I confirm that issue is still reproducible in latest nightly

Android 12, Samsung Galaxy A52

Status-debug-logs (56).zip

https://github.com/status-im/status-mobile/assets/97245802/e1cf137e-241c-4130-8005-5a10fd125563

flexsurfer commented 1 year ago

lots of

t=2023-09-14T09:43:23+0000 lvl=info msg="syncing topic" chatIDs="[04da1a3ff90cd363e62701c4d712b8f9aaad01ade0472f923254b66452d2d5a0a062c871698334620170ab57a89aff9f20f9a1218490e77e5ad2db64eff2b37194 04da1a3ff90cd363e62701c4d712b8f9aaad01ade0472f923254b66452d2d5a0a062c871698334620170ab57a89aff9f20f9a1218490e77e5ad2db64eff2b37194 04da1a3ff90cd363e62701c4d712b8f9aaad01ade0472f923254b66452d2d5a0a062c871698334620170ab57a89aff9f20f9a1218490e77e5ad2db64eff2b37194 0x025d27e58f1bece7d9675e6fe907da6e3b5007f06a327dd20db04d68851b9c153d 0x03769999b26cfd88788c882b56232fa5f58735795bdd463820d127b9a23d4415d4]" fromString=2023-09-13T09:43:23Z toString=2023-09-14T09:43:23Z topic="[0xe41d5d0c 0x0fa3265a 0xd16d5285 0x6ba82a67 0xde6e06b5]" from=1,694,598,203 to=1,694,684,603
t=2023-09-14T09:43:23+0000 lvl=info msg="Got random peer from peerstore" peer=16Uiu2HAmSve7tR5YZugpskMv2dmJAsMUKmfWYEKRXNUxRaTCnsXV
t=2023-09-14T09:43:23+0000 lvl=info msg="querying message history" peer=16Uiu2HAmDQugwDHM3YeUp86iGjrUvbdw3JPRgikC7YoGBsT2ymMg
t=2023-09-14T09:43:24+0000 lvl=info msg="received waku2 store message" envelopeHash=0x33e79af1eeeabd09f38ce812753603560366512af91d6dccd00262730dfbc228 pubsubTopic=/waku/2/default-waku/proto

dunnot if its related @cammellos

cammellos commented 1 year ago

@flexsurfer that's probably the curated communities code, when we fetch them, we sync them as well

cammellos commented 1 year ago

@flexsurfer not sure curated communities is the issue, they are fetched every 3s, and that's roughly what we are seeing in the logs, I don't think that should cause such performance issue, what do you think?

cammellos commented 1 year ago

mmh, actually, sorry, looks like there's too many, odd

pavloburykh commented 1 year ago

@cammellos @flexsurfer I have checked previous nightly builds.

It turned out that bug is reproducible on builds starting from Aug 8, 2023.

I see that PR related to changes in communities fetching https://github.com/status-im/status-mobile/pull/16829 has been merged on Aug 7, 2023.

So I suspect that issue indeed can be related to communities fetching.

But I wonder why this issue started to reproduce only recently (couple of days ago). It's hard to believe that we didn't notice the issue before.

flexsurfer commented 1 year ago

another question is why it reproduces only on some devices

flexsurfer commented 1 year ago

| pavloburykh commented now | @flexsurfer issue https://github.com/status-im/status-mobile/issues/17282 is fixed here.

| Discovery communities are not fetched, but it is expected as curated.communities.update is removed. At least now we know what causes the issue.

so its "curated.communities.update" but why it started spamming? and what we can do, can we debounce it on status-go ? @cammellos

cammellos commented 1 year ago

@flexsurfer it's odd, seems like signals are queuing up on the frontend, status-go publishes is every 3s, but then in the frontend they are all queued up, most likely there's some performance issue (that's why probably we see it only on some devices)

flexsurfer commented 1 year ago
image
flexsurfer commented 1 year ago

image

flexsurfer commented 1 year ago

here https://github.com/status-im/status-mobile/pull/16232/files#diff-245c6ff6b79ad9c6d6bb506b8d8d467a6c0412f6fe950d91ac7cd03b9bb95073R18

flexsurfer commented 1 year ago
(defn rename-contract-community-key
  [k]
  (let [s                  (name k)
        lower-cased        (csk/->kebab-case-string s)
        starts-with-digit? (re-matches #"^\d.*" s)
        predicate?         (some #(string/starts-with? lower-cased %)
                                 ["can-" "is-"])]
    (cond starts-with-digit? s
          predicate?         (keyword (str lower-cased "?"))
          :else              (keyword lower-cased))))

(defn rename-contract-community-keys
  [m]
  (reduce (fn [acc [k v]]
            (let [new-key (if (keyword? k) (rename-contract-community-key k) k)]
              (cond
                (map? v) (assoc acc new-key (rename-contract-community-keys v))
                :else    (assoc acc new-key v))))
          {}
          m))

why we're doing this in cljs, can this be moved to go ? @vkjr @cammellos

flexsurfer commented 1 year ago

also we shouldn't send same data every 3s from status-go, but only updated

flexsurfer commented 1 year ago

to summarize, currently, we send the same data every 3s, and the event handler takes 1.5 sec in status-react, in the background it seems like there is less resources, so it starts accumulating

churik commented 1 year ago

@cammellos @flexsurfer

May we merge temp remove Pr https://github.com/status-im/status-mobile/pull/17286 in order to unblock develop for now?

vkjr commented 1 year ago

@flexsurfer, wow, thanks for finding it! In general, I don't think that status-go should prepare keys for clojure part. And agree that status-go can send only updates. Wondering if the memoization of rename-contract-community-key can be a quick temporary fix. Wdyt, @flexsurfer?

cammellos commented 1 year ago

@flexsurfer, wow, thanks for finding it! In general, I don't think that status-go should prepare keys for clojure part. And agree that status-go can send only updates. Wondering if the memoization of rename-contract-community-key can be a quick temporary fix. Wdyt, @flexsurfer?

@vkjr in this case the fix is simple I think, if we want to rename keys, we need to do it the stupid way :) it's not bullet proof against changes, but on mobile we can't afford regexs/something too dynamic. We had similar issues with messages where for example we try to use js as much as possible because of these issues (clojurescript is very slow compared to pure js)

ilmotta commented 1 year ago

@vkjr, @cammellos, @flexsurfer you guys are correct about those functions. They are terribly heavy and slow. I was too curious to understand the performance problems with those functions and ran a quick comparison (it's not the right way to compare benchmarks, but the differences are so significant that I think it's worth sharing).

The original implementation runs ~50x slower in my tests. Although I can't comment on the reasons, I think it would be interesting if someone wants to really understand what makes them so slow. Is it because rename-contract-community-keys makes recursive calls to itself? Is it the regular expression? Is it the call to csk/->kebab-case-string? It's worth pinpointing exactly the cause after the performance fix is merged.

;; The thing we want to transform
(def payload
  {:contractCommunities
   ["0x032aa2439b14eb2caaf724223951da89de1530fbfa5d5a639b93b82cd5341713fd"
    "0x025d27e58f1bece7d9675e6fe907da6e3b5007f06a327dd20db04d68851b9c153d"
    "0x03769999b26cfd88788c882b56232fa5f58735795bdd463820d127b9a23d4415d4"]
   :contractFeaturedCommunities
   ["0x03769999b26cfd88788c882b56232fa5f58735795bdd463820d127b9a23d4415d4"]
   :communities
   {:0x025d27e58f1bece7d9675e6fe907da6e3b5007f06a327dd20db04d68851b9c153d {:name "foo"
                                                                           :tokenPermissions nil
                                                                           :isMember false
                                                                           :canDeleteMessageForEveryone
                                                                           false}
    :0x032aa2439b14eb2caaf724223951da89de1530fbfa5d5a639b93b82cd5341713fd
    {:name "bar"
     :adminSettings {:pinMessageAllMembersEnabled false}
     :chats
     {:8aa21892-a768-488a-8655-7ed31f527642
      {:name       "web"
       :categoryID "d87621f7-315a-4579-a800-fde42d86ddf2"
       :canPost    false}
      :6efc1767-22e5-46c1-9865-87206e04c9f9
      {:name       "mobile-ui"
       :categoryID "7445461a-bd10-4c54-86fe-70403913faf4"
       :canPost    false}}}
    :0x03769999b26cfd88788c882b56232fa5f58735795bdd463820d127b9a23d4415d4
    {:name "baz"
     :categories
     {:18444776-8720-4955-b200-a55758889f43 {:name "bar" :position 0}
      :093b4684-92f0-42dd-977c-05affc451ec8 {:name "foo" :position 1}}
     :members
     {:0x04a25e0a58ab97c6f93298ad87ab73aae5995365088a1ba150633ea25ff0b80cc1ddba18b122d117e477963cec1962219362a9a77226a6bf9a15d8d02c35fe8c9b
      {}
      :0x0490d2bb47388504e4b615052566e5830662bf202eb179251e9118587ce628c6c76e1f4550f9cd52058cf9dbdb5b788eea10b7c765cd7565675daa5f822acab8f4
      {}}}}
   :unknownCommunities nil})

;; A basic benchmark running in and Android emulator
(simple-benchmark
 [m payload]
 (rename-contract-community-keys m)
 1000)
;; => 10671ms (very slow)

;; Same style benchmark, in an Android emulator
;; Notice that I extracted functions to avoid reallocating them all the time, and I'm surely not covering all the possible key renames, so take with a grain of salt.
(let [community-renamer {:tokenPermissions            :token-permissions
                         :isMember                    :member?
                         :adminSettings               :admin-settings
                         :canDeleteMessageForEveryone :delete-message-for-everyone?}
      chat-renamer      {:categoryID :category-id
                         :canPost    :can-post?}
      update-chat       (fn [chat]
                          (set/rename-keys chat chat-renamer))
      update-community  (fn [community]
                          (-> community
                              (set/rename-keys community-renamer)
                              (update :chats update-vals update-chat)))]
  (simple-benchmark
   [m payload]
   (-> m
       (set/rename-keys {:contractCommunities         :contract-communities
                         :contractFeaturedCommunities :contract-featured-communities
                         :unknownCommunities          :unknown-communities})
       (update :communities update-vals update-community))
   1000))
;; => 203ms
flexsurfer commented 1 year ago
time kebab 7.015041000006022 ms
 time rematch 0.015291999996406958 ms
 time some 0.028582999992067926 ms
flexsurfer commented 1 year ago

sum

kebab 1245.5989960007137ms
rematch 3.590136999846436 ms
some 7.79713099921355 ms
flexsurfer commented 1 year ago

https://github.com/clj-commons/camel-snake-kebab/issues/1 , issue number 1 :)

flexsurfer commented 1 year ago

cc @ilmotta @vkjr

ilmotta commented 1 year ago

Thanks @flexsurfer. I don't quite understand these benchmarks you posted. Are you comparing functions in isolation?

What I tried to share previously is a working implementation to compare apples to apples to give a more meaningful comparison. What we should do if we had the time (I'm not doing this now) is actually profile the functions in the correct runtime (not like I did in the emulator).

I remember I had seen this memoization mention in the csk's website:

If you’re going to do case conversion in a hot spot, use core.memoize to avoid doing the same conversions over and over again.

But clojure.core.memoize does not support ClojureScript, so yeah, let's forget about it in critical code paths.

flexsurfer commented 1 year ago

I measured times in the app, on real data, so for QA on the device for each rename-contract-community-keys summary calls of (csk/->kebab-case-string s) will be ~1,5 sec

ilmotta commented 1 year ago

I measured times in the app, on real data, so for QA on the device for each rename-contract-community-keys summary calls of (csk/->kebab-case-string s) will take ~1,5 sec

Cool, I understand now your benchmarks. Thanks.