Closed mmsh85 closed 3 years ago
Can you give some more background information in German would be also ok. Because we cannot reproduce this. Was the Risk Status Green or grey after the restart. What kind of restart you did. Can you rescan the QR code or is the code now invalid. Did you scan multiple QR code or remove tests?
Was the Risk Status Green or grey after the restart. After restart the default screen was back, which is appearing now as well. See attachment. When the scan was done though was just after downloading and before activating BT so I assume it was still grey as I activated BT afterwards.
What kind of restart you did. Via showing the running apps then swipe up CWA.
Can you rescan the QR code or is the code now invalid. Code is now invalid.
Did you scan multiple QR code or remove tests? No, only this one and cannot rescan it as stated above. Issue is that neither a test result is available nor rescanning is available as the QR code is used. It should be in the phone somehow, but not sure why it got disappeared
One other note: Phone had no sim card as clear in the screenshot
An you please explain in detail the Flow of Scanning and activating BT and what Color the Risk Score Card has
Hi Thomas,
I tried to replicate the process also on a second phone (surely the QR code scan part will not be possible):
Hope this explains it all.
You can also try to reproduce this with a test qr working
Test2:
Wasn't able to reproduce unfortunately, but at least I can recall the next steps now.
Note: Each QR code was scanned on a different phone, the second was scanned on the phone that reported the issue.
I can understand if not reproduced the case would be closed accordingly though.
Hi @mmsh85 ,
I assume that the swipe up to close the app could be the critical point, and that it might not be reproducable always. Could you please re-open the issue that we can track it further from here, please? Thanks for your great contribution!
@d4rken FYI: editor.apply vs. editor.commit?
@vaubaehn, @d4rken Ropened upon request.
If needed to support differently let me know too. I assume that unfortunately the first "correct" TAN has been sent, but now lost, if there is a way to confirm this at least then this would help, I assume :)
@mmsh85 Thanks for re-opening. To give a more detailed insight to what I assume what could have happened:
Let's see, what the devs will find out :)
.. and with a lost token; when the app restarts it detects no result. I would second that logical sequence, but why did it work with the test QR codes - timings then?
Issue is that irrespective to this, now I will wait till Monday to see the result status of my son :( Will be willing to support anyhow and thanks for the explanation!
My assumption is, that the swipe up won't always kill the app before data are written. It would depend on how fast you are (cowboy 😉), and how fast is Android (kind of a shoot out...) If the device is slow, or Android was particularly busy with other tasks at that moment, chances are better, that task is killed before data could be written. This will make reproducing/debugging quite hard. There were 2 other users (https://github.com/corona-warn-app/cwa-app-android/issues/642#issuecomment-691938723 https://github.com/corona-warn-app/cwa-app-android/issues/579#issuecomment-674405478) who were in a similar logic, but different from you completely crashing the encrypted storage.
I hope everything will turn out fine with your son! Stay safe and healthy, all the best!
@vaubaehn; thanks a-lot 😊
Hello @vaubaehn, thanks a ton for all the work into finding the root cause for this bug!!
I personally have not heard of these kind of race conditions happening when calling commit()
on shared preferences which is what we do on all of our interactions with LocalData.
Can you confirm that you have seen this behaviour in those cases as well?
We are currently working on refactoring big parts of the submission process and hope that if the issue is caused by code on our side these refactorings will fix the underlying issue. But if not, or if you can confirm this also happens with commit()
calls we can investigate the shared prefs again.
Hi @kolyaopahle ,
thanks a ton for all the work into finding the root cause for this bug!!
Sure, happy if I can help :heart:
I personally have not heard of these kind of race conditions happening when calling
commit()
on shared preferences
I would agree, that a race condition is quite unlikely when editing (Encrypted)SharedPreferences with edit.commit()
.
However, edit.apply()
is recommended as best practice, leaving it to Android to store the data when it thinks it's a good time to do. I assume edit.apply()
to be prone to a race condition, if the child task is killed before the data could have been written.
which is what we do on all of our interactions with LocalData.
I'm not sure if this is quite correct, that you use edit.commit()
on all interactions with EncryptedSharedPrefs. One (time) critical example is how you store the sqlite-db-password:
https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/util/security/SecurityHelper.kt#L90-L96
I must admit that I don't know how data for all other interactions are written. For me it looks like, with the declaration of
https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/util/security/EncryptedPreferencesFactory.kt#L22-L28
it's up to tink on how to store data at any time? I couldn't find any hint, that write interactions with EncryptedSharedPrefs are using edit.commit()
. If it's actually left to tink which method to use for storing, I assume it might be edit.apply()
, as this is generally recommended as best practice (but might be too slow in our case).
The only code fragment I found, that explicitly uses edit.commit()
, is clearing the SharedPrefs:
https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/util/security/SecurityHelper.kt#L82-L84
If there is any declaration in the code to use edit.commit()
in general for interactions with EncryptedSharedPreferences, I'd be happy, if you could point me to there.
Any supporting information from you on this subject would be highly appreciated for my own understanding and learning.
Can you confirm that you have seen this behaviour in those cases as well?
Unfortunately I don't have any development environment or any other testing setup, I can just read user reports, investigate in the code, and try to connect them logically.
We are currently working on refactoring big parts of the submission process and hope that if the issue is caused by code on our side these refactorings will fix the underlying issue.
This sounds great :)
But if not, or if you can confirm this also happens with
commit()
calls we can investigate the shared prefs again.
Let's see, just be sure that the usage of the globalEncryptedSharedPrefs-instance allows for submitting commits() by default. Fingers crossed for the refactoring :crossed_fingers: , and highest respect to the work of you all :heart:
Slightly different subject:
In a similar way I am wondering, how this works with the sqlite-db: Usually changes to the db (like updates to stored values) are "cached" or journalled, until the sqlite transaction receives a commit
. Just for my own understanding, is a commit
being sent, when changes/updates of CWA's db are 'applied', e. g. here:
https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/storage/tracing/TracingIntervalRepository.kt#L54-L61
Or how is generally ensured for CWA that any sqlite transaction completes fully (updates are comitted) to avoid any race condition?
Thank you, and cheers, V.
Hi @vaubaehn
If there is any declaration in the code to use edit.commit() in general for interactions with EncryptedSharedPreferences, I'd be happy, if you could point me to there.
we use the LocalData Object for most of our in-app persistence, https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/storage/LocalData.kt#L1-L733
in it we use the androidx.core.content.edit extension function to SharedPreferences
and pass true
to enable commit edit mode.
I'm not sure if this is quite correct, that you use edit.commit() on all interactions with EncryptedSharedPrefs. One (time) critical example is how you store the sqlite-db-password:
This actually does look quite interesting in relation to the known bug with sql database corruption, ill bring this up in our team meeting tomorrow and will update here if anyone has checked this already.
In a similar way I am wondering, how this works with the sqlite-db: Usually changes to the db (like updates to stored values) are "cached" or journalled, until the sqlite transaction receives a commit. Just for my own understanding, is a commit being sent, when changes/updates of CWA's db are 'applied', e. g. here:
I'm actually not quite sure about this one as i have not touched the sql part of the data persistency layer, from a quick glance on my part id assume that this is handled by the androidx room library. Running a quick google search i could not actually find any information on the persistence behaviour of room but i will also update after asking around in the team tomorrow.
Hi @kolyaopahle ,
in it we use the androidx.core.content.edit extension function to
SharedPreferences
and passtrue
to enable commit edit mode.
thanks a lot, I got it now!
In this case, should
https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/storage/LocalData.kt#L717-L733
line 729 be getSharedPreferenceInstance().edit(true) {...}
for consistency?
Anyway, if everything is persisted by commit()
, then the likelihood for a race condition should be minimized here. But still it's not impossible and maybe most plausible, if the device was slow due to background work (cleaning up and releasing ressources after scanning QR?) and/or because of a slow androidx.security.crypto/tink?
This actually does look quite interesting in relation to the known bug with sql database corruption, ill bring this up in our team meeting tomorrow and will update here if anyone has checked this already.
Looking forward for your update here.
I'm actually not quite sure about this one as i have not touched the sql part of the data persistency layer, from a quick glance on my part id assume that this is handled by the androidx room library. Running a quick google search i could not actually find any information on the persistence behaviour of room but i will also update after asking around in the team tomorrow.
Also looking forward for the update here. I wonder, if the sqlite-db is more at risk to be corrupted in general, for me it looks like #642 (comment) and #579 (comment) point into that direction.
Hey @vaubaehn,
line 729 be getSharedPreferenceInstance().edit(true) {...} for consistency?
absolutely, will change this, or if you want you can open a pr for it 👍
Anyway, if everything is persisted by commit(), then the likelihood for a race condition should be minimized here. But still it's not impossible and maybe most plausible, if the device was slow due to background work (cleaning up and releasing ressources after scanning QR?) and/or because of a slow androidx.security.crypto/tink?
From my investigation into the code there is no way to have seen the test result screen without https://github.com/corona-warn-app/cwa-app-android/blob/0f12f4ddbd814ac368901858363f7835bb30b9a8/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/storage/LocalData.kt#L559-L569 having been called. But as the code responsible for this logic is also targeted in the refactoring we can hopefully fix any race conditions appearing there.
Looking forward for your update here. Also looking forward for the update here. I wonder, if the sqlite-db is more at risk to be corrupted in general, for me it looks like #642 (comment) and #579 (comment) point into that direction.
we did some talking and identified a few more spots that might cause a race condition on App initialisation that could harm db password generation and persistence. We will continue to investigate those and i will update here and in the related tickets when anything of note comes up.
Hi @kolyaopahle ,
line 729 be getSharedPreferenceInstance().edit(true) {...} for consistency?
absolutely, will change this, or if you want you can open a pr for it +1
Sure I want to open a PR for it :) Can I base on release/1.5.x
?
Anyway, if everything is persisted by commit(), then the likelihood for a race condition should be minimized here. [...]
From my investigation into the code there is no way to have seen the test result screen without [
fun registrationToken()
] having been called. But as the code responsible for this logic is also targeted in the refactoring we can hopefully fix any race conditions appearing there.
Then it's quite mysterious what happened here... :ghost: Let's see then.
Looking forward for your update here.[...] we did some talking and identified a few more spots that might cause a race condition on App initialisation that could harm db password generation and persistence. We will continue to investigate those and i will update here and in the related tickets when anything of note comes up.
Sounds good. We'll all be happy to see increased stability in the future :) And getting us updated would be perfect! Thanks a lot!
Thank you, everyone, for your contribution here.
Looking at the comments, I would assume that the issue is resolved by the pull requests targetted to 1.6. However, I am not sure if we need a separate Jira Issue or if there are still open points in this Issue.
@d4rken what do you suggest?
@svengabr I don't think #1452 fixes this issue, it was only a tangentially related bug that was discovered while researching this.
Based on the discussion between @kolyaopahle and @vaubaehn we also spawned EXPOSUREAPP-3385 to investigate a possible racecondition on SharedPreferences and Database initialization, but this also does not scream "hey i'm the bug cause" to me, so I would keep this ticket open until we have a plausible explanation for what's happening here.
If it would help to test again as a test user with a couple of trials, I am willing to help as well. I would need a way to generate (or be provided) test QR codes though
@mmsh85
I would need a way to generate (or be provided) test QR codes though
Here you go:
Test 1
Test 2
If you need more, just generate QR codes with the following content: https://localhost/?133785-12245668-1234-4DA7-B166-B86D8737xxxx
where you can replace the xxxx
by any 4 digit number you like…
(you can actually change more than that but I haven't really tested everything and this should anyway give you enough codes you can use for testing).
Internal Tracking ID: EXPOSUREAPP-2126
Version 1.12.0, which has now been released, includes the fix "Small LocalData cleanup (DEV, EXPOSUREAPP-2126)" #2183 which has been linked to this issue.
It seems it was not easy to reproduce the issue, so it is probably also not easy to test if the fix has resolved the original issue.
Version 1.12.0, which has now been released, includes the fix "Small LocalData cleanup (DEV, EXPOSUREAPP-2126)" #2183 which has been linked to this issue.
Unfortunately, I don't expect the change to have fixed this. It was just something I noticed while searching for a cause.
@d4rken
Unfortunately, I don't expect the change to have fixed this. It was just something I noticed while searching for a cause.
Thanks for clarifying! I was not assuming that the root cause had been fixed.
It seems only @mmsh85 has reported this problem here and that was four months ago. Have there been any other reports since then via other channels?
It seems only @mmsh85 has reported this problem here and that was four months ago. Have there been any other reports since then via other channels?
Issue was reported in version 1.3.1 actually. I tried to reproduce the issue starting 1.5 with the dummy QR codes provided, but wasn't successful since then. Hope this helps!
I will close the topic unless commented that it should still be kept (for other tracking reasons).
@mmsh85 Many thanks for contributing here. Should any further problem occur please re-open or create a new issue, as appropriate. Thanks, best wishes, DS
Corona-Warn-App Open Source Team
Good Morning,
I first liked the issue to the QR code scanning and response problem; but I consider this issue is new and not the same.
Describe the bug
Result status disappeared from the app after restarting it. My son got tested on Thursday 15.10 We installed the app on his phone and scanned the QR code as soon as we returned.
OK -> After the scan; the first time the app showed that the "result pending" NOK ->After restarting; the scanned QR disappeared (app shows "Have you been tested?" screen)
Expected behaviour
After restarting; the scanned tan still should be appearing in the app showing the test status
Steps to reproduce the issue
Cannot reproduce as scanning only available once, but the procedure was:
Technical details
Additional context
Internal Tracking ID: EXPOSUREAPP-2126