Closed stevesoltys closed 11 months ago
Awesome work! This looks a lot better than the previous attempt at fixing this. There's still some ktlint style complaints to fix. I'll also run this through manual tests before merging (as this is a rather critical code path).
I could reproduce the issue that this MR is fixing by installing more than 500 apps:
2023-11-14 11:17:54.932 1522-2762 BackupManagerService pid-1522 I Full restore; asking about 515 apps
2023-11-14 11:17:54.942 1522-2762 BackupManagerService pid-1522 D Starting restore.
2023-11-14 11:17:54.943 1522-2762 BackupRestoreController pid-1522 I System restore starting for user: 0
2023-11-14 11:17:54.948 10154-10154 Configurab...ortService com.stevesoltys.seedvault D Service created.
2023-11-14 11:17:54.953 10154-10154 Configurab...ortService com.stevesoltys.seedvault D Transport bound.
2023-11-14 11:17:54.956 1522-1522 TransportConnection pid-1522 I ConfigurableBackupTransportService#1453: Notifying [PerformUnifiedRestoreTask.startRestore()] transport = BackupTransportClient
2023-11-14 11:17:54.969 1522-2762 BpBinder pid-1522 W Large outgoing transaction of 826056 bytes, interface descriptor , code 14
2023-11-14 11:17:54.969 1522-2762 JavaBinder pid-1522 E !!! FAILED BINDER TRANSACTION !!! (parcel size = 826056)
2023-11-14 11:17:54.970 1522-2762 BackupManagerService pid-1522 E Unable to contact transport for restore: data parcel size 826056 bytes
2023-11-14 11:17:15.769 2762-2762 binder pid-2762 I 1522:2762 transaction failed 29201/-28, size 826056-8 line 3360
2023-11-14 11:17:54.971 10154-10211 RestoreCoordinator com.stevesoltys.seedvault D finishRestore
2023-11-14 11:17:54.972 1522-2762 BackupRestoreController pid-1522 I systemRestoreFinished for 0
2023-11-14 11:17:54.972 1522-2762 BackupRestoreController pid-1522 I maybeSendWidgetRestoreBroadcasts for 0
2023-11-14 11:17:54.972 1522-2762 BackupManagerService pid-1522 I Restore complete.
2023-11-14 11:17:54.974 1522-2762 BackupManagerService pid-1522 V [UserID:0] Released wakelock:*backup*-0-2762
2023-11-14 11:17:54.975 10154-10154 Configurab...ortService com.stevesoltys.seedvault D Service destroyed.
2023-11-14 11:17:55.054 1522-18904 RestoreSession pid-1522 D endRestoreSession
2023-11-14 11:17:55.054 1522-2762 BackupManagerService pid-1522 V [UserID:0] Clearing restore session and halting timeout
Something isn't working right here. I got some error in the beginning which showed an error message in the UI and unlocked the FINISH button, however probably due to the new code, the restore continued and was actually doing stuff.
Then this pattern keeps repeating over and over:
2023-11-14 12:37:33.459 14334-14349 RestoreCoordinator com.stevesoltys.seedvault I Start restore with [@pm@, com.android.rkpdapp, com.caydey.ffshare, com.gianlu.aria2android, com.android.traceur, org.woheller69.audio_analyzer_for_android]
...
2023-11-14 12:37:40.245 14334-14349 RestoreCoordinator com.stevesoltys.seedvault I Start restore with [@pm@, org.xcsoar, deltazero.amarok.foss, com.android.settings, com.android.htmlviewer]
Somehow for each of those it is getting all restore sets, skipping over the one it can't decrypt. Why is it doing a @pm@
restore for each set? Is this needed? What is causing it to re-load restore sets for each chunk?
The error I got in the beginning may be a bug unrelated to this PR, us starting the restore session twice:
2023-11-14 12:53:35.697 16771-16812 RestoreViewModel com.stevesoltys.seedvault D Start InstallResult Flow
2023-11-14 12:53:35.713 16771-16812 RestoreViewModel com.stevesoltys.seedvault D Starting new restore session to restore backup 1699970940126
2023-11-14 12:53:35.714 1522-1625 BackupManagerService pid-1522 V [UserID:0] beginRestoreSession: pkg=null transport=com.stevesoltys.seedvault.transport.ConfigurableBackupTransport
2023-11-14 12:53:35.716 16771-16813 RestoreViewModel com.stevesoltys.seedvault D Starting new restore session to restore backup 1699970940126
2023-11-14 12:53:35.717 1522-1560 BackupManagerService pid-1522 V [UserID:0] beginRestoreSession: pkg=null transport=com.stevesoltys.seedvault.transport.ConfigurableBackupTransport
2023-11-14 12:53:35.717 16771-16814 RestoreViewModel com.stevesoltys.seedvault D Completed InstallResult Flow
2023-11-14 12:53:35.751 16771-16771 Configurab...ortService com.stevesoltys.seedvault D Service created.
2023-11-14 12:53:35.753 16771-16771 Configurab...ortService com.stevesoltys.seedvault D Transport bound.
2023-11-14 12:53:35.757 1522-1625 BackupManagerService pid-1522 I [UserID:0] Restore session requested but one already active
2023-11-14 12:53:35.760 1522-1560 BackupManagerService pid-1522 V [UserID:0] Acquired wakelock:*backup*-0-2762
2023-11-14 12:53:35.760 16771-16812 RestoreViewModel com.stevesoltys.seedvault E Error starting new session
android.os.RemoteException: beginRestoreSessionForUser returned null
at com.stevesoltys.seedvault.restore.RestoreViewModel.getOrStartSession(RestoreViewModel.kt:132)
at com.stevesoltys.seedvault.restore.RestoreViewModel.startRestore(RestoreViewModel.kt:199)
at com.stevesoltys.seedvault.restore.RestoreViewModel.access$startRestore(RestoreViewModel.kt:74)
at com.stevesoltys.seedvault.restore.RestoreViewModel$onNextClickedAfterInstallingApps$1.invokeSuspend(RestoreViewModel.kt:183)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.internal.LimitedDispatcher.run(LimitedDispatcher.kt:42)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:95)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
For some reason, the backup metadata isn't cached, so in RestoreCoordinator#startRestore()
, we end up calling getAvailableMetadata()
for all chunks.
Given this and the fact that each chunks also does a @pm@
restore, we should probably increase the chunk size as much as can be considered safe.
The optimization for not having to re-fetch the backup metadata was to call restoreCoordinator.beforeStartRestore(restorableBackup.backupMetadata)
which is only done once in this PR, not before each start.
@stevesoltys are you still following? So action items I see:
beforeStartRestore()
before each startbeginRestoreSession()
twice.
Copy of #484 with review comment changes.