subconsciousnetwork / subconscious

Apache License 2.0
9 stars 0 forks source link

`Read corrupted data at file offset None backtrace ()` #693

Closed bfollington closed 1 year ago

bfollington commented 1 year ago

A mysterious error that spontaneously occurred during development. Appears that somehow the on-disk data has been corrupted. After profiling disk use in Instruments and examining the files on disk it seems like the app can read data as expected, but the data itself is malformed.

Full dump of the application data is attached, along with the (mostly unhelpful) logs:

dump.zip

2023-06-08 12:24:55.034032+1000 Subconscious[79520:12172927] [NoosphereService] init NoosphereService
2023-06-08 12:24:55.034122+1000 Subconscious[79520:12172927] [NoosphereService] Global storage URL: file:///Users/ben/Library/Developer/CoreSimulator/Devices/01619CFE-B226-41AB-9391-7B85DA80F25E/data/Containers/Data/Application/4F973352-825A-4D6F-90DF-243925B389A6/Library/Application%20Support/noosphere
2023-06-08 12:24:55.034185+1000 Subconscious[79520:12172927] [NoosphereService] Sphere storage URL: file:///Users/ben/Library/Developer/CoreSimulator/Devices/01619CFE-B226-41AB-9391-7B85DA80F25E/data/Containers/Data/Application/4F973352-825A-4D6F-90DF-243925B389A6/Library/Application%20Support/sphere
2023-06-08 12:24:55.034235+1000 Subconscious[79520:12172927] [NoosphereService] Gateway URL: https://bdyqnyx47jvtoggasueyhqgoz7dirnl3.sphere.test.subconscious.cloud/
2023-06-08 12:24:55.034267+1000 Subconscious[79520:12172927] [NoosphereService] Sphere identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.037745+1000 Subconscious[79520:12172927] [app] Set sphere ID: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.109682+1000 Subconscious[79520:12172927] [app] Documents: file:///Users/ben/Library/Developer/CoreSimulator/Devices/01619CFE-B226-41AB-9391-7B85DA80F25E/data/Containers/Data/Application/4F973352-825A-4D6F-90DF-243925B389A6/Documents/
2023-06-08 12:24:55.109752+1000 Subconscious[79520:12172927] [app] Sphere ID: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.109814+1000 Subconscious[79520:12172927] [app] Noosphere enabled? true
2023-06-08 12:24:55.109910+1000 Subconscious[79520:12172927] [app] Readying database
2023-06-08 12:24:55.110238+1000 Subconscious[79520:12173008] [NoosphereService] init Noosphere
2023-06-08 12:24:55.110641+1000 Subconscious[79520:12173008] [Noosphere] init
2023-06-08 12:24:55.110691+1000 Subconscious[79520:12173008] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.110898+1000 Subconscious[79520:12172927] [app] [action] appear
2023-06-08 12:24:55.111298+1000 Subconscious[79520:12172927] [notebook] [action] appear
2023-06-08 12:24:55.122141+1000 Subconscious[79520:12173013] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.128490+1000 Subconscious[79520:12172927] Metal GPU Frame Capture Enabled
2023-06-08 12:24:55.128629+1000 Subconscious[79520:12172927] Metal API Validation Enabled
2023-06-08 12:24:55.201825+1000 Subconscious[79520:12172927] [app] [action] notifyNoosphereEnabled(true)
2023-06-08 12:24:55.202190+1000 Subconscious[79520:12173008] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.202469+1000 Subconscious[79520:12172927] [notebook] [action] ready
2023-06-08 12:24:55.202523+1000 Subconscious[79520:12172927] [app] [action] succeedMigrateDatabase(1683191580)
2023-06-08 12:24:55.202563+1000 Subconscious[79520:12172927] [app] Database version: 1683191580
2023-06-08 12:24:55.202876+1000 Subconscious[79520:12172927] [app] Database last-known sphere state: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE @ bafyr4icusybbaadlekoej2nviimbl5sztwbx2p43bfsubbe6s4bu6pfesi
2023-06-08 12:24:55.202939+1000 Subconscious[79520:12172927] [app] Syncing with gateway: https://bdyqnyx47jvtoggasueyhqgoz7dirnl3.sphere.test.subconscious.cloud/
2023-06-08 12:24:55.203000+1000 Subconscious[79520:12172927] [app] File sync started
2023-06-08 12:24:55.203437+1000 Subconscious[79520:12172927] [app] [action] failFetchNicknameFromProfile("Foreign Error: Read corrupted data at file offset None backtrace ()")
2023-06-08 12:24:55.203474+1000 Subconscious[79520:12172927] [app] Failed to read nickname from profile: Foreign Error: Read corrupted data at file offset None backtrace ()
2023-06-08 12:24:55.203585+1000 Subconscious[79520:12172927] [app] [action] failRefreshSphereVersion("Foreign Error: Read corrupted data at file offset None backtrace ()")
2023-06-08 12:24:55.203618+1000 Subconscious[79520:12172927] [app] Failed to refresh sphere version: Foreign Error: Read corrupted data at file offset None backtrace ()
2023-06-08 12:24:55.204196+1000 Subconscious[79520:12172927] [notebook] [action] setEntryCount(5)
2023-06-08 12:24:55.207912+1000 Subconscious[79520:12172927] [app] [action] succeedSyncLocalFilesWithDatabase(...) 0 items
2023-06-08 12:24:55.208067+1000 Subconscious[79520:12172927] [notebook] [action] ready
2023-06-08 12:24:55.208149+1000 Subconscious[79520:12173008] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.208149+1000 Subconscious[79520:12172927] [app] File sync finished: []
2023-06-08 12:24:55.211186+1000 Subconscious[79520:12172927] [app] [action] scenePhaseChange(SwiftUI.ScenePhase.active)
2023-06-08 12:24:55.211833+1000 Subconscious[79520:12172927] [notebook] [action] setEntryCount(5)
2023-06-08 12:24:55.211921+1000 Subconscious[79520:12172927] [notebook] [action] search(setSuggestions(8 items))
Update NavigationAuthority bound path tried to update multiple times per frame.
2023-06-08 12:24:55.213008+1000 Subconscious[79520:12173013] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.214210+1000 Subconscious[79520:12172927] [app] [action] failSyncSphereWithGateway("Foreign Error: Read corrupted data at file offset None backtrace ()")
2023-06-08 12:24:55.214282+1000 Subconscious[79520:12172927] [app] Sphere failed to sync with gateway: Foreign Error: Read corrupted data at file offset None backtrace ()
2023-06-08 12:24:55.217999+1000 Subconscious[79520:12173018] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.222082+1000 Subconscious[79520:12172927] [notebook] [action] search(setSuggestions(8 items))
2023-06-08 12:24:55.223061+1000 Subconscious[79520:12173013] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.225256+1000 Subconscious[79520:12172927] [notebook] [action] setRecent(2 items)
2023-06-08 12:24:55.228054+1000 Subconscious[79520:12173008] [NoosphereService] init Sphere with identity: did:key:z6MkemLNbFXrN6vUgQJtDUdiow2BLfSHbwtP7nNVNrNpKnEE
2023-06-08 12:24:55.255472+1000 Subconscious[79520:12172927] [notebook] [action] setRecent(2 items)
2023-06-08 12:24:55.255602+1000 Subconscious[79520:12172927] [app] [action] failIndexOurSphere("Foreign Error: Read corrupted data at file offset None backtrace ()")
2023-06-08 12:24:55.255678+1000 Subconscious[79520:12172927] [app] Database failed to sync with sphere: Foreign Error: Read corrupted data at file offset None backtrace ()
cdata commented 1 year ago

Well, this is terrifying. I'll take a look soon, but just off the cuff this looks like it is an error originating in the bowels of our database dependency Sled.

bfollington commented 1 year ago

We should investigate whether we can trigger this from the app by rapidly spamming write, follow, unfollow, save, sync etc. operations... since that's the best guess I have for what triggered the instance of this I saw.

cdata commented 1 year ago

I accidentally reproduced this in a simulator:

2023-06-19 09:21:13.754066-0700 Subconscious[3444:47377278] [app] [action] failIndexOurSphere("Foreign Error: Read corrupted data at file offset None backtrace ()")
2023-06-19 09:21:13.754149-0700 Subconscious[3444:47377278] [app] msg="Failed to index our sphere to database" error="Foreign Error: Read corrupted data at file offset None backtrace ()"

Definitely seems like an urgent total-data-loss kind of bug...

cdata commented 1 year ago

Been digging a little deeper. I was able to reproduce this deliberately by rapidly adding follows and clicking "sync" multiple times. Logs: https://gist.githubusercontent.com/cdata/a3e1331636ddcaba5abd9275dcc8ff5e/raw/bfeb8e918705cbcaf64986bf3a2265ac9cd26d59/subconscious.log

cdata commented 1 year ago

A new lead: peering at the logs, I noticed that Subconscious appears to be re-initializing Noosphere in the midst of a lot of synchronization tasks.

This seems like a plausible surface area for database corruption.

Putting aside the fact that Noosphere should not be re-initialized this way, I will offer: we should make Noosphere more resilient to this case (that is, disposing of Noosphere should ensure that cleanup is performed).

That having been said: I think we need to do a pass in Subconscious to make sure that Noosphere constructs aren't being pathologically de-initialized and re-initialized during the lifetime of the user session.

gordonbrander commented 1 year ago

cdata: @gordon FYI I think Noosphere is being unnecessarily de-initialized and re-initialized during the normal course of app state change https://github.com/subconsciousnetwork/subconscious/issues/693#issuecomment-1607965007

cdata: It's usually preceded by a log suggesting the gateway URL has changed (even though it never changes), so that might be an incorrectly handled state change

gordon: Aha! Yes, we have to re-initialize Noosphere whenever the gateway URL changes. So we could be sending that message when we should not be.

bfollington commented 1 year ago

Closing this for now, I think we've fixed this but the future-proof solution is https://github.com/subconsciousnetwork/noosphere/issues/455