apollographql / apollo-kotlin

:rocket:  A strongly-typed, caching GraphQL client for the JVM, Android, and Kotlin multiplatform.
https://www.apollographql.com/docs/kotlin
MIT License
3.77k stars 656 forks source link

SQLiteBlobTooBigException: Row too big to fit into CursorWindow #4072

Open asimonigh opened 2 years ago

asimonigh commented 2 years ago

Since the migration from Apollo 2 .5.11 to Apollo 3.1.0, many users are getting SQLiteBlobTooBigException when executing queries. I didn't found any scenarios to reproduce the issue and users get the issue with several queries.

Here is the stack trace:

Caused by android.database.sqlite.SQLiteBlobTooBigException: Row too big to fit into CursorWindow requiredPos=0, totalRows=1
       at android.database.sqlite.SQLiteConnection.nativeExecuteForCursorWindow(SQLiteConnection.java)
       at android.database.sqlite.SQLiteConnection.executeForCursorWindow(SQLiteConnection.java:942)
       at android.database.sqlite.SQLiteSession.executeForCursorWindow(SQLiteSession.java:838)
       at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:62)
       at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:155)
       at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:142)
       at android.database.AbstractCursor.moveToPosition(AbstractCursor.java:232)
       at android.database.AbstractCursor.moveToNext(AbstractCursor.java:287)
       at com.squareup.sqldelight.android.AndroidCursor.next(AndroidSqliteDriver.kt:273)
       at com.squareup.sqldelight.Query.executeAsList(Query.kt:135)
       at com.apollographql.apollo3.cache.normalized.sql.internal.CacheQueriesHelpers.selectRecord(CacheQueriesHelpers.java:11)
       at com.apollographql.apollo3.cache.normalized.sql.SqlNormalizedCache.loadRecord(SqlNormalizedCache.kt:24)
       at com.apollographql.apollo3.cache.normalized.api.MemoryCache$loadRecord$1.invoke(MemoryCache.kt:47)
       at com.apollographql.apollo3.cache.normalized.api.MemoryCache$loadRecord$1.invoke(MemoryCache.kt:40)
       at com.apollographql.apollo3.cache.normalized.api.internal.CacheLock.lock(CacheLock.java:5)
       at com.apollographql.apollo3.cache.normalized.api.MemoryCache.loadRecord(MemoryCache.kt:40)
       at com.apollographql.apollo3.cache.normalized.api.MemoryCache.loadRecords(MemoryCache.kt:56)
       at com.apollographql.apollo3.cache.normalized.api.internal.OptimisticCache.loadRecords(OptimisticCache.kt:27)
       at com.apollographql.apollo3.cache.normalized.api.internal.CacheBatchReader.toMap(CacheBatchReader.java:93)
       at com.apollographql.apollo3.cache.normalized.api.OperationCacheExtensionsKt.readInternal(OperationCacheExtensionsKt.java:76)
       at com.apollographql.apollo3.cache.normalized.api.OperationCacheExtensionsKt.readDataFromCache(OperationCacheExtensionsKt.java:39)
       at com.apollographql.apollo3.cache.normalized.internal.DefaultApolloStore$readOperation$2.invoke(DefaultApolloStore.kt:106)
       at com.apollographql.apollo3.cache.normalized.internal.DefaultApolloStore$readOperation$2.invoke(DefaultApolloStore.kt:105)
       at com.apollographql.apollo3.cache.normalized.internal.Guard.readAccess(Guard.java:12)
       at com.apollographql.apollo3.cache.normalized.internal.DefaultApolloStore.readOperation(DefaultApolloStore.kt:105)
       at com.apollographql.apollo3.cache.normalized.internal.ApolloCacheInterceptor.readFromCache(ApolloCacheInterceptor.kt:181)
       at com.apollographql.apollo3.cache.normalized.internal.ApolloCacheInterceptor.access$readFromCache(ApolloCacheInterceptor.kt:31)
       at com.apollographql.apollo3.cache.normalized.internal.ApolloCacheInterceptor$interceptQuery$1.invokeSuspend(ApolloCacheInterceptor.kt:166)
       at com.apollographql.apollo3.cache.normalized.internal.ApolloCacheInterceptor$interceptQuery$1.invoke(ApolloCacheInterceptor.kt:1)
       at kotlinx.coroutines.flow.SafeFlow.collectSafely(SafeFlow.java:61)
       at kotlinx.coroutines.flow.AbstractFlow.collect(Flow.kt:227)
       at kotlinx.coroutines.flow.internal.ChannelFlowOperatorImpl.flowCollect(ChannelFlow.kt:195)
       at kotlinx.coroutines.flow.internal.ChannelFlowOperator.collectTo$suspendImpl(ChannelFlow.kt:157)
       at kotlinx.coroutines.flow.internal.ChannelFlowOperator.collectTo(ChannelFlow.kt:1)
       at kotlinx.coroutines.flow.internal.ChannelFlow$collectToFun$1.invokeSuspend(ChannelFlow.kt:60)
       at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
       at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
       at kotlinx.coroutines.internal.LimitedDispatcher.run(LimitedDispatcher.kt:39)
       at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:95)
       at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
BoD commented 2 years ago

Hi! 👋 Thank you for reporting this!

From what I am gathering online it's not 100% clear whether this can happen when individual rows are too big, or if it can also be related to the total size of the query. Does your app clears the SQL cache (with [clearAll()](https://apollographql.github.io/apollo-kotlin/kdoc/apollo-normalized-cache-sqlite/com.apollographql.apollo3.cache.normalized.sql/-sql-normalized-cache/clear-all.html?query=open%20fun%20clearAll())) at all?

Also, this doesn't address the root cause, but if you update to v3.2.1 or later (latest is 3.2.2), such exceptions will be caught and treated as a cache miss - which is of course not perfect but better than crashing. The exception will be logged by default but you can have a custom tracking by setting apolloExceptionHandler

asimonigh commented 2 years ago

Hi @BoD,

Thanks for your reply. My app clear the cache only on logout, do you advise to provide a way to clear the cache manually if user has problem or call it automatically for some cases ? I will update to the latest version and track this exception as a "non-fatal".

BoD commented 2 years ago

Clearing it at logout sounds good to me 👍. Even though I suspect most users don't logout often (well that depends on the app of course). The reason I asked is that right now there aren't ways to automatically remove old entries from the cache (that's on the roadmap) so in the current situation it could grow forever if it is never manually cleaned - but to be honest that would still not explain the exception you're seeing, it's just something to think about.

I will update to the latest version and track this exception as a "non-fatal".

👍

Mordred commented 1 year ago

@BoD For me the root cause of this error is QUERY_ROOT record. I run into this because we have few bigger mutations where this mutations accept very large inputs (e.g. base64 binary data). These inputs are then encoded as a keys for QUERY_ROOT.

e.g.:

2|QUERY_ROOT|{"me":"ApolloCacheReference{Me:2}","groups":["ApolloCacheReference{Group:2b638e24-5bc8-4762-a0cd-f51a3eeb5ff8}","ApolloCacheReference{Group:4394b841-e7d7-433b-befd-510ade7660d8}","ApolloCacheReference{Group:7d0f19ab-2738-4606-b893-1d538f818a19}","ApolloCacheReference{Group:6d189c6a-9f3c-482d-afdd-f82d2b395ed5}","ApolloCacheReference{Group:6635c996-9d4d-4a32-97b5-369fcd5d7eb0}","ApolloCacheReference{Group:753f4901-149e-4ef1-a05a-33878433832d}","ApolloCacheReference{Group:7071ea10-9ed4-4aa7-96b8-5de4fa379d86}","ApolloCacheReference{Group:37d890e5-53ed-4725-9a3f-57e3c4b9cfc9}","ApolloCacheReference{Group:156abfb5-0926-4a89-9a62-3546d08a278e}","ApolloCacheReference{Group:70f7c22d-4dab-4e0c-bd64-c61c55f50b1e}","ApolloCacheReference{Group:61e7c63c-fcd7-40a9-9d3d-2185439ac66f}","ApolloCacheReference{Group:ae133fc4-dd65-4c09-b011-c857602cfae1}","ApolloCacheReference{Group:4a6c9c29-6641-4cf9-9b13-123ba6ab3fba}","ApolloCacheReference{Group:61ca3d82-a623-4a57-b066-5908e4ecfbd0}","ApolloCacheReference{Group:a87eabd7-b2c1-40dc-ad39-80f55a4f2349}","ApolloCacheReference{Group:f95fe812-0c8b-4f11-b5aa-b84289fb35f0}","ApolloCacheReference{Group:65df374b-8ae7-48bb-b76e-a4bfa29b5ed7}","ApolloCacheReference{Group:8137854c-0ec1-4ad6-97d7-262b39551037}","ApolloCacheReference{Group:0a516d44-1475-41f2-9ff2-f0cbd892e2aa}","ApolloCacheReference{Group:cf21428a-3b33-41b4-8c74-2488c6c1ea6d}","ApolloCacheReference{Group:6bffdcb3-a2c3-4dfc-963b-881afd0f6704}","ApolloCacheReference{Group:b46de322-d65c-456e-aec6-fa4eec2df8d7}","ApolloCacheReference{Group:b5791639-d9bc-4b36-9735-4d56cfa33c68}","ApolloCacheReference{Group:ab562dd8-545a-4cbe-a1c5-0784ed53f16f}","ApolloCacheReference{Group:54908ee8-2a5d-4591-9eb4-f399fc8d40ce}","ApolloCacheReference{Group:c66778a2-2642-4d80-9eea-c97206a75dfa}","ApolloCacheReference{Group:567b30af-da7c-49d6-8733-d70a637d39da}","ApolloCacheReference{Group:5d7e229e-1624-49f8-abe9-c0ee503fd5ad}","ApolloCacheReference{Group:b59525e2-1ca6-4f68-9d54-c25f0f606c85}","ApolloCacheReference{Group:dcb5af5e-696a-4494-87a6-b31948e8fbdb}","ApolloCacheReference{Group:145ab338-72f5-45af-a05a-de757b7d5567}","ApolloCacheReference{Group:dcf5c3c4-f476-46bc-8366-e6e97fdad6c1}","ApolloCacheReference{Group:5066f06f-28e9-4cf0-8af5-f8e99b909541}"],"group({\"id\":\"dcf5c3c4-f476-46bc-8366-e6e97fdad6c1\"})":"ApolloCacheReference{Group:dcf5c3c4-f476-46bc-8366-e6e97fdad6c1}","group({\"id\":\"145ab338-72f5-45af-a05a-de757b7d5567\"})":"ApolloCacheReference{Group:145ab338-72f5-45af-a05a-de757b7d5567}","createGroupPhoto({\"input\":{\"data\":\"/9j/4T/+RXhpZgAATU0AKgAAAAgACwEPAAIAAAAGAAAAkgEQAAIAAAAHAAAAmAESAAMAAAABAAMAAAEaAAUAAAABAAAAoAEbAAUAAAABAAAAqAEoAAMAAAABAAIAAAExAAIAAAAGAAAAsAEyAAIAAAAUAAAAtgITAAMAAAABAAEAAIdpAAQAAAABAAAAyoglAAQAAAABAAAD3gAABRRBcHBsZQBpUGFkIDIAAAAAAEgAAAABAAAASAAAAAE5LjMuNQAyMDIzOjAxOjE4IDE3OjMyOjI5AAAfgpoABQAAAAEAAAJEgp0ABQAAAAEAAAJMiCIAAwAAAAEAAgAAiCcAAwAAAAECgAAAkAAABwAAAAQwMjIxkAMAAgAAABQAAAJUkAQAAgAAABQAAAJokQEABwAAAAQBAgMAkgEACgAAAAEAAAJ8kgIABQAAAAEAAAKEkgMACgAAAAEAAAKMkgQACgAAAAEAAAKUkgcAAwAAAAEABQAAkgkAAwAAAAEAIAAAkgoABQAAAAEAAAKcknwABwAAAPQAAAKkkpEAAgAAAAQ0NjgAkpIAAgAAAAQ0NjgAoAAABwAAAAQwMTAwoAEAAwAAAAEAAQAAoAIABAAAAAEAAALQoAMABAAAAAEAAALQohcAAwAAAAEAAgAAowEABwAAAAEBAAAApAIAAwAAAAEAAAAApAMAAwAAAAEAAAAApAUAAwAAAAEAOwAApAYAAwAAAAEAAAAApDIABQAAAAQAAAOYpDMAAgAAAAYAAAO4pDQAAgAAACAAAAO+AAAAAAAAAAEAAAAPAAAADAAAAAUyMDIzOjAxOjE4IDE3OjMyOjI5ADIwMjM6MDE6MTggMTc6MzI6MjkAAAAYwQAABlYAABLtAAAHfv//9rcAAAbgAAAAAAAAAAEAAADLAAAAZEFwcGxlIGlPUwAAAU1NAAgAAQAJAAAAAQAAAAQAAwAHAAAAaAAAAHQABAAJAAAAAQAAAAAABQAJAAAAAQAAAMgABgAJAAAAAQAAAHwABwAJAAAAAQAAAAEACAAKAAAAAwAAANwAFAAJAAAAAQAAAAEAAAAAYnBsaXN0MDDUAQIDBAUGBwhZdGltZXNjYWxlVXZhbHVlVWVwb2NoVWZsYWdzEjuaygATAAMIuj+yoXcQABABCBEbISctMjs9AAAAAAAAAQEAAAAAAAAACQAAAAAAAAAAAAAAAAAAAD8AAAccAAAHSwAAAUwAAOiJ///38QAANXEAAADLAAAAZAAAAMsAAABkAAAADAAAAAUAAAAMAAAABUFwcGxlAGlQYWQgMiBiYWNrIGNhbWVyYSAyLjAzbW0gZi8yLjQAAA8AAQACAAAAAk4AAAAAAgAFAAAAAwAABJgAAwACAAAAAkUAAAAABAAFAAAAAwAABLAABQABAAAAAQAAAAAABgAFAAAAAQAABMgABwAFAAAAAwAABNAADAACAAAAAksAAAAADQAFAAAAAQAABOgAEAACAAAAAlQAAAAAEQAFAAAAAQAABPAAFwACAAAAAlQAAAAAGAAFAAAAAQAABPgAHQACAAAACwAABQAAHwAFAAAAAQAABQwAAAAAAAAAMAAAAAEAAAAIAAAAAQAACQkAAABkAAAAEQAAAAEAAAAMAAAAAQAAEJgAAABkAAC1kwAAAVMAAAAQAAAAAQAAACAAAAABAAAG4wAAAGQAAAAAAAAAAQAAJK8AAAAeAAAkrwAAAB4yMDIzOjAxOjE4AAAAAABBAAAAAQAGAQMAAwAAAAEABgAAARoABQAAAAEAAAViARsABQAAAAEAAAVqASgAAwAAAAEAAgAAAgEABAAAAAEAAAVyAgIABAAAAAEAADNEAAAAAAAAAEgAAAABAAAASAAAAAH/2P/bAIQAAQEBAQEBAgEBAgMCAgIDBAMDAwMEBQQEBAQEBQYFBQUFBQUGBgYGBgYGBgcHBwcHBwgICAgICQkJCQkJCQkJCQEBAQECAgIEAgIECQYFBgkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJ/90ABAAK/8AAEQgAoACgAwEiAAIRAQMRAf/EAaIAAAEFAQEBAQEBAAAAAAAAAAABAgMEBQYHCAkKCxAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTV....

For now my fix was to set doNotStore(true) for these kind of mutations, but it is only a temporary solution. We need way how to auto expire QUERY_ROOT keys.

Btw: on Android this exception is handled as you wrote with cache miss .. but on iOS it will load to memory and cause OOMs errors.

BoD commented 1 year ago

Thanks a lot for the feedback @Mordred, that's very valuable. That would explain it indeed. I'm guessing caching these mutations probably doesn't make a lot of sense anyway?

Mordred commented 1 year ago

Yes, but this can be a problem for queries too .. e.g. query which accept filter and pagination cursor.

Is it possible to detect this and clear unused data automatically or manually?

BoD commented 1 year ago

@Mordred You can use ApolloStore.remove to remove a specific key manually.

For automatic detection, there is an experimental TrimmableNormalizedCache in the 'incubating' normalized cache artifacts. This is not ready for production, but if you'd like to give it a try, you can replace your apollo-normalized-cache-xyz dependencies by apollo-normalized-cache-xyz-incubating.

While I'm in the experimental/incubating area, let's also mention an improved support for pagination (several pages can be stored at a unique cache key) which you may also find interesting. Sorry there is no documentation on this yet but an example can be seen here.