firebase / firebase-js-sdk

Firebase Javascript SDK
https://firebase.google.com/docs/web/setup

Cache does not seem to be effective in tests #8025

Open opack opened 5 months ago

opack commented 5 months ago

Operating System

Windows 11

Browser Version

N/A

Firebase SDK Version

10.7.2

Firebase SDK Product:

Firestore

Describe your project's tooling

Vitest 1.2.2, Firebase Emulator

Describe the problem

Hi! The getDoc method always tries to retrieve up-to-date data from the server, and reads the cache only if offline. I need a method that tries the cache first. So I created a method named getDocPreferablyFromCache that tries the cache and falls back to the server if the cache does not contain the data. When testing this method with Vitest, I ran into a strange problem: the cache never seems to be populated... However, it seems to work fine in the browser.
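A cache-first lookup of the kind described above could be sketched as follows. This is not the poster's actual getDocPreferablyFromCache, which is not shown in the thread; the name preferCache and its two loader parameters are hypothetical. In a real project the loaders would presumably wrap getDocFromCache and getDocFromServer; they are injected here so the sketch runs without Firebase.

```javascript
// Hypothetical helper: try the cache loader first and fall back to the
// server loader if the cache read rejects (as getDocFromCache does on a miss).
async function preferCache(loadFromCache, loadFromServer) {
  try {
    return await loadFromCache();
  } catch (cacheError) {
    // Any cache failure is treated as a miss in this sketch.
    return await loadFromServer();
  }
}

// Usage with stand-in loaders instead of real Firestore calls.
async function demo() {
  const hit = await preferCache(
    async () => 'cached value',
    async () => 'server value'
  );
  const miss = await preferCache(
    async () => { throw new Error('Failed to get document from cache.'); },
    async () => 'server value'
  );
  return { hit, miss };
}
```

Treating every cache error as a miss keeps the sketch short; a stricter version might inspect the error before falling back.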

The doc is not clear about when the cache is filled with data, but I expect it happens when I issue a getDoc or getDocFromServer. But it does not seem to work that way.

Am I missing something or is there a bug?

Steps and code to reproduce issue

firestore.rules

rules_version = '2';

service cloud.firestore {
  match /databases/{database}/documents {
    match /tests/{document} {
      allow read, write: if true;
    }
  }
}

A test reproducing the issue:

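import { expect, test } from 'vitest'
import { initializeApp } from 'firebase/app'
import {
    connectFirestoreEmulator,
    doc,
    getDoc,
    getDocFromCache,
    getDocFromServer,
    initializeFirestore,
    memoryLocalCache,
    setDoc
} from 'firebase/firestore'

// The PUBLIC_FIREBASE_* constants are assumed to be defined elsewhere in the project.
test('getDoc from various sources', async () => {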
    const app = initializeApp({
        apiKey: PUBLIC_FIREBASE_API_KEY,
        authDomain: PUBLIC_FIREBASE_AUTH_DOMAIN,
        projectId: PUBLIC_FIREBASE_PROJECT_ID,
        storageBucket: PUBLIC_FIREBASE_STORAGE_BUCKET,
        messagingSenderId: PUBLIC_FIREBASE_MESSAGING_SENDER_ID,
        appId: PUBLIC_FIREBASE_APP_ID
    })
    const db = initializeFirestore(app, { localCache: memoryLocalCache() })
    connectFirestoreEmulator(db, '127.0.0.1', 8080)

    const docPath = 'tests/doc'

    await setDoc(doc(db, docPath), {
        anything: 'is great'
    })

    const fromAnywhere = await getDoc(doc(db, docPath))
    console.log('fromAnywhere was from cache?', fromAnywhere.metadata.fromCache)
    expect(fromAnywhere.metadata.fromCache).toBe(false)

    const fromServer = await getDocFromServer(doc(db, docPath))
    console.log('fromServer was from cache?', fromServer.metadata.fromCache)
    expect(fromServer.metadata.fromCache).toBe(false)

    const fromCache = await getDocFromCache(doc(db, docPath))
    console.log('fromCache was from cache?', fromCache.metadata.fromCache)
    expect(fromCache.metadata.fromCache).toBe(true)
})

The output in the console:

fromAnywhere was from cache? false
fromServer was from cache? false

FirebaseError: Failed to get document from cache. (However, this document may exist on the server. Run again without setting 'source' in the GetOptions to attempt to retrieve the document from the server.)

I thought that maybe the cache was being filled only when the data was accessed often (as some sentences in the doc suggested), so I put a loop around the getDoc, but even with 1000 operations, nothing changes. I also went into the source code to try to understand how the SDK works, but I did not manage to see where the cache is written (I only found cache reads).

MarkDuckworth commented 5 months ago

In the following code, the SDK is being initialized with memory local cache using eager garbage collection.

const db = initializeFirestore(app, { localCache: memoryLocalCache() })

Memory local cache means cached documents are stored in memory rather than on disk (the latter is persistent local cache). Eager garbage collection is the default for memory local cache, which means that documents are cleared from the cache when the SDK no longer has an open query for them. In other words, with eager garbage collection, the document is removed from the cache immediately after a call to getDoc(...), because the query is complete. By contrast, the document is held in the cache while there is an active onSnapshot(...) listener that queries for the doc.
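The effect can be illustrated with a toy model. This is only a sketch of the behavior described above, not the SDK's implementation; the class EagerToyCache and its methods are hypothetical, with listen standing in for onSnapshot and getOnce standing in for getDoc.

```javascript
// Toy model only: a cache that keeps a document while at least one
// listener (standing in for an open query) references it.
class EagerToyCache {
  constructor() {
    this.docs = new Map();      // path -> data
    this.listeners = new Map(); // path -> number of active listeners
  }

  // Models onSnapshot: holds the document until the returned function is called.
  listen(path, data) {
    this.docs.set(path, data);
    this.listeners.set(path, (this.listeners.get(path) ?? 0) + 1);
    return () => this.release(path);
  }

  // Models getDoc: the query opens and completes at once, so nothing is retained.
  getOnce(path, data) {
    const release = this.listen(path, data);
    release();
    return data;
  }

  release(path) {
    const remaining = (this.listeners.get(path) ?? 1) - 1;
    if (remaining <= 0) {
      this.listeners.delete(path);
      this.docs.delete(path); // eager collection: evict as soon as unreferenced
    } else {
      this.listeners.set(path, remaining);
    }
  }

  has(path) {
    return this.docs.has(path);
  }
}

const cache = new EagerToyCache();
cache.getOnce('tests/doc', { anything: 'is great' });
const afterGetOnce = cache.has('tests/doc');       // false: evicted right away

const unsubscribe = cache.listen('tests/doc', { anything: 'is great' });
const whileListening = cache.has('tests/doc');     // true: a listener holds it
unsubscribe();
const afterUnsubscribe = cache.has('tests/doc');   // false: last listener gone
```

In this model, a one-shot read followed by a cache-only read always misses, which matches the error reported in the test.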

To change the garbage collection behavior for memoryLocalCache, you can pass settings. It should look like this.

const db = initializeFirestore(app, { localCache: memoryLocalCache({garbageCollector: memoryLruGarbageCollector()}) })

The LRU garbage collector will cause documents to stay in cache until the cache fills to the storage limit (default 40MB), even if there is not an active snapshot listener. Because it's memory persistence, the cache will still clear when the application reloads.
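For contrast, here is a toy model of least-recently-used eviction. It is a sketch under simplifying assumptions and not the SDK's implementation: the class LruToyCache is hypothetical, and it counts entries, whereas the SDK limit described above is a storage size.

```javascript
// Toy model only: entries survive without listeners and are evicted
// least-recently-used first once a size budget is exceeded.
class LruToyCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries; // stand-in for the SDK's byte-based limit
    this.docs = new Map();        // Map iterates in insertion order
  }

  put(path, data) {
    this.docs.delete(path);       // re-inserting moves the entry to the newest position
    this.docs.set(path, data);
    while (this.docs.size > this.maxEntries) {
      const oldest = this.docs.keys().next().value;
      this.docs.delete(oldest);
    }
  }

  get(path) {
    if (!this.docs.has(path)) return undefined;
    const data = this.docs.get(path);
    this.put(path, data);         // a read counts as a use
    return data;
  }
}

const lru = new LruToyCache(2);
lru.put('tests/a', 1);
lru.put('tests/b', 2);
lru.get('tests/a');               // 'tests/a' is now the most recently used
lru.put('tests/c', 3);            // over budget: evicts 'tests/b'
const keys = [...lru.docs.keys()]; // ['tests/a', 'tests/c']
```

Because the model lives in a plain object, creating a new LruToyCache starts empty, much as the in-memory cache starts empty after an application reload.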

opack commented 4 months ago

Thanks for the clarification! I do not have access to my project right now, so I'll have to wait until Tuesday to check this out, but I'll definitely try this! Is there a doc that I missed where this behavior is explained? Maybe it contains more info that I should read on this matter 😉

MarkDuckworth commented 4 months ago

The relevant API reference docs are memoryLocalCache and memoryLruGarbageCollector.

The ability to configure LRU garbage collection for memory persistence is relatively new, so I'm going to look into updating this page about Accessing data offline.

opack commented 4 months ago

Many thanks for the solution and explanations! 🙏

I don't know if you'd rather leave this issue open to keep track of the doc update, so I won't close it, but feel free to close it if you like 😉