Closed henryteng07 closed 1 month ago
When installing the extension, you'll find the option to specify which fields are synced from Firestore to Typesense in the extension configuration.
See the 2nd parameter here: https://github.com/typesense/firestore-typesense-search/blob/master/assets/extension_configuration_example.png
I tried that first but indexOnWrite still seems to be running when voteCount is incremented. Do I create a new extension for it to work?
I also set index to false in my Typesense schema
Oh I see what you mean now. indexOnWrite will always get triggered by Firebase, since it's a document change listener trigger. But within the trigger function, ~we only make an API call to Typesense to sync the data, if the fields you've specified in the extension configuration have changed.~ See below.
I don't think there's a way to set up field-level change triggers in Firebase.
Thanks for clarifying! My goal is to only sync Typesense data when certain fields have changed. That, as you mentioned, can be done in the extension config.
P.S. Do I still leave voteCount in the database schema if it's not in the extension config fields?
No harm in leaving voteCount in the schema since it's an optional field, but you might as well remove it if you're not sending that data over to Typesense.
Sorry to bother you again but why is the doc still upserted to Typesense even though the updated field "voteCount" isn't in the extension config?
My bad, I misspoke earlier. I just refreshed my memory by looking at the code and confirmed this: the upsert will always happen in the extension, but the actual logic of whether a field should be re-indexed or not lives inside the Typesense server, not in the extension. This means that if Typesense receives a document to update with the same values that it already has, it will just ignore that update.
But the extension itself will always make the API call. What the fields configuration in the extension does is, it lets you pick if you want to send the full document in the update API call, or just a subset of fields from your document, regardless of whether they've changed or not.
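To make the effect of the fields setting concrete, here is a minimal sketch. It is not the extension's actual code: `pickFields` is a hypothetical helper, and treating an empty field list as "send the whole document" is an assumption. The point is only that the payload is built from the configured list, with no check on which fields changed.

```javascript
// Hypothetical helper, not the extension's real implementation.
// Builds the payload from a configured field list, regardless of what changed.
function pickFields(firestoreDoc, id, fields) {
  // Assumption: an empty field list means "send the whole document".
  if (fields.length === 0) {
    return { id, ...firestoreDoc };
  }
  const payload = { id };
  for (const field of fields) {
    if (field in firestoreDoc) {
      payload[field] = firestoreDoc[field];
    }
  }
  return payload;
}

const post = { title: "Hello", body: "First post", voteCount: 12 };
const payload = pickFields(post, "post-1", ["title", "body"]);
console.log(payload); // contains id, title and body, but no voteCount
```

So a change to voteCount alone still produces a payload, just one that does not contain voteCount.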
Thanks again! So I'll be charged a Firebase function invocation regardless but the API call will only update the Typesense document if the fields in the config are updated.
I'm still unsure whether to include voteCount in Typesense. I want to sort search results by "most popular", but 1000 votes on a post would equal 1000 Typesense doc updates on top of 1000 Firestore update requests. Are there any solutions you'd recommend? Perhaps a way to batch updates?
You could exclude the voteCount from the official extension, but write a separate scheduled cloud function yourself that periodically looks at all records updated since the last time the scheduled function ran, and then bulk updates all the changed records where voteCount changed in a single API call.
I managed to get the cloud function up and running. But when I check the doc inside Typesense, it doesn't have the voteCount field nor the rest of my postData. I've tried upsert, update, and even emplace. The doc only has fields I've mentioned inside the extension.
```javascript
exports.scheduledTypesenseUpdate = functions.pubsub
  .schedule("every 5 minutes")
  .onRun(async (context) => {
    // get last run timestamp
    const lastRunDoc = await admin
      .firestore()
      .collection("metadata")
      .doc("lastRun")
      .get();
    const lastRun = lastRunDoc.exists
      ? lastRunDoc.data().timestamp
      : admin.firestore.Timestamp.now();

    // get posts updated since last run
    const postsSnapshot = await admin
      .firestore()
      .collection("posts")
      .where("updatedAt", ">", lastRun)
      .get();

    // for each post, check if voteCount has changed
    // if it has, add it to the list of posts to update in Typesense
    const postsToUpdate = [];
    postsSnapshot.forEach((doc) => {
      const postData = doc.data();
      if (postData.voteCount !== postData.voteCountInTypesense) {
        postsToUpdate.push({
          id: doc.id,
          ...postData,
          voteCountInTypesense: postData.voteCount,
          createdAt: postData.createdAt.toMillis(), // convert Timestamp to int64
          updatedAt: postData.updatedAt.toMillis(), // convert Timestamp to int64
        });
      }
    });
    console.log("Posts to update:", postsToUpdate.length);

    // update posts in Typesense
    if (postsToUpdate.length > 0) {
      try {
        await typesenseClient
          .collections("posts")
          .documents()
          .import(postsToUpdate, { action: "update" });
      } catch (error) {
        console.error("Error updating posts in Typesense! :", error);
      }

      // update voteCountInTypesense in Firestore for each updated post
      const batch = admin.firestore().batch();
      postsToUpdate.forEach((post) => {
        const postRef = admin.firestore().collection("posts").doc(post.id);
        batch.update(postRef, { voteCountInTypesense: post.voteCount });
      });
      await batch.commit();
    }

    // update last run timestamp
    await admin
      .firestore()
      .collection("metadata")
      .doc("lastRun")
      .set({ timestamp: admin.firestore.Timestamp.now() });
  });
```
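One variation that might be worth trying, as a rough sketch rather than a tested fix: since the import uses the update action, each entry only needs the id plus the field that changed, instead of the whole post spread in. `toVoteCountUpdates` and `failedImports` are hypothetical names, and the per-document `success` flag is an assumption about the shape of the import response that should be checked against the client version in use.

```javascript
// Sketch only: build minimal partial documents for an { action: "update" } import.
// Each entry carries just the id and the changed voteCount.
function toVoteCountUpdates(posts) {
  return posts
    .filter((post) => post.voteCount !== post.voteCountInTypesense)
    .map((post) => ({ id: post.id, voteCount: post.voteCount }));
}

// Assumed response shape: one { success: boolean, ... } entry per document.
// Returns the entries that did not succeed, so they can be logged or retried.
function failedImports(results) {
  return results.filter((result) => result.success !== true);
}

const updates = toVoteCountUpdates([
  { id: "a", title: "One", voteCount: 3, voteCountInTypesense: 3 },
  { id: "b", title: "Two", voteCount: 7, voteCountInTypesense: 5 },
]);
console.log(updates); // only post "b", with id and voteCount

const failures = failedImports([
  { success: true },
  { success: false, error: "example error" },
]);
console.log(failures.length);
```

With something like this, the Firestore write-back of voteCountInTypesense could be limited to the documents that did not show up in the failures list.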
Did you add the voteCount field back to the Typesense Collection schema after removing it here: https://github.com/typesense/firestore-typesense-search/issues/79#issuecomment-2123393924
I found out what's causing the issue!
I'm updating my Firestore doc in my cloud function (voteCountInTypesense field). That triggers the plugin, which updates only the mentioned fields and removes the rest, undoing the work of the cloud function.
But that doesn't make sense since you mentioned the API will only update if changes are made to the fields mentioned in the plugin config.
After some testing, I'm pretty sure the plugin is updating the Typesense doc even if the field that's updated in Firestore was not mentioned in the plugin.
This means every time a user updates voteCount, Typesense resets to only the plugin fields.
You're right - the plugin does an upsert (not an update), which requires the full document to be sent, and any fields not specified in the upsert will be removed from the document.
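The difference between the two write modes can be modelled with plain objects. This is only an illustration of the semantics described above, not Typesense code; the `upsert` and `update` functions and the in-memory `store` are stand-ins.

```javascript
// Plain-object model of the two write modes (illustration only).
function upsert(store, doc) {
  // Upsert replaces the stored document wholesale.
  store[doc.id] = { ...doc };
}

function update(store, partial) {
  // Update merges the given fields into the existing document.
  store[partial.id] = { ...store[partial.id], ...partial };
}

const store = {};
upsert(store, { id: "p1", title: "Hello", voteCount: 0 });

// The scheduled function sends a partial update with the new count.
update(store, { id: "p1", voteCount: 5 });
const afterUpdate = { ...store.p1 };

// The extension then re-sends only its configured fields as an upsert.
upsert(store, { id: "p1", title: "Hello" });
const afterUpsert = { ...store.p1 };

console.log(afterUpdate); // title and voteCount are both present
console.log(afterUpsert); // voteCount is gone
```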
At this point, since you have a custom function going, I think it might be best to write your own function handler that also sends updates (instead of upserts), instead of using this extension.
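A minimal sketch of what the decision logic in such a handler might look like, under stated assumptions: `planSync` is a hypothetical helper, values are compared with a simple JSON comparison, and field removals are ignored. The Firebase and Typesense wiring is shown only in comments and is untested.

```javascript
// Hypothetical decision helper for a custom onWrite handler.
// `before` and `after` are plain objects, or null when the document does not exist.
function planSync(id, before, after, fields) {
  if (after === null) {
    return { action: "delete", id };
  }
  if (before === null) {
    // New document: send all configured fields that are present.
    const document = { id };
    for (const field of fields) {
      if (field in after) document[field] = after[field];
    }
    return { action: "create", document };
  }
  // Existing document: collect only configured fields whose value changed.
  // Simplification: fields removed from the document are ignored here.
  const document = { id };
  let hasChange = false;
  for (const field of fields) {
    if (field in after &&
        JSON.stringify(before[field]) !== JSON.stringify(after[field])) {
      document[field] = after[field];
      hasChange = true;
    }
  }
  return hasChange ? { action: "update", document } : { action: "skip", id };
}

// Possible wiring inside a Cloud Function (untested sketch):
//   functions.firestore.document("posts/{postId}").onWrite(async (change, context) => {
//     const plan = planSync(
//       context.params.postId,
//       change.before.exists ? change.before.data() : null,
//       change.after.exists ? change.after.data() : null,
//       ["title", "body"]
//     );
//     if (plan.action === "update") {
//       await typesenseClient.collections("posts")
//         .documents(plan.document.id).update(plan.document);
//     }
//     // handle "create" and "delete" similarly; do nothing on "skip"
//   });

const voteOnly = planSync("p1", { title: "a", voteCount: 1 }, { title: "a", voteCount: 2 }, ["title"]);
const titleEdit = planSync("p1", { title: "a", voteCount: 1 }, { title: "b", voteCount: 1 }, ["title"]);
console.log(voteOnly.action, titleEdit.action);
```

The function invocation still happens on every write, but a vote-only change results in no API call to Typesense, and a real edit sends an update rather than an upsert, so fields written by the scheduled function are left alone.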
Description
I want to prevent Typesense extension from running indexOnWrite when a specific field inside the doc is updated. For example, I have a likeCount field in my post document which gets updated whenever a user likes the post. Right now the doc in Typesense is updating on each user like which is not efficient. Is there a way to tell Typesense not to update its doc for changes to certain fields?