algolia / firestore-algolia-search

Apache License 2.0

ReIndex Script Fails #117

Closed dutchkillscreative closed 11 months ago

dutchkillscreative commented 2 years ago

Hi, I've previously had success on prior versions of the extension by manually running the reindexing script demonstrated in the video below.

https://youtu.be/ZNVAPpTpKpk?t=1018

However, since the update it seems this reindexing operation is now run entirely from the CLI instead of requiring you to create and run a separate file. This is great news, but I'm getting a timeout error when indexing my users collection. There are roughly 4,200 documents in that collection and I'm only indexing 6 fields, mostly strings or arrays of strings (username, userID, photoURL, etc.), so I feel like that shouldn't be too much data to process. Most of my records average around 600 bytes.

I see a terminal full of records being processed, but eventually the process times out and the error message below is displayed at the end. Ignore the values with _ delimiters, as I changed them for privacy reasons.

\"lastmodified\":{\"_operation\":\"IncrementSet\",\"value\":[NOT_SURE_IF_PRIVATE]}}}]}","headers":{"x-algolia-api-key":"*****","x-algolia-application-id":"[MY_ALGOLIA_APP_ID]","content-type":"application/x-www-form-urlencoded"},"method":"POST","url":"https://[MY_ALGOLIA_APP_ID].algolia.net/1/indexes/users/batch?x-algolia-agent=Algolia%20for%20JavaScript%20(4.13.1)%3B%20Node.js%20(16.15.1)%3B%20firestore_integration%20(0.5.13)","connectTimeout":2,"responseTimeout":30},"response":{"status":429,"content":"{\"message\":\"Too many requests\",\"status\":429}","isTimedOut":false},"host":{"protocol":"https","url":"[MY_ALGOLIA_APP_ID].algolia.net","accept":2},"triesLeft":3}],"severity":"ERROR"}

I've tried a few times with similar results; it seems to get to roughly the same point and fail. I'm hesitant to keep running it because I assume it's firing a ton of reads on my Firestore db and/or a ton of indexing operations on my Algolia index.
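The log above ends with a 429 "Too many requests" response. As a minimal sketch of how a reindexing script could react to that status instead of failing outright, the helper below retries a batch with exponential backoff. The names `withBackoff` and `sendBatch` are hypothetical and not part of the extension, and the code assumes the thrown error exposes the HTTP status as `err.status`. If the limit is an hourly quota, short waits may not be enough on their own; this only avoids hammering the API while the quota is exhausted.

```javascript
// Hypothetical helper: retry an async operation when it fails with HTTP 429.
// `sendBatch` stands in for whatever call pushes one batch of records.
async function withBackoff(sendBatch, options = {}) {
  const {
    retries = 5,          // maximum number of retries after the first attempt
    baseDelayMs = 1000,   // first wait; doubles on every retry
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await sendBatch();
    } catch (err) {
      // Assumption: the error object carries the HTTP status as `status`.
      const rateLimited = Boolean(err) && err.status === 429;
      if (!rateLimited || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 1s, 2s, 4s, ...
    }
  }
}
```

Any other error is rethrown immediately, so genuine failures (bad credentials, malformed records) still surface on the first attempt.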

Like I said, I had success with the previous version of the extension, but this CLI tool seems to be failing. There's also a brief warning about environment variables not being set up properly just before the indexing operation fires. The terminal log does not go back far enough to copy it. I believe it is similar to the warning given at this point in the video, but I cannot confirm: https://youtu.be/ZNVAPpTpKpk?t=1351 Also, I assume the path name for my service account would not change with this CLI function versus the one in the video. I have my service account file at the top level of my app folder, so I enter the path as './sa.json' without the quotes.

dutchkillscreative commented 2 years ago

Update: Apparently my app is too successful, because it's hitting the rate limit quota for the Algolia index (Max API calls/IP/hour: 100).

Raising this value above my total document count temporarily seemed to resolve the issue. I'd suggest, since we're agreeing to this warning in advance of the operation:
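To make the numbers concrete, here is a rough sketch of the request budget, using the figures from this thread (about 4,200 documents, a limit of 100 calls per IP per hour). The thread does not say how many records the CLI sends per request, so `recordsPerRequest` is an assumption; the function names are illustrative only.

```javascript
// How many API calls does a reindex need, given how many records go in each call?
function requestsNeeded(documentCount, recordsPerRequest) {
  return Math.ceil(documentCount / recordsPerRequest);
}

// Does that number of calls fit within an hourly per-IP call limit?
function fitsHourlyLimit(documentCount, recordsPerRequest, maxCallsPerHour) {
  return requestsNeeded(documentCount, recordsPerRequest) <= maxCallsPerHour;
}

const documentCount = 4200; // approximate size of the users collection
const maxCallsPerHour = 100; // "Max API calls/IP/hour" from the update above

console.log(requestsNeeded(documentCount, 1));    // 4200 calls if each call carries one record
console.log(requestsNeeded(documentCount, 1000)); // 5 calls if each call carries 1000 records
console.log(fitsHourlyLimit(documentCount, 1, maxCallsPerHour));    // false
console.log(fitsHourlyLimit(documentCount, 1000, maxCallsPerHour)); // true
```

Needing to raise the limit above the total document count is consistent with roughly one call per document, though that is an inference from the behaviour described here, not something confirmed in the thread.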

WARNING: The back fill process will index your entire collection which will impact your Search Operation Quota. Please visit https://www.algolia.com/doc/faq/accounts-billing/how-algolia-count-records-and-operation/ for more details. Do you want to continue? (y/N): y

that the max quota be lifted for the IP initiating the indexing operation. Let me know how you want to handle this issue: keep it open so it's informative to other users, or close it.

Thanks.

smomin commented 2 years ago

@Haroenv what are your thoughts on setting the quota during the import process?

dutchkillscreative commented 2 years ago

> @Haroenv what are your thoughts on setting the quota during the import process?

Seems like a good, workable solution to me. It wouldn’t take much effort to get the total number of documents in a collection one time. My collection is getting large enough that I should be tracking it with a counter anyway, but I assume for most people it’s probably best to have a limit or upper range.
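A one-time document count could be sketched as follows. This assumes a Firestore server SDK recent enough to support `count()` aggregation queries on a collection reference. `countDocuments` and `temporaryCallLimit` are hypothetical helpers, and the 10% headroom is an arbitrary choice. Nothing here changes the limit itself; it only computes a number to use as the temporary limit.

```javascript
// Count the documents in a collection with a single aggregation query.
// Assumption: `collectionRef` supports count().get(), as Firestore
// collection references do in SDK versions with aggregation queries.
async function countDocuments(collectionRef) {
  const snapshot = await collectionRef.count().get();
  return snapshot.data().count;
}

// Suggest a temporary hourly call limit: the document count plus some headroom.
// Integer percentages avoid floating-point rounding surprises.
function temporaryCallLimit(documentCount, headroomPercent = 10) {
  return Math.ceil((documentCount * (100 + headroomPercent)) / 100);
}

// Example wiring (not executed here; requires firebase-admin and credentials):
// const admin = require('firebase-admin');
// admin.initializeApp({ credential: admin.credential.cert('./sa.json') });
// countDocuments(admin.firestore().collection('users'))
//   .then((n) => console.log(temporaryCallLimit(n)));
```

For a collection of 4,200 documents, `temporaryCallLimit(4200)` gives 4620.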

smomin commented 11 months ago

I am closing this issue since we are not using the CLI for reindexing.