typesense / firestore-typesense-search

Firebase Extension to automatically push Firestore documents to Typesense for full-text search with typo tolerance, faceting, and more
https://extensions.dev/extensions/typesense/firestore-typesense-search
Apache License 2.0

JavaScript heap out of memory #13

Open uafrontender opened 2 years ago

uafrontender commented 2 years ago

I have more than 40k records in Firestore, and I run into this issue when I try to index more than 3 fields. Please check the screenshots.

[Screenshots: Selection_009, Selection_010, Selection_011, Selection_012]

jasonbosco commented 2 years ago

@mkarkachov This could be related to #10. May I know what version of the extension you're running?

uafrontender commented 2 years ago

> @mkarkachov This could be related to #10. May I know what version of the extension you're running?

typesense/firestore-typesense-search@0.2.5

jasonbosco commented 2 years ago

Hmmm. Could you share a sample Firestore document? I'm particularly curious about the size of each document and the data types.

matslb commented 2 years ago

I get the same error

jasonbosco commented 2 years ago

@matslb Could you share a sample firestore document?

matslb commented 2 years ago

Here is a typical document in my Firestore:

{
    "Sugar": "<3",
    "SubType": "Hvitvin",
    "RatingFetchDate": 1639440356127,
    "Id": "7855101",
    "ProductStatusSaleName": "Utsolgt",
    "PriceIsLowered": true,
    "SortingDiscount": 87.87,
    "RatingComment": "En tørr og frisk litt bitter vin som tåler 3-4 år. Godt kjøp.",
    "Stock": {
        "Stores": [
            {
                "pointOfService": {
                    "displayName": "Oslo, Tveita",
                    "id": "145",
                    "name": "145"
                },
                "stockInfo": {
                    "stockLevel": 21
                }
            }
        ]
    },
    "LatestPrice": 99.9,
    "Literprice": 134,
    "PriceChange": 66.64,
    "PriceChanges": 5,
    "LastUpdated": 1638662400000,
    "Stores": [
        "145"
    ],
    "ComparingPrice": 149.9,
    "Volume": 0.75,
    "PriceHistory": {
        "1625090400000": 149.9,
        "1638662400000": 99.9,
        "1598911200000": 169.7,
        "1594072800000": 169.6,
        "1609452000000": 170.6
    },
    "Rating": 84,
    "Country": [
        "Portugal"
    ],
    "Alcohol": 12.5,
    "Types": [
        "Hvitvin"
    ],
    "Type": "Svakvin",
    "Acid": "6,3",
    "LiterPriceAlcohol": 1072,
    "Description": {
        "recommendedFood": [
            {
                "foodDesc": "Skalldyr",
                "foodId": "B"
            },
            {
                "foodDesc": "Fisk",
                "foodId": "C"
            },
            {
                "foodId": "D",
                "foodDesc": "Lyst kjøtt"
            }
        ],
        "sweetness": "01",
        "fullness": "05",
        "bitterness": "",
        "characteristics": {
            "odour": "Lukter av friske sitronaromaer og noe grapefrukt.",
            "colour": "Lys gyllengul.",
            "taste": "Frisk syre med god frukt. God konsentrasjon."
        },
        "freshness": "07",
        "tannins": ""
    },
    "Name": "Quinta de Chocapalha Arinto 2017",
    "ManufacturerName": "Quinta de Chocapalha",
    "Discount": 66.6444296197465
}

These are the fields being indexed by Typesense, set in the extension configuration in Firebase:

The fields match the collection schema.
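To illustrate what restricting the indexed fields means for a document like the one above, here is a minimal sketch that copies only a configured list of top-level fields before anything is sent on. The helper `pickFields` and the field list `["Name", "Rating", "Country"]` are hypothetical; the actual configured fields are not shown in this thread, and this is not necessarily how the extension implements it.

```typescript
// Hypothetical shape for a Firestore document's data.
type FirestoreData = { [field: string]: unknown };

// Copies only the listed top-level fields. Fields that are not listed
// (for example large nested maps) are left out of the result, and listed
// fields that the document does not have are skipped.
function pickFields(data: FirestoreData, fields: string[]): FirestoreData {
  const out: FirestoreData = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(data, field)) {
      out[field] = data[field];
    }
  }
  return out;
}

// A trimmed-down version of the sample document above.
const sample: FirestoreData = {
  Name: "Quinta de Chocapalha Arinto 2017",
  Rating: 84,
  Country: ["Portugal"],
  PriceHistory: { "1625090400000": 149.9, "1638662400000": 99.9 },
};

// Assumed field list, for illustration only.
const projected = pickFields(sample, ["Name", "Rating", "Country"]);
console.log(Object.keys(projected));
```

Each additional configured field adds to the payload that is built per document, which is one plausible reason memory use grows with the number of indexed fields.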

0x80 commented 8 months ago

Reading the extension code, I think heap issues are also to be expected in the backfill function, since it first loads the complete set of documents from the Firestore collection before it starts batching them and sending them to the Typesense API.

I work on a project with a users collection that is nearing 1M records, containing user profile info and application settings. You do not want to load all of that onto the heap up front before you start processing chunks.

I wrote utility functions to deal with these kinds of scenarios, but I haven't gotten around to packaging and releasing them to the public.
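
The alternative described above, reading one page at a time instead of the whole collection, can be sketched roughly as follows. This is not the extension's code and not the utility functions mentioned in the comment: `backfillInPages`, `FetchPage`, and `ImportBatch` are hypothetical names, and the data source here is an in-memory array standing in for Firestore. In a real backfill, `fetchPage` would presumably wrap a cursor-based Firestore query (ordered, with a start-after cursor and a limit) and `importBatch` would wrap a Typesense bulk import call.

```typescript
// Hypothetical shapes; real code would wrap Firestore and Typesense calls.
type Doc = { id: string; [field: string]: unknown };
type FetchPage = (afterId: string | null, limit: number) => Promise<Doc[]>;
type ImportBatch = (docs: Doc[]) => Promise<void>;

// Walks the collection page by page, so at most `pageSize` documents
// are held in memory at any one time. Returns the number processed.
async function backfillInPages(
  fetchPage: FetchPage,
  importBatch: ImportBatch,
  pageSize: number
): Promise<number> {
  let cursor: string | null = null;
  let total = 0;
  while (true) {
    const page: Doc[] = await fetchPage(cursor, pageSize);
    if (page.length === 0) break;
    await importBatch(page);
    total += page.length;
    cursor = page[page.length - 1].id; // resume after the last document seen
    if (page.length < pageSize) break; // a short page means the end was reached
  }
  return total;
}

// In-memory stand-in for a collection of 25 documents, sorted by id.
const fakeCollection: Doc[] = Array.from({ length: 25 }, (_, i) => ({
  id: "doc" + (1000 + i),
}));

// Stand-in for a cursor query: returns up to `limit` docs after `afterId`.
const fetchFromMemory: FetchPage = async (afterId, limit) => {
  const start =
    afterId === null
      ? 0
      : fakeCollection.findIndex((d) => d.id === afterId) + 1;
  return fakeCollection.slice(start, start + limit);
};

// Stand-in for a bulk import: just logs the batch size.
const logBatch: ImportBatch = async (docs) => {
  console.log("importing batch of " + docs.length);
};

backfillInPages(fetchFromMemory, logBatch, 10).then((n) =>
  console.log("processed " + n + " documents")
);
```

The point of the design is that memory use is bounded by the page size rather than by the collection size, at the cost of one query per page.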