sanity-io / sanity-algolia

Utilities for indexing Sanity documents in Algolia
MIT License

initial content indexing? #12

Closed ovsw closed 3 years ago

ovsw commented 3 years ago

Since the function only runs on a page publish hook, how should I go about the initial indexing of all existing content in Sanity? Thanks!

podlebar commented 3 years ago

Hey, I did it with the export function from Sanity: https://www.sanity.io/docs/export

Another way is to do it via the CLI: add an additional field to your documents, call it "lastIndexed" for example, and put the current date in it. When you update all documents via the CLI, they will be reindexed to Algolia. Once they have landed in the Algolia index, you can remove the field again. Sadly, adding and removing the field will cost you some units, as it is one operation per document when you add the field and another one when you remove it.
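The field-toggling approach above could be sketched as two small helpers that build Sanity mutation objects. This is only a sketch: the function names are hypothetical, and it assumes the mutations are sent by a script using a configured @sanity/client instance (for example with client.mutate(...)), which is not shown in this thread.

```javascript
// Hypothetical helpers, not part of sanity-algolia.
// They only build plain mutation objects; sending them is left to the caller.

// Build one patch per document that sets a throwaway marker field.
function buildTouchMutations(ids, field = 'lastIndexed', now = new Date()) {
  const stamp = now.toISOString()
  return ids.map(id => ({ patch: { id, set: { [field]: stamp } } }))
}

// Build one patch per document that removes the marker field again.
function buildCleanupMutations(ids, field = 'lastIndexed') {
  return ids.map(id => ({ patch: { id, unset: [field] } }))
}
```

Assuming a client is available, the first list would be sent to trigger the publish hook for every document, and the second list only after the records have shown up in the Algolia index.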

runeb commented 3 years ago

One simple way would be to fetch a list of the relevant existing document _ids and then invoke this module manually:

const types = ["all", "the", "relevant", "types"]

client.fetch(`* [_type in $types && !(_id in path("drafts.**"))][]._id`, { types }).then(created => 
  sanityAlgolia.webhookSync(client, { ids: { created, updated: [], deleted: [] }})
)

Ojay commented 3 years ago

Hey @runeb - thanks for this.

I've seen that an example of initial indexing of existing content is on the to-do list for this plugin, and the above is definitely a step toward that.

Unfortunately, I'm getting the following console error with my stripped-down attempt...

details: {
    description: 'param $updated referenced, but not provided',
    end: 37,
    query: '* [(_id in $created || _id in $updated) && _type in $types] {\n' +
    '  _id,\n' +
    '  _type,\n' +
    '  _rev,\n' +
    '  _type == "resources" => {\n' +
    '                        res_name,\n' +
    '                        "path": slug.current\n' +
    '                    }\n' +
    '}',
    start: 30,
    type: 'queryParseError'
}

Stripped-down code example...

const sanityAlgolia = indexer(
            {
                resources: {
                    index: algoliaIndex,
                    projection: `{
                        res_name,
                        "path": slug.current
                    }`,
                }
            },

            document => {
                console.log(document._type)
            }

        );

        const types = ["resources"]

        return sanity.fetch(`* [_type in $types && !(_id in path("drafts.**"))][]._id`, { types }).then(created => 
            sanityAlgolia.webhookSync(sanity, { ids: { created }})
            // console.log(created)
        ).then(() => res.status(200).send('ok'));

Any pointers on that, along with an example of a robust solution, would be really helpful! Thanks for your hard work on this.

Mark

runeb commented 3 years ago

@Ojay Ah, that was just because the code expects created, updated and deleted properties under ids, and my example only had created. Pass empty arrays for the ones you don't need, e.g. updated: [] and deleted: [].
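One way to guard against the "param $updated referenced, but not provided" error is to build the payload through a small helper that always fills in all three keys. This is a minimal sketch; toWebhookBody is a hypothetical name and not part of sanity-algolia.

```javascript
// Hypothetical helper: returns a webhook-style body in which created,
// updated and deleted are always present, defaulting to empty arrays.
function toWebhookBody({ created = [], updated = [], deleted = [] } = {}) {
  return { ids: { created, updated, deleted } }
}
```

With this, the call from the stripped-down example would become sanityAlgolia.webhookSync(sanity, toWebhookBody({ created })).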

runeb commented 3 years ago

Adding an example to the README.

runeb commented 3 years ago

https://github.com/sanity-io/sanity-algolia/blob/main/README.md#first-time-indexing

Ojay commented 3 years ago

Thank you so much! You guys are doing such good work.