sanity-io / sanity-algolia

Utilities for indexing Sanity documents in Algolia
MIT License
67 stars, 16 forks

Is it possible to split a sanity document into multiple algolia records? #16

Open ant1m4tt3r opened 3 years ago

ant1m4tt3r commented 3 years ago

For index performance reasons, I would like to split a Sanity document into multiple Algolia records, mainly because a blog post has one very large field.

Reading the API code, this does not seem to be possible right now, since the transform/serialize function (the second argument of the function exported by sanity-algolia) only allows returning a single Algolia record, not multiple.

Maybe this could be addressed with a small amount of code on the library side. I would gladly try to implement it if the community thinks it is OK.

ant1m4tt3r commented 3 years ago

More thoughts about it: to me, the difficult part is assigning a new objectID to each record produced by the split while making sure it can still be traced back to the Sanity document. I'm not sure how to do that yet. I thought about storing the IDs in a field in Sanity, but I can foresee two problems with that:

1) It adds a field to Sanity that does not belong to the business domain of the document.
2) Since this implementation is designed to run in a webhook, updating a Sanity document from it would trigger a cascade of recursive webhook calls.
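One way around both problems might be to derive each objectID from the Sanity _id plus a chunk index, so nothing has to be written back to Sanity. The sketch below is only an illustration of that idea: PostDocument, AlgoliaChunk, splitIntoRecords, parentId, and maxChars are hypothetical names, not part of sanity-algolia, and the body is assumed to already be plain text.

```typescript
// Hypothetical shapes for illustration; not part of the sanity-algolia API.
interface PostDocument {
  _id: string
  title: string
  body: string // assumed to be plain text already
}

interface AlgoliaChunk {
  objectID: string
  parentId: string
  title: string
  body: string
}

// Group paragraphs into chunks of roughly maxChars characters.
// A single paragraph longer than maxChars is kept whole in its own chunk.
function splitIntoRecords(doc: PostDocument, maxChars = 1000): AlgoliaChunk[] {
  const paragraphs = doc.body.split(/\n{2,}/).filter((p) => p.trim() !== '')
  const chunks: string[] = []
  let current = ''

  for (const paragraph of paragraphs) {
    // Start a new chunk when adding this paragraph would exceed the limit.
    if (current !== '' && current.length + paragraph.length + 2 > maxChars) {
      chunks.push(current)
      current = ''
    }
    current = current === '' ? paragraph : `${current}\n\n${paragraph}`
  }
  if (current !== '') chunks.push(current)

  // The objectID is deterministic: the same document always yields the same IDs,
  // and parentId points back to the originating Sanity document.
  return chunks.map((body, index) => ({
    objectID: `${doc._id}_${index}`,
    parentId: doc._id,
    title: doc.title,
    body,
  }))
}
```

One caveat with this approach: if a document shrinks, records with higher chunk indexes from an earlier version would remain in the index, so they would need to be cleaned up separately, for instance by looking records up via parentId.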

judofyr commented 3 years ago

One possible way to model this in the existing API would be to support "virtual" types:

  const sanityAlgolia = indexer(
    {
      post: {
        index: algoliaIndex,
        projection: `{
          title,
          "path": slug.current,
          "body": pt::text(body)
        }`,
      },
      minimalPost: {
        index: algoliaIndex,
        type: "post",
        projection: `{
          title,
          "path": slug.current
        }`,
      },
      article: {
        index: algoliaIndex,
        projection: `{
          heading,
          "body": pt::text(body),
          "authorNames": authors[]->name
        }`,
      },
    },

    (document: SanityDocumentStub) => {
      switch (document._type) {
        case 'post':
          return Object.assign({}, document, {
            custom: 'An additional custom field for posts, perhaps?',
          })
        case 'article':
          return {
            title: document.heading,
            body: document.body,
            authorNames: document.authorNames,
          }
        default:
          return document
      }
    }
  )

Here we define a virtual type minimalPost which maps to the "real" Sanity type post. In the serializer (the second argument) we'll set it to _type: "minimalPost" even though it's stored as _type: "post" in Sanity. For these virtual objects we'll also have to use a specialized object ID (e.g. ${doc._id}_${doc._type}).
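To make the ID scheme concrete, here is a rough sketch of how one Sanity document could fan out into one record per matching config entry. This is not how the library works today: TypeConfig, SanityDoc, VirtualRecord, and expandVirtualTypes are hypothetical names, and the optional type key is the proposed addition from the config above.

```typescript
// Hypothetical shapes for illustration; not part of the sanity-algolia API.
interface TypeConfig {
  type?: string // real Sanity type, set only on virtual entries
}

interface SanityDoc {
  _id: string
  _type: string
}

interface VirtualRecord {
  objectID: string
  _type: string
}

// Return one record descriptor per config entry that applies to the document.
function expandVirtualTypes(
  doc: SanityDoc,
  config: Record<string, TypeConfig>
): VirtualRecord[] {
  return Object.keys(config)
    .filter((key) => {
      const realType = config[key].type === undefined ? key : config[key].type
      return realType === doc._type
    })
    .map((key) => {
      const entryType = config[key].type
      const isVirtual = entryType !== undefined && entryType !== key
      return {
        // The record carries the config key as its _type ("minimalPost", not "post").
        _type: key,
        // Real types keep the plain _id; virtual types get a suffixed ID.
        objectID: isVirtual ? `${doc._id}_${key}` : doc._id,
      }
    })
}
```

With the config above, a post with _id "p1" would map to two records, one with objectID "p1" and one with objectID "p1_minimalPost", so both can live in the same index without colliding.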

solace commented 9 months ago

Has this been officially addressed yet? I see #13 and #14, and it looks like #14 made it into -alpha and then was removed.

Thanks!