vendure-ecommerce / vendure

The commerce platform with customization in its DNA.
https://www.vendure.io

Make the CollectionService event subscribers configurable to prevent `apply-collection-filters` job being added for all collections #2733

Open seminarian opened 7 months ago

seminarian commented 7 months ago

Is your feature request related to a problem? Please describe.

Some facts about our system:

At the moment, we have a delay of more than 10 minutes plus the execution time of the apply-collection-filters task; only after that do the indexes for the products that were in the buffer begin to update.

In Vendure core, CollectionService sets up the following event subscribers:

        const productEvents$ = this.eventBus.ofType(ProductEvent);
        const variantEvents$ = this.eventBus.ofType(ProductVariantEvent);

        merge(productEvents$, variantEvents$)
            .pipe(debounceTime(50))
            // eslint-disable-next-line @typescript-eslint/no-misused-promises
            .subscribe(async event => {
                const collections = await this.connection.rawConnection
                    .getRepository(Collection)
                    .createQueryBuilder('collection')
                    .select('collection.id', 'id')
                    .getRawMany();
                await this.applyFiltersQueue.add({
                    ctx: event.ctx.serialize(),
                    collectionIds: collections.map(c => c.id),
                });
            });

Every update to a Product or ProductVariant triggers a full recomputation of the collection_product_variants_product_variant join table. Essentially, 16 million rows are recalculated and inserted again.

Since we only use facetValueCollectionFilters, we would like to implement the following proposal in our Vendure codebase:

I believe that whenever a Product or ProductVariant is updated, it should be computed which entries for those specific variant(s) belong in the collection_product_variants_product_variant join table. Let's call this calculateCollectionsForVariant.

Proposal for our codebase (because we only use FacetValuesFilter, though I think this could probably be expanded):

calculateCollectionsForVariant should happen when creating or updating a variant. (We might even check whether only facetValues were updated and calculate the difference for the changed facetValues.)

Since a lot of the information needed for this is stored in the filters, as JSON in a text field of the collection table, we will probably need a new table with the following columns:

- collectionId
- facetValueId

This new table should be populated (sync) upon:

- Collection create
- Collection update
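As a rough illustration of the idea above, the lookup could be sketched like this. It is a minimal sketch, not Vendure code: the `CollectionFacetValueRow` shape stands in for the proposed (collectionId, facetValueId) table, the `containsAny` parameter is an assumption about how "match any" versus "match all" would be chosen, and it ignores collections with several filters as well as parent/child collection inheritance.

```typescript
// Hypothetical row shape of the proposed lookup table (collectionId, facetValueId).
interface CollectionFacetValueRow {
  collectionId: string;
  facetValueId: string;
}

// Group rows so that each collection maps to the set of facet values its filter requires.
function groupByCollection(rows: CollectionFacetValueRow[]): Map<string, Set<string>> {
  const grouped = new Map<string, Set<string>>();
  for (const row of rows) {
    const required = grouped.get(row.collectionId) ?? new Set<string>();
    required.add(row.facetValueId);
    grouped.set(row.collectionId, required);
  }
  return grouped;
}

// Return the IDs of the collections a single variant belongs to.
// containsAny = false: the variant must carry every required facet value.
// containsAny = true: one matching facet value is enough.
function calculateCollectionsForVariant(
  variantFacetValueIds: string[],
  rows: CollectionFacetValueRow[],
  containsAny = false,
): string[] {
  const owned = new Set(variantFacetValueIds);
  const result: string[] = [];
  groupByCollection(rows).forEach((required, collectionId) => {
    const ids = Array.from(required);
    const matches = containsAny
      ? ids.some(id => owned.has(id))
      : ids.every(id => owned.has(id));
    if (matches) {
      result.push(collectionId);
    }
  });
  return result;
}
```

With such a function, an update to one variant would only touch the join-table rows for that variant instead of recomputing every collection.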

This would all be possible by overriding some resolvers (ProductVariant and Collection); the only thing stopping us is the set of event listeners that are set up in CollectionService.

Describe the solution you'd like

An option to disable these EventSubscribers from within vendure-config.ts
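To make the request concrete, here is a self-contained sketch of what such a switch could look like. Everything in it is hypothetical: `disableDefaultSubscribers`, `MiniEventBus` and `SketchCollectionService` are illustrative names and do not exist in Vendure; the sketch only shows the subscriber being skipped when the flag is set.

```typescript
type Listener<T> = (event: T) => void;

// Tiny stand-in for an event bus, used only for this sketch.
class MiniEventBus<T> {
  private listeners: Listener<T>[] = [];

  subscribe(listener: Listener<T>): void {
    this.listeners.push(listener);
  }

  publish(event: T): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

interface ProductChangeEvent {
  productId: string;
}

interface SketchCollectionOptions {
  // Hypothetical flag: when true, the service does not react to product events.
  disableDefaultSubscribers?: boolean;
}

class SketchCollectionService {
  // Each entry represents one queued job and lists the collection IDs it covers.
  readonly queuedJobs: string[][] = [];
  private readonly eventBus: MiniEventBus<ProductChangeEvent>;
  private readonly allCollectionIds: string[];
  private readonly options: SketchCollectionOptions;

  constructor(
    eventBus: MiniEventBus<ProductChangeEvent>,
    allCollectionIds: string[],
    options: SketchCollectionOptions = {},
  ) {
    this.eventBus = eventBus;
    this.allCollectionIds = allCollectionIds;
    this.options = options;
  }

  init(): void {
    if (this.options.disableDefaultSubscribers) {
      // The application takes over collection membership updates itself.
      return;
    }
    this.eventBus.subscribe(() => {
      // Mirrors the behaviour described above: one job covering every collection.
      this.queuedJobs.push([...this.allCollectionIds]);
    });
  }
}

// Publish one product event and report how many jobs were queued.
function runSketch(disable: boolean): number {
  const bus = new MiniEventBus<ProductChangeEvent>();
  const service = new SketchCollectionService(bus, ['1', '2', '3'], {
    disableDefaultSubscribers: disable,
  });
  service.init();
  bus.publish({ productId: '42' });
  return service.queuedJobs.length;
}
```

With the flag off, a single product event queues one job that spans all collections; with it on, nothing is queued and custom logic can decide what to recompute.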

Describe alternatives you've considered

Patching our deployment of Vendure to disable those.

floze commented 5 months ago

I would like to second this proposal. We have an order of magnitude more collections than @seminarian, and it's a huge pain, to be frank.

monrostar commented 2 months ago

@michaelbromley I would like to continue working on this issue. I made a small performance update, but I would also like to change and optimize the filters. At the moment this solution is not suitable for a large volume of products, for example when there are millions of variants in the database.

Here is my PR https://github.com/vendure-ecommerce/vendure/pull/2978

Most importantly, I would prefer to remove applyToChangedVariantsOnly, because it updates all variants, whereas we only ever need to update the variants that have been removed or added, even if the collection itself was updated.
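The "only added or removed variants" idea can be sketched as a plain set difference. This is a minimal illustration under the assumption that the previous and the newly computed variant IDs of a collection are both available as arrays; `diffCollectionVariants` and `MembershipDelta` are hypothetical names, not part of Vendure or of the PR.

```typescript
interface MembershipDelta {
  toAdd: string[];    // variants that newly match the collection
  toRemove: string[]; // variants that no longer match the collection
}

// Compare the old and new membership of one collection and return only the changes.
function diffCollectionVariants(previousIds: string[], nextIds: string[]): MembershipDelta {
  const previous = new Set(previousIds);
  const next = new Set(nextIds);
  return {
    toAdd: nextIds.filter(id => !previous.has(id)),
    toRemove: previousIds.filter(id => !next.has(id)),
  };
}
```

Only the rows in `toAdd` and `toRemove` would then need to be written to the join table; variants present in both lists are left untouched.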

monrostar commented 2 months ago

We could search for collection slugs or facets separately and create a separate index for them instead of combining this data. With a large number of collection updates, collisions may occur between updates or between job lists that conflict with each other.

I had to remove collectionSlug from our index to avoid such problems, and also to avoid storing any data about other entities. I resolve slugs through a separate query and then pass only collectionIds to the index query.

The index should contain only data and relations for product and productVariant.
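The two-step lookup described above might look roughly like this. It is a sketch under assumptions: `buildCollectionFilter` and `buildFilterForSlugs` are hypothetical helpers, the slug lookup is passed in as a function because the actual query is not shown in this thread, and `productCollectionIds` is the keyword field from the mapping posted in this thread. The `bool`/`filter`/`terms` structure is standard Elasticsearch query DSL.

```typescript
interface CollectionFilter {
  bool: {
    filter: Array<{ terms: { productCollectionIds: string[] } }>;
  };
}

// Build the index query from already-resolved collection IDs; no slugs reach the index.
function buildCollectionFilter(collectionIds: string[]): CollectionFilter {
  return {
    bool: {
      filter: [{ terms: { productCollectionIds: collectionIds } }],
    },
  };
}

// Resolve slugs with a separate query (for example against the database), then build the filter.
// lookupIds is a caller-supplied function; its implementation is outside this sketch.
async function buildFilterForSlugs(
  slugs: string[],
  lookupIds: (slugs: string[]) => Promise<string[]>,
): Promise<CollectionFilter> {
  const collectionIds = await lookupIds(slugs);
  return buildCollectionFilter(collectionIds);
}
```

Because the index only ever sees IDs, renaming a collection slug would not require touching any indexed product document.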

monrostar commented 2 months ago

I also want to show the mapping we use in our Elasticsearch. It includes a multi-currency and multi-language setup.

We convert currencies on the server side and store only the original data from the DB inside the index. This allows us to customize any currency conversion, and we don't need to update the product every time we update our exchange rates. If someone uses a custom strategy for PriceApplicator, we do not save the price after calculation; instead we store it in the original format and convert it on the server side only after the index returns the data.
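A minimal sketch of converting at read time, after the index has returned its data, could look like the following. The `IndexedPrice` fields follow the price mapping shown in this comment (id, channelId, currencyCode, price); `convertPrice` and the `"FROM:TO"` rate-table format are assumptions made for illustration only.

```typescript
// Shape of one price entry as stored in the index (unconverted, integer minor units).
interface IndexedPrice {
  id: string;
  channelId: string;
  currencyCode: string;
  price: number;
}

// Hypothetical rate table keyed as "FROM:TO", e.g. { "USD:EUR": 0.5 }.
type ExchangeRates = Record<string, number>;

// Convert a stored price into the requested currency without modifying the indexed entry.
function convertPrice(
  indexed: IndexedPrice,
  targetCurrency: string,
  rates: ExchangeRates,
): IndexedPrice {
  if (indexed.currencyCode === targetCurrency) {
    return indexed;
  }
  const rate = rates[`${indexed.currencyCode}:${targetCurrency}`];
  if (rate === undefined) {
    throw new Error(`No exchange rate for ${indexed.currencyCode} -> ${targetCurrency}`);
  }
  // Round because prices are kept as integers in minor units.
  return { ...indexed, currencyCode: targetCurrency, price: Math.round(indexed.price * rate) };
}
```

Since rates are applied only when results are read, changing an exchange rate never forces a reindex.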

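import type { estypes } from '@elastic/elasticsearch'
import { LanguageCode } from '@vendure/core'

// availableLanguages and VariantIndexItem are defined elsewhere and are not shown here.
export function TranslatedTextKeywordMappingField(): estypes.MappingObjectProperty {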
  return {
    type: 'object',
    properties: availableLanguages.reduce((acc, lang) => {
      acc[lang] = {
        type: 'text',
        fields: {
          keyword: {
            type: 'keyword',
          },
        },
      }
      return acc
    }, {} as Record<LanguageCode, estypes.MappingProperty>),
  }
}

const priceMappingField: estypes.MappingProperty = {
  type: 'nested',
  properties: {
    id: { type: 'keyword' },
    channelId: { type: 'keyword' },
    currencyCode: { type: 'keyword' },
    price: { type: 'integer' },
  },
}

const ProductVariantIndexDynamicTemplates: estypes.MappingTypeMapping['dynamic_templates'] = []

const ProductVariantIndexMappingProperties: { [key in keyof VariantIndexItem]: estypes.MappingProperty } = {
  // index date
  lastSyncedAt: { type: 'date' },
  productUpdatedAt: { type: 'date' },
  // product fields
  productId: { type: 'keyword' },

  productChannelIds: { type: 'keyword' },
  productCollectionIds: { type: 'keyword' },
  productFacetValueIds: { type: 'keyword' },
  productFacetIds: { type: 'keyword' },

  // This is used for full text search on the options via search component on storefront.
  productOptions: { type: 'flattened' },
  productOptionsGroups: {
    type: 'nested',
    properties: {
      code: { type: 'keyword' },
      id: { type: 'keyword' },
      name: TranslatedTextKeywordMappingField(),
      options: {
        type: 'nested',
        properties: {
          id: { type: 'keyword' },
          name: TranslatedTextKeywordMappingField(),
          code: { type: 'keyword' },
        },
      },
    },
  },
  productEnabled: { type: 'boolean' },
  productInStock: { type: 'boolean' },

  productName: TranslatedTextKeywordMappingField(),
  productSlug: TranslatedTextKeywordMappingField(),
  productDescription: {
    type: 'object',
    properties: availableLanguages.reduce((acc, lang) => {
      acc[lang] = { type: 'text' }
      return acc
    }, {} as Record<LanguageCode, estypes.MappingProperty>),
  },

  productPriceMax: priceMappingField,
  productPriceMin: priceMappingField,

  productAssetId: { type: 'keyword' },
  productPreview: { type: 'keyword' },
  productPreviewFocalPoint: { type: 'flattened' },
  productAssets: { type: 'flattened' },

  // variant fields
  variantUpdatedAt: { type: 'date' },
  variantId: { type: 'keyword' },

  variantChannelIds: { type: 'keyword' },
  variantCollectionIds: { type: 'keyword' },
  variantFacetIds: { type: 'keyword' },
  variantFacetValueIds: { type: 'keyword' },

  variantEnabled: { type: 'boolean' },
  variantInStock: { type: 'boolean' },
  variantDisplayStockLevel: { type: 'keyword' },

  variantName: TranslatedTextKeywordMappingField(),
  variantSku: { type: 'keyword' },

  variantOptions: {
    type: 'nested',
    properties: {
      code: { type: 'keyword' },
      id: { type: 'keyword' },
      name: TranslatedTextKeywordMappingField(),
      group: {
        type: 'object',
        properties: {
          id: { type: 'keyword' },
          name: TranslatedTextKeywordMappingField(),
          code: { type: 'keyword' },
        },
      },
    },
  },

  variantPrice: priceMappingField,

  variantAssetId: { type: 'keyword' },
  variantPreview: { type: 'keyword' },
  variantPreviewFocalPoint: { type: 'flattened' },
  variantAssets: { type: 'flattened' },
}
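For illustration, a filter against the nested variantPrice field of this mapping could be built as follows. This is only a sketch: `buildVariantPriceFilter` is a hypothetical helper that is not part of Vendure or of the plugin discussed here, while the `nested`, `term` and `range` clauses are standard Elasticsearch query DSL. Prices are assumed to be integers in minor units, matching the `integer` type in the mapping.

```typescript
// Build a nested query that matches variants with a price entry for the given
// channel and currency whose stored (unconverted) price lies within [minPrice, maxPrice].
function buildVariantPriceFilter(
  channelId: string,
  currencyCode: string,
  minPrice: number,
  maxPrice: number,
) {
  return {
    nested: {
      path: 'variantPrice',
      query: {
        bool: {
          filter: [
            { term: { 'variantPrice.channelId': channelId } },
            { term: { 'variantPrice.currencyCode': currencyCode } },
            { range: { 'variantPrice.price': { gte: minPrice, lte: maxPrice } } },
          ],
        },
      },
    },
  };
}
```

A nested query is used because all three conditions have to hold for the same price entry, which is what mapping the field as `nested` makes possible.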