magento / architecture

A place where Magento architectural discussions happen
275 stars 153 forks source link

ECP-469: Finalize new search schema #368

Closed kokoc closed 4 years ago

kokoc commented 4 years ago

Overview

This PR contains graphql schema changes required to introduce new search functionality such as

The main challenge is to find a proper place in existing/new queries for new fields.

Inject into one of the existing types/queries

products query

type Query {
    products (
        search: String,
        filter: ProductAttributeFilterInput,
        pageSize: Int = 20,
        currentPage: Int = 1,
        sort: ProductAttributeSortInput
    ): Products
}
type Products {
    items: [ProductInterface]
    page_info: SearchResultPageInfo
    total_count: Int
    filters: [LayerFilter]
    aggregations: [Aggregation]
    sort_fields: SortFields
}

products query covers two main use cases: search by search phrase and catalog browsing with filtering by category. This query is the most obvious choice for search-related customizations. However, this query returns a Products type with ProductInterface inside. The ProductInterface is designed to represent a product entity with complex product types support.

Adding fields to ProductInterface or Products will make those fields optional and useless in "catalog browsing" scenario. Other words, our query is handling two different use cases with two different results. Mixing those results in very common ProductInterface is messy, inconvenient and just strange.

Products type is a little bit better place for new fields. We can put request-specific fields here, but we also need to introduce one more items conainer for search results(will contain highlights + products). This approach is still messy, but at least it doesn't affect other product usecases (shopping cart item, wishlist item, pdp, etc). Example for this option:

type Query {
    products (
        search: String,
        filter: ProductAttributeFilterInput,
        pageSize: Int = 20,
        currentPage: Int = 1,
        sort: ProductAttributeSortInput
    ): Products
}
type Products {
    items: [ProductInterface] @deprecated(???????)
    search_items: [ProductSearchItem]
    suggestions: [String]
    page_info: SearchResultPageInfo
    total_count: Int
    filters: [LayerFilter]
    aggregations: [Aggregation]
    sort_fields: SortFields
}
type ProductSearchItem {
    product: ProductInterface!
    highlights: [HighlightType]
}

other queries with product results

There are two more queries which return products and could be considered as a potential candidate for new fields: category and categoryList. Both return CategoryInterface interface with products field inside:

interface CategoryInterface
    id: Int
    ...
    default_sort_by: String 
    products(
        pageSize: Int = 20 
        currentPage: Int = 1 
        sort: ProductAttributeSortInput 
    ): CategoryProducts 
    breadcrumbs: [Breadcrumb] 
}

type CategoryProducts 
    items: [ProductInterface] 
    page_info: SearchResultPageInfo 
    total_count: Int 
}

This interface also returns ProductInterface and also those queries do not receive a search phrase. Thus it's they are not a good candidates for new search functionality.

Proper place

In order to find the proper place we may take a look on the use cases we have and want to support:

I think the following table is the desired assigment of queries per use cases:

Scenario Current Query Suggested Query Required changes
Browsing by category products
category
categoryList
category Add layered navigation results
Add filters to product sub-request
Searching by phrase products new query deprecate search related fields in products query
Advanced search n/a TBD TBD
Probably can be omitted
PDP products products deprecate filtering except product id
Category Navigation categoryList categoryList deprecate all non-category fields

You can find a possible shape of new query in PR content.

From the table above we can see that products query will be used mostly for retrieving the product by their ids, other functionality will be transferred to productSearch query. Thus we may also deprecate entire products query entirely and create a new query designed to get products by ids.

Here is a possible shape of the api if we decide to deprecated old query:

type Query {
    "This query is deprecated by still works for backwards-compatibilty purposes"
    products (
            search: String,
            filter: ProductAttributeFilterInput,
            pageSize: Int = 20,
            currentPage: Int = 1,
            sort: ProductAttributeSortInput
        ): Products @deprecated("See product or productSearch queries")

     productSearch(
         phrase: String!,
         "Desired size of the search result page"
         pageSize: Int = 20,
         currentPage: Int = 1,
         filter: [SearchClauseInput],
         sort: [ProductSearchSortInput]
     ): ProductSearchResponse!

     product(id: ID!): ProductInterface
}

Here are few possible examples per use case:

Category Listing

category(id: 17) {
    products {
        items {
            id
            name
            sku
            thumbnail {
                url
            }
        }
    }
}

Search by phrase

productSearch(phrase: "Bags") {
    items {
       product {
            id
            name
            sku
            thumbnail {
                url
            }
       }
       hightlights {
           attribute_code
           value
       }
    }
}

Product details page

product(id: 18) {
    id
    name
    sku
    media_gallery_entries {
        url
        media_type
        content
    }
}

Category navigation

categoryList {
    id
    name
    url_path
    children {
        id
        name
        url_path
        children {
            id
            name
            url_path
        }
    }
}
kandy commented 4 years ago

Can I ask you to describe scenario that will be covered by this API?

DrewML commented 4 years ago

@kandy: Can I ask you to describe scenario that will be covered by this API?

Would also like to see this, with a few example queries showing how the front-end would cover the scenarios

paales commented 4 years ago

Posted the following question earlier in Slack:

Question/feature request about product aggregations:

To get the layered navigation working properly pwa-studio is/we are relying on introspection queries because the aggregations don’t offer all the information required to build a proper layered navigation. This makes building frontends rather complex.

Wouldn’t it be better to actually have an AggregationInterface that will be implemented with:

  • AggregationEquals must implement the options { label, value }. Each option must also implement swatch information?
  • AggregationMatch must not implement options
  • AggregationNumberRange must imlement options: { label, from, to } instead of the concatenated string 100_* which requires custom handling, Can I just pass that value back to ProductAttributeFilterInput? The values there should be Int or Float, not String?
  • AggregationPriceRange can be implement to indicate a field should be rendered with a price in mind, else its the same as AggregationNumberRange
  • AggregationBoolean must not implement options

And @kokoc responded to me that this should be covered by this new search api and gave me the possibility to actually check it out! 🎉 So I've been trying out the proof-of-concept graphql endpoint (https://search-api-dev.magento-ds.com/graphql) and have some feedback / questions:

SearchClauseInput

Compared to the older products query we're loosing some functionality here: We are now unable to discover the attributes and possible input types (equals, range, match(?)).

As a developer I'd like to be able to discover which attributes are available for filtering like it was possible with ProductAttributeFilterInput so that I don't need to 'guess' which ones are available.

A great alternative might be something we see in our GraphQL CMS (GraphCMS):

input ProductSearchFilterInput {
  category: ID
  category_not: ID
  category_in: [ID!]
  category_not_in: [ID!]
  createdAt: DateTime
  createdAt_not: DateTime
  createdAt_in: [DateTime!]
  createdAt_not_in: [DateTime!]
  createdAt_lt: DateTime
  createdAt_lte: DateTime
  createdAt_gt: DateTime
  createdAt_gte: DateTime
  handle: String
  handle_not: String
  handle_in: [String!]
  handle_not_in: [String!]
  price: Float
  price_not: Float
  price_in: [Float!]
  price_not_in: [Float!]
  price_lt: Float
  price_lte: Float
  price_gt: Float
  price_gte: Float
  size: Float
}

This keeps the query arguments relatively flat, which is nice.

Aggregation and Bucket

When rendering an aggregation (should it be renamed to Facet?) on the frontend we should be able to see which type of facet we're actually rendering. Wouldn't something like the following be a better idea:

interface Bucket {
    title: String!
}
type ScalarBucket implements Bucket {
    id: ID!
    count: Int!
}
type RangeBucket implements Bucket {
    from: Float!
    to: Float!
    count: Int!
}

interface Facet {
    attribute_code: String!
    buckets: [Bucket!]!
}
type StatsFacet implements Facet {
    attribute_code: String!
    min: Float!
    max: Float!
    buckets: [ScalarBucket!]
}
type ScalarFacet implements Facet {
    attribute_code: String!
    buckets: [ScalarBucket!]!
}
type RangeFacet implements Facet {
    attribute_code: String!
    buckets: [RangeBucket!]!
}

Demo implementation

There are some inconsistencies on the demo endpoint that I think will/should be resolved at some point. So not sure how relevant they are:

I hope you find this valuable!

paales commented 4 years ago

Oh, I forgot something, Shouldn't ScalarBucket be have additional implementations like ImageBucket, ColorBucket and TextBucket to account for swatch information?

DrewML commented 4 years ago

@kokoc @nrkapoor what's the status with this PR? Seems like @paales might have some useful input