adobe / helix-run-query

service that executes queries on BigQuery datasets generated by Helix-Logging
Apache License 2.0
6 stars 11 forks source link

Add Vulcain Filters #101

Open trieloff opened 4 years ago

trieloff commented 4 years ago

In order to speed up the metadata retrieval of the query results, let's add Helix Vulcain filters and make sure that every query that has a page URL as the result also returns a metadata URL, i.e. a pointer to a JSON resource that has the most important page metadata like title, intro, hero image, etc.

MarquiseRosier commented 4 years ago

@trieloff Pointers to JSON resources? Do you mind breaking this down a little more?

trieloff commented 4 years ago

When you request https://adobeioruntime.net/api/v1/web/helix/helix-services/run-query@v2/top-blogposts?limit=10, you get a response like this:

{
    "results": [
        {
            "blog": "/posts/remove-the-mapping-misery-and-keep-the-creativity",
            "reqs": 905
        },
        {
            "blog": "/posts/2020/adobe-unveils-comprehensive-report-analyzing-effectiveness-of-premium-versus-non-premium-media",
            "reqs": 321
        },
        {
            "blog": "/posts/adobe-named-a-leader-in-forresters-latest-enterprise-marketing-software-suites-report",
            "reqs": 151
        },
        {
            "blog": "/posts/lorem-ipsum",
            "reqs": 92
        },
        {
            "blog": "/posts/creating-adobe-experience-platform-pipeline-with-kafka",
            "reqs": 69
        },
        {
            "blog": "/posts/2020/your-2020-creative-resolutions",
            "reqs": 47
        },
        {
            "blog": "/posts/2020/52-weeks-of-productivity",
            "reqs": 39
        },
        {
            "blog": "/posts/2020/how-imm-is-empowering-students-to-become-the-masters-of-their-own-digital-learning",
            "reqs": 35
        },
        {
            "blog": "/posts/2020/top-three-visual-trends-of-2019-from-behance",
            "reqs": 35
        },
        {
            "blog": "/posts/2020/challenge-yourself-make-2020-your-most-creative-year",
            "reqs": 28
        }
    ],
    "truncated": false
}

With helix-vulcain-filters, you'd be able to make a request and include the Preload: /results/*/blog header, asking the server to push the contents of the URL of that field, so the server would push for instance /ms/archive/posts/2020/how-a-cloud-native-cms-makes-content-delivery-faster-and-easier-content-creation-for-a-modern-era.html.

The problems with this are:

So the solution is twofold:

  1. find or create a JSON resource that renders the metadata of a given Helix Pages page. This might require adding a new template to helix-pages
  2. create a new field (e.g. meta) in the blog-posts query that points to the new URL, for instance /ms/archive/posts/2020/how-a-cloud-native-cms-makes-content-delivery-faster-and-easier-content-creation-for-a-modern-era.meta.json

As a consumer, you'd then change the header to Preload: /results/*/meta and get all the metadata URLs pushed to you, which can be parsed easily.