Content Item links in Rich Text Components don't resolve correctly

kontent-ai / delivery-sdk-js

Kontent Delivery SDK for Javascript

https://kontent.ai

MIT License

50 stars 34 forks source link

Content Item links in Rich Text Components don't resolve correctly #319

Closed benmorana closed 3 years ago

benmorana commented 3 years ago

Brief bug description

When attempting to resolve the content-item links of a rich text field inside a rich text component, the item context values (second param) in the URLSlugResolver are no longer available. In our instance, we are defining a query level UrlSlugResolver to resolve all the URLs, and using values like the collection name to determine which site the content item belongs to, like so:


  private urlSlugResolver: ItemUrlSlugResolver = (link, context) => {
    const collection = context.item?.system.collection;
    // collection returns undefined 
    if (collection !== 'collection 1') {
      let urlPrefix = '';
      if (collection === 'collection_3') {
         urlPrefix = 'https://example-site-one.com';
      }
      if (collection === 'collection_2') {
         urlPrefix = 'https://example-site-two.com';
      }
      return {
        url: `${urlPrefix}${link.urlSlug === 'homepage' ? '/' : path || `/${link.urlSlug}`}`
      }
    }
    return { url: `${link.urlSlug}` }
  }

Repro steps

Create a Kentico Kontent project with two content types with the following properties, one with a rich text and a slug element (our Page type) and one with just a Rich Text field ( our Inner component type)
Create two Page content items, one with a slug and random text, and another one with a Slug and an Inner Page component embedded in the rich text field ( see screenshots below).
Create a content item link in the Inner component in the second page, that links to the first page:
Try to run urlSlugResolver at a query level in your code (use the sample code repo below) and notice that context for the links is not available

Expected behaviour

The item context should be available in the UrlSlugResolver, even when resolving a content items link inside a modular content item.

Test environment

Platform/OS: Node
Browser: Any
Version: 10.1.0

Reproduction

https://github.com/benmorana/linked-items-url-resolve-error - Have given you access :)

Enngage commented 3 years ago

Hi @benmorana, thanks for the issue!

I have one important question - in the case where context is not available, is the content item present in the response you get from Delivery API?

This is really important because when you add a link to content item within rich text, the actual content item is not added to response which is why the context cannot be made available - the content item itself is not available. This is different from using linked items because those are added to response based on sufficient depth.

In the end, I cannot recommend using content item links if you need additional data because the chances are that not all items will be available in response. You will probably have to do things a different way - maybe with using linked items or with a redirection route (so that you can load content item of url slug in redirection route and then redirect to proper route for that item - this way you wouldn't need additional context)

benmorana commented 3 years ago

Hi @Enngage,

Thanks for getting back to us so quickly on this issue! I can confirm that your assumption was correct - the content item that was being linked to was not included in the response.

We had to work through an intermediate solution for this problem to keep the project moving, but must admit it seems bizarre that content item links are not included in the API response. From what I can tell, this makes it impossible to fully take advantage of collections in any meaningful way. Instead we will probably have to revert to using plain old Web URLs to link between collections ( which is essentially the same thing as having content items in two different projects).

Do you know if there's any plans for the Delivery API to eventually include content item links as part of the API response?

Thanks again for helping us debug this issue :)

BorisPocatko commented 3 years ago

@Enngage If I may step in here: Can you also make our PMs aware of this issue? I think they are keen on using RTE a lot forward, but these kinds of issues will slow down or limit the adoption of RTE over linked items.

Enngage commented 3 years ago

@benmorana, @BorisPocatko There aren't any plans regarding this in neat future as far as I know. I've discussed this with our PMs several times in the past already, but I'll notify them again as its becoming quite often requirement / question..

Enngage commented 3 years ago

@benmorana This scenario will be possible in the next major version of this SDK as the resolver will accept Promises and so you can fetch additional content items from within the resolver. I can't say exactly when the update will be available though as there's gonna be a lot of changes and refactoring needed. I'm hoping to release it this autumn.

Enngage commented 3 years ago

@benmorana There is also quite a simple workaround you could do in the meantime before this is implemented in SDK - before resolving rich text with SDK, use a regex to parse out all links from the rich text field's raw value (this is just a html string) and get hold of all content ids. Once you have all ids, send a single request with system.id[in] filter. This will get you all content items needed and then you can proceed with resolving rich text. In resolver you simply check content item id, get it from your previously loaded array and construct URL.

Enngage commented 3 years ago

@benmorana One good thing :) I've created a sample code that does what I outlined in my previous message - it parses out all link ids from a certain rich text element, fetches all content items in a single response and uses it to construct url using the collection_id. Working example can be found here: https://stackblitz.com/edit/angular-ivy-y5tnl4?file=src/app/app.component.html

Code (for future reference when stackblitz is not available):

 const deliveryClient = new DeliveryClient({
      projectId: 'b9ec74cf-7cb8-011a-57b1-2bd5e0ad7477',
    });

    const linkContentItems: ContentItem[] = [];

    const items = (await deliveryClient
      .items()
      .equalsFilter('system.type', 'article')
      .queryConfig({
        urlSlugResolver: (link, context) => {
          // find content item from link
          const linkContentItem = linkContentItems.find(m => m.system.id === link.linkId);
          return {
             url: `my-url/${linkContentItem?.system.collection}/${linkContentItem.system.codename}`
          }
        }
      })
      .toPromise()).items;

    // parse all links and get their content item ids
    const contentItemIds = this.extractContentItemIds(
      items.map(m => m.text.value)
    );

    console.log('extracted link content item ids', contentItemIds);

    // fetch all content items from kontent in a single query
    linkContentItems.push(...(await deliveryClient
      .items()
      .inFilter('system.id', contentItemIds)
      .toPromise()).items);

    console.log('fetched link content items', linkContentItems);

    // resolve rich text
    for (const item of items) {
      const resolvedHtml = item.text.resolveHtml();
      console.log(`resolved html for '${item.system.codename}'`, resolvedHtml);
    }
  }

  private extractContentItemIds(texts: string[]): string[] {
    const ids: string[] = [];

    for (const text of texts) {
      const regex = new RegExp(`<a data-item-id="([a-zA-Z--_]*)"`, 'g');
      let match;

      while ((match = regex.exec(text)) !== null) {
        if (match.length && match.length >= 2) {
          ids.push(match[1]);
        }
      }
    }

    return ids;
  }

benmorana commented 3 years ago

@Enngage - That's great! Thanks for taking the time to come up with this solution. I can see what you're trying to do, but in our case I'm not sure this will totally work for us, given that most of the text content item links are located within Rich Text components inside the Page Body rich text element (i.e Body(Rich Text Field) -> Standard Page Block (Inline Component) -> Text with content item link. Because there are multiple different "blocks" or components that could contain content item links, we'd have to parse through all of the linked items first and execute regex matches for the main rich text fields before we can query the additional ids that will be used in the UrlSlugResolver. From a performance perspective that could be quite problematic.... Do you know if there's a way in the SDK to render out all of the HTML (including the one from the nested components) in one go? I'll have a look to see if we can adapt the solution you provided to our specific use case, but in any case thanks again for taking the time to look into 😄

Enngage commented 3 years ago

@benmorana I wouldn't be too much worried about the performance of the regex - it's miniscule compared to the performance of making actual http requests and fetching all the items. For a reference, I took the code above and modified it to run the regex 10 000 times and extracting all links took 9.7ms . 100 000 iterations took 64mson my (high end) machine.

perf

If you expect to iterate through more than 10k rich text elements you should probably do it completely differently and think about using Jamstack or something :) I'm guessing you will be nowhere these limitations because even the Delivery API has a limit on the number of content items that can be included within a single response - 2 000 (https://docs.kontent.ai/reference/delivery-api#tag/API-limitations)

This means that even if you had all 2 000 in your single response, all of them would need to have 5 rich text elements to even hit the 10 000 mark and we would still be looking at say 5-20ms worth of parsing time. It would depend on the length of the actual text of course, but I highly doubt it would make any "visible" difference.

Using the approach from above will most likely be a lot more efficient because you can manually "select" rich text elements from which you want to parse the links. SDK would likely have to do this for all the rich text elements in all content items, but maybe you can already filter this a bit.

Also, you don't need to resolve anything with the SDK in order to get all links, right? By going through all the rich text elements in the response you are able to parse all links that appear anywhere in the response so when it comes to actually resolving link with URL Slug resolved, the content item will already be there.

benmorana commented 3 years ago

@Enngage - Thanks again for the detailed response! We'll probably still need to test this on our end to make sure that this extra processing ( on top of some existing additional calls that we're currently making as part of a 'page' request) doesn't impact load times ( based on your tests it sounds like it shouldn't 🤞 ). Since cross-collection linking isn't entirely necessary for our first release, we'll look to introduce your solutions in the Phase 2 of our project. Like this we'll be able to effectively test the before/after. Thanks again for your help in finding an effective interim solution 😄