Closed bluemodus-mwills closed 1 week ago
Thank you for notifying us of the issue. We will proceed to troubleshoot and keep you updated on our analysis.
Thank you, Eva
Hello, I do not think it makes sense to index reusable items. The whole point of indexed items is that you can use them for searching on a live site. Reusable items can only be searched and viewed if they are shown on a site, where they are visible at the URL of a WebPageItem. So indexing both reusable items and WebPageItems would result in the same page being indexed multiple times.
It makes sense to trigger reindexing if a reusable item is edited, because the edit can change the pages that use it. Basically, if a reusable item is changed, the FindItemsToReindex method should return all the WebPageItems that reference the changed reusable item.
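A minimal sketch of that idea, written as a toy Python model rather than the library's C# API. The names WebPage, linked_reusable_ids, and find_pages_to_reindex are hypothetical and only illustrate the lookup from a changed reusable item to the pages that reference it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebPage:
    # Hypothetical stand-in for a WebPageItem: a URL plus the IDs of the
    # reusable items the page links to.
    url: str
    linked_reusable_ids: frozenset


def find_pages_to_reindex(changed_reusable_id, pages):
    """Return every page that references the changed reusable item."""
    return [p for p in pages if changed_reusable_id in p.linked_reusable_ids]


# Illustrative data only.
pages = [
    WebPage("/resources", frozenset({"pdf-1", "pdf-2"})),
    WebPage("/about", frozenset()),
    WebPage("/news/report", frozenset({"pdf-1"})),
]

affected = find_pages_to_reindex("pdf-1", pages)
affected_urls = [p.url for p in affected]
```

With this data, editing "pdf-1" queues "/resources" and "/news/report" for reindexing, while "/about" is left alone. An item that no page links to produces an empty list, so under this approach it would never reach the index.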
@bkapustik -- In general I agree with you. Most reusable content items should not be indexed as separate stand-alone items, but I gave you a scenario that makes sense and is common -- a reusable item that is an asset, like a PDF file, that should appear as a unique item in search results regardless of if or where it's linked on the site.
Imagine a reusable content item with fields like this:
Type: Resource Fields:
Such an item would need to be crawled independent of what web page it's linked from, or even if a page that linked to it was archived. Customers often want PDF files to appear directly in search results.
This would not lead to the same page being indexed multiple times, because the PDF and whatever page that links to it would be separate items in the index with separate URLs for viewing.
Additionally, the documentation states that reusable items are indexable, so this appears to be something that is supposed to work but currently works only halfway.
Hello,
The scenario @bluemodus-mwills describes here makes sense to me. At the same time, I would like to say that the index should include every type of content configured for indexing, which I believe a reusable content item is. Subsequently, it is up to everyone how they will design the search results to render a link to a page or a link to download/display a PDF file, for example.
It would make even more sense if we process this one: #63
@bkapustik, please don't forget that we also have Algolia and Azure search, where we should offer consistent behavior in case we process this one.
Thank you, @DavidSlavik
Hello, the functionality is implemented in PR #71.
Summary
Rebuilding an index removes all reusable content items from an index. This means that even though it’s possible to create an indexing strategy that includes reusable content, it is not truly supportable. Reusable content will be indexed when it is created or updated, but if an admin clicks Rebuild in the Lucene index management UI, the indexed content will be removed.
To Reproduce
Create a custom indexing strategy that overrides the DefaultLuceneIndexingStrategy.FindItemsToReindex(IndexEventReusableItemModel) method so that it adds changed reusable items to the index.
In the same strategy, have MapToLuceneDocumentOrNull handle the case where the item parameter is an IndexEventReusableItemModel and create a Lucene document.
In the XbyK admin, create a Lucene index that uses the custom indexer.
Include some web channel types and a path that includes indexable web channel items.
Rebuild the index and observe the number of index entries. In my case it was 34.
Now edit and publish one Reusable Content Item.
Observe that the number of index entries increases by 1 (in my case 35) because reusable content is added to the index whenever an item is added or updated.
Rebuild the index. Observe that the number of entries drops back to the original number. This is because rebuilds of the index do not include reusable content items.
Expected behavior
When rebuilding the index, reusable content items should not be removed.
Library Version
Version 0.3.1 (but I'm looking at the latest code)
Code observations
It looks like DefaultLuceneClient is missing code in the RebuildInternal method to enumerate the reusable content items and queue them for indexing.
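To make the observation concrete, here is a toy Python model of the behavior. It is not the library's code: ToyIndex, upsert, rebuild, and the include_reusable flag are all hypothetical names, and the sketch assumes a rebuild simply clears the index and re-enumerates its content sources.

```python
class ToyIndex:
    """Toy model of an index with incremental updates and full rebuilds."""

    def __init__(self, web_pages, reusable_items):
        # Content sources, each a dict of item id -> text to index.
        self.web_pages = web_pages
        self.reusable_items = reusable_items
        self.entries = {}

    def upsert(self, item_id, content):
        # Incremental path: runs when a single item is created or updated.
        self.entries[item_id] = content

    def rebuild(self, include_reusable):
        # Full rebuild: start from an empty index and re-enumerate sources.
        self.entries = {}
        for item_id, content in self.web_pages.items():
            self.upsert(item_id, content)
        if include_reusable:
            # The step that appears to be missing in the reported behavior.
            for item_id, content in self.reusable_items.items():
                self.upsert(item_id, content)
        return len(self.entries)


# Mirror the reproduction steps: 34 web pages and one reusable item.
pages = {f"page-{i}": f"content {i}" for i in range(34)}
reusable = {"pdf-1": "Annual report"}
index = ToyIndex(pages, reusable)

after_first_rebuild = index.rebuild(include_reusable=False)   # 34
index.upsert("pdf-1", reusable["pdf-1"])
after_publish = len(index.entries)                            # 35
after_second_rebuild = index.rebuild(include_reusable=False)  # 34 again
after_fixed_rebuild = index.rebuild(include_reusable=True)    # 35
```

The counts follow the report: the entry added on publish disappears on the next rebuild unless the rebuild also enumerates reusable items. In the real library the indexing strategy would presumably still decide which of those items become documents, a detail this sketch leaves out.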