Philo / Umbraco.Elasticsearch

Integration of Elasticsearch as a search platform for Umbraco v7.5+
MIT License

Unpublished items #24

Closed: ismailmayat closed this issue 5 years ago

ismailmayat commented 5 years ago

Under the hood the package uses the ContentService. Does it index both published and unpublished items?

Philo commented 5 years ago

Hi @ismailmayat. ES indexing (adding/updating/removing) is, as you say, triggered primarily via the ContentService and MediaService events.

See https://github.com/Philo/Umbraco.Elasticsearch/blob/master/src/Umbraco.Elasticsearch.Core/EventHandlers/SearchApplicationEventHandler_Base.cs#L50 for details on the specific events that are wired up. The ContentService.Saved event is not wired up by default.

If you wanted to, you could probably wire this up yourself, using code similar to that used for the other events, to trigger indexing when content is saved.

Does this answer your question?

ismailmayat commented 5 years ago

Not quite. I just wanted to confirm that unpublished items do not end up in the index, which they should not if only the content published event is wired up. I have an index that contains a load of unpublished items. This happened because the removal on unpublish was not happening, as we use our own key for indexing. We fixed that issue and then did a full rebuild, but the unpublished items are still there. If we publish and then unpublish those items, they are removed.

ismailmayat commented 5 years ago

OK, an update on this. When I delete the index, recreate it, and then rebuild, all is good: all my residual items are gone. I checked the source code and it looks like a rebuild does not delete the index? I thought that a rebuild deletes it and then does a full rebuild.

Regards

Ismail

Philo commented 5 years ago

The indices are all listed within the developer section. New indices are created with a timestamp-based suffix, and an alias is used to "point" to the active index. Once an index is no longer active it can be deleted manually; inactive indices are not removed automatically.

If you had documents that were associated with Umbraco nodes, and they were not removed from the ES index when those nodes were deleted/trashed, then there may be an issue to investigate.

If your ES index contains items that are not associated (by key) with an Umbraco node, there is no way for those documents to be removed, as they will not be contained in any triggered content or media events. In this case a new index must be built to cleanse any orphaned documents.
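To illustrate why orphaned documents survive event-driven removal, here is a minimal in-memory sketch in Python. It is not the package's code and does not call Elasticsearch; the `custom-<id>` key scheme, the function names, and the sample nodes are all assumptions made up for the example.

```python
# Hypothetical in-memory model: a dict stands in for an ES index (doc ID -> document).
# Nothing here is the package's real API.

def remove_on_unpublish(index: dict, node_key: str) -> bool:
    """Remove the document whose ID equals the node key; report whether one existed."""
    return index.pop(node_key, None) is not None


def rebuild(published_nodes: dict) -> dict:
    """Build a fresh index containing only the currently published nodes."""
    return {key: {"name": name} for key, name in published_nodes.items()}


# Assumption: documents were originally indexed under a custom key scheme.
index = {"custom-1001": {"name": "Home"}, "custom-1002": {"name": "Old news"}}

# Node 1002 is unpublished. The event carries the node key "1002",
# which matches no document ID, so nothing is removed.
removed = remove_on_unpublish(index, "1002")

# Only node 1001 is still published, so a freshly built index has no orphan.
fresh_index = rebuild({"1001": "Home"})
```

In this model the stale `custom-1002` document can never be reached by a key-based removal, which is why only building a new index clears it.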

The index creation process is designed to allow a new index to be created in place without affecting the currently active index. Once the index build is completed, you can then make the new index active.
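The build-then-activate flow can be sketched roughly as follows. This is an in-memory Python model only, not the package's implementation or the Elasticsearch client API; the `IndexRegistry` class, the `umbraco` alias name, and the exact timestamp format are assumptions for illustration.

```python
from datetime import datetime


class IndexRegistry:
    """Hypothetical stand-in for a cluster holding several indices behind one alias."""

    def __init__(self, alias):
        self.alias = alias
        self.indices = {}   # index name -> {doc ID -> document}
        self.active = None  # name of the index the alias currently points to

    def create_index(self, now):
        # Assumed naming scheme: alias plus a timestamp-based suffix.
        name = f"{self.alias}-{now:%Y%m%d%H%M%S}"
        self.indices[name] = {}
        return name

    def activate(self, name):
        # Repoint the alias; the previously active index is left in place.
        if name not in self.indices:
            raise KeyError(name)
        self.active = name

    def delete_index(self, name):
        # Inactive indices are only removed when someone asks for it.
        if name == self.active:
            raise ValueError("cannot delete the active index")
        del self.indices[name]

    def search(self, doc_id):
        # Reads always go through the alias, i.e. the active index.
        return self.indices[self.active].get(doc_id)


registry = IndexRegistry("umbraco")
old = registry.create_index(datetime(2019, 1, 1, 9, 0, 0))
registry.indices[old]["1002"] = {"name": "Old news"}  # stale document
registry.activate(old)

# Build a new index alongside the active one; reads are unaffected meanwhile.
new = registry.create_index(datetime(2019, 1, 2, 9, 0, 0))
registry.indices[new]["1001"] = {"name": "Home"}
served_during_build = registry.search("1002")

# Make the new index active, then manually delete the old one.
registry.activate(new)
registry.delete_index(old)
```

After activation, searches no longer see the stale document, and the old index disappears only because it was deleted explicitly.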