divio / aldryn-search

Haystack 2.0 search index for django CMS
Other
48 stars 77 forks source link

Eviction of unpublished pages fails #50

Closed jakob-o closed 8 years ago

jakob-o commented 8 years ago

Hi,

another issue I am having is that a page once published it will not be evicted from the index when unpublished again, since should_update returns false and no delete action is issued. This could be solved by adding a published flag on the index and filter on query time. To avoid doubling the index size for pages we could even check if instance.page.publisher_public exists, to make sure we do not add title drafts that have no public page (meaning they are draft only). If get_published on the index by default returns true this should be mostly backwards compatible. The exception would be a newer index implementation in for e.g. aldryn newsblog and an outdated aldryn search implementation (which then would not filter on the published flag). If you agree that this would be a good idea, I would be happy to create pull requests for this and the other aldryn repositories.

Regards

Jakob

czpython commented 8 years ago

Hello Jakob,

Thanks for the suggestion but I don't like the idea of indexing unpublished titles or any unpublished object for that matter. Anything that's searchable should be "published". The logic of filtering by published just adds more overhead since it's something that would happen 90% of the cases.

To solve this we can leverage the realtime signal processor and extend it or even provide another one that calls handle_delete when the page is being unpublished.

jakob-o commented 8 years ago

Sounds good to me. Does this mean, indices should issue a delete signal in the should_update call when detecting an unpublishing, or where would this be located?

czpython commented 8 years ago

This would be located in a custom signals processor that subclasses from haytstack's.

There wouldn't be a delete signal, when a page is unpublished it's save() method is called so the hastacyk handle_update method on signal handler is called, so in this method we need to add logic to check if the instance is a cms page and if it's being unpublished, if so then call the handle_delete method on the same class.

jakob-o commented 8 years ago

Hey,

I just gave this a thought and would suggest to implement a AldrynSignalProcessor that inherits from the RealtimeSignalProcessor including an additional post_unpublish signal that is connected to self.handle_delete. This indirection would allow to implement unpublishing detection in external apps (such as aldryn newsblog) in post_save signals that would then send a post_unpublish signal. If you agree, this is the way to go, I would be glad to implement it and create a pull request.

Regards

Jakob

jakob-o commented 8 years ago

I created an implementation for reference here https://github.com/aldryn/aldryn-search/pull/54 Any feedback on improvements for design / implementation are very welcome.

czpython commented 8 years ago

Fixed by #54