localgovdrupal / localgov_directories

Searchable, filterable, directories of information, and locations.

Unpublished directory items aren't listed in a directory channel #404

Open stephen-cox opened 1 month ago

stephen-cox commented 1 month ago

If an unpublished item is added to a directory channel, it isn't listed in the channel. Likewise, if a directory item is unpublished or archived, it disappears from the channel. This happens whether or not you have permission to view unpublished items.

This causes problems when building a directory channel that you want to preview before making it live.

stephen-cox commented 1 month ago

The simple fix for this is to disable the entity status processor on the directories index (/admin/config/search/search-api/index/localgov_directories_index_default/processors) and then re-index all the content.

[Screenshot: "Manage processors" page for the Directories search index]

@ekes Can you remember if there was a reason for excluding unpublished directory items?

Is there any reason why we shouldn't turn this processor off for new and existing installs?

ekes commented 1 month ago

Yes, because there isn't necessarily an entity access check before showing content from Search API. General rule of thumb: don't index stuff in Search API that you don't want to show to a user who can access results from it.

stephen-cox commented 1 month ago

So people would like to create a directory of unpublished items which they can then preview before publishing. If a directory channel can't show unpublished items because we can't put unpublished items in a search index, then this simply isn't possible.

From my testing, there is an access check when viewing a directory channel. With the entity status processor turned off anonymous users only see published items while admins see all of them. Where else are we using the directory index that might not be doing access checks?

ekes commented 1 month ago

> From my testing, there is an access check when viewing a directory channel. With the entity status processor turned off anonymous users only see published items while admins see all of them. Where else are we using the directory index that might not be doing access checks?

Any time it's configured with Solr not to do entity loads but to use the indexed content directly. That's the most obvious one.

We'd have to check all the views we supply and make sure they do an entity load and an access check. The same goes for anything exposed as API feeds (openreferral).
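As a rough illustration of what "do an entity load and an access check" means for a listing, here is a minimal, language-neutral sketch in Python. It is not Drupal or Search API code: `Item`, `User`, `can_view`, `visible_results` and the sample data are all hypothetical names made up for this example. The index only hands back ids, and each id is loaded and checked against the current user before it is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a directory item entity."""
    id: int
    title: str
    published: bool


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for the user viewing the channel."""
    name: str
    can_view_unpublished: bool = False


def can_view(user, item):
    """Access check: published items are visible to everyone,
    unpublished ones only to users holding the permission."""
    return item.published or user.can_view_unpublished


def visible_results(user, result_ids, storage):
    """Load each item returned by the index and keep only those
    the user is allowed to see."""
    visible = []
    for item_id in result_ids:
        item = storage.get(item_id)  # the "entity load"
        if item is not None and can_view(user, item):
            visible.append(item)
    return visible


# Hypothetical sample data: one published and one unpublished item.
STORAGE = {
    1: Item(1, "Library", True),
    2: Item(2, "Draft youth club", False),
}
ANONYMOUS = User("anonymous")
EDITOR = User("editor", can_view_unpublished=True)
```

Under these assumptions, `visible_results(ANONYMOUS, [1, 2], STORAGE)` keeps only the published item, while the same call for `EDITOR` keeps both. Any display that renders index content without this load-and-check step would skip the filter entirely.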

Also, not in this index but on the directories page, I will be making, if I can succeed, a new index of locations that will not even load the entities when retrieving them from the database. This is because once a directory has enough locations, putting all the dots on the map causes thousands of entity loads, which hammers both SQL and memory. So while we might be able to do it for the listing of services, with a warning or something to anyone who extends their directory that it could expose unpublished content, we really won't be able to do it with the points on the map. The alternative there is to save the published status too and add that as a filter to our view. At least that would be clear, but it is particular work.
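The alternative of saving the published status in the index and filtering on it could look roughly like the sketch below. It is again plain Python for illustration only, not Search API or Views configuration: the row layout, the `status` field name, and `map_points` are assumptions made up for this example. The point is that the filter works on indexed values alone, so no entities are loaded for the map.

```python
def map_points(index_rows, can_view_unpublished):
    """Return (lat, lon) pairs straight from index rows, with no
    entity loads. Rows with a false `status` are dropped unless
    the viewer may see unpublished content."""
    return [
        (row["lat"], row["lon"])
        for row in index_rows
        if row["status"] or can_view_unpublished
    ]


# Hypothetical index rows: the published status is stored
# alongside the coordinates at index time.
LOCATION_INDEX = [
    {"id": 1, "lat": 51.5, "lon": -0.12, "status": True},
    {"id": 2, "lat": 53.48, "lon": -2.24, "status": False},
]
```

With this shape, an anonymous viewer gets only the first point and a viewer with the permission gets both. The cost is that the status has to be kept in sync in the index whenever an item is published or unpublished.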

Oh! And the content (full text search) is indexed. It is rendered as 'anonymous user', so what ends up in the token index? Maybe no content then? Or would it still cause matches to happen?