WordPress / Learn

WordPress.org Learn - The canonical source for the code and content behind https://learn.WordPress.org
277 stars 100 forks source link

Support excluding posts based on taxonomy in search queries #2843

Closed trakos closed 3 months ago

trakos commented 3 months ago

Attempt to fix https://github.com/WordPress/Learn/issues/2735

In https://github.com/WordPress/Learn/pull/2830 a taxonomy show was added. It also added a NOT IN condition to the search query, and posts with value hidden should not appear in search results.

It appears that Jetpack Search, which is used as a backend for searching lessons, does not support NOT IN queries for taxonomies out of the box. The code for converting tax_query is here. It returns early because queried_terms isn't set. If it would be set, it would actually be worse, as it would only include hidden results instead of excluding them.

In this PR, I propose to add two Jetpack Search filters to manually fix the query. One excludes show from terms in case queried_terms would be present, and the other converts NOT IN queries to must_not query in Jetpack Search.

I don't have access to .org sandbox, so I tested it by adding a hacky print_r:

if ( isset($this->query['s']) ) {
    $search = \Automattic\Jetpack\Search\Classic_Search::instance(47261184);
    $search->do_search( $this);
    echo '<pre>';
    print_r($search->get_search_result());
    die();
}

in wp-includes/class.wp-query.php on line 3161 (before applying filter posts_pre_query, which is what normally triggers Jetpack Search) in WP Learn Docker container. I've verified that the problematic post_id disappeared when visiting http://localhost:8888/lessons/?s=quiz&post_type=lesson.

trakos commented 3 months ago

cc @adamwoodnz

adamwoodnz commented 3 months ago

Thanks @trakos! Tested in sandbox and it works for the Quiz time lesson ✅

I added the term to another lesson, Test your knowledge, but it is still showing in the search results.

Screenshot 2024-08-16 at 12 15 03 PM

I ran 'Schedule Checksum' and then 'Schedule Fix', but that didn't help.

Could you please help me understand the lifecycle for this data; how soon should we expect lessons to be hidden after applying the term in the admin? It looks the cron runs nightly, but I was hoping my actions would have had the same effect...

adamwoodnz commented 3 months ago

Tested this again today and it does seem to be working, even if there is some delay. Merging, but it would still be good to understand the timing of data sync.

trakos commented 3 months ago

Could you please help me understand the lifecycle for this data; how soon should we expect lessons to be hidden after applying the term in the admin?

I think it should normally be updated in minutes. Indexing delay is set to 30 seconds, sync should take a couple minutes, and the search is cached for 5 minutes. That particular post was last re-indexed on 2024-08-18 22:56:36 UTC after it was modified on 2024-08-18 22:54:40.

Did you update the terms simply by editing the post through wp-admin? Trying to figure out if the right Jetpack Sync actions were called.

adamwoodnz commented 3 months ago

Did you update the terms simply by editing the post through wp-admin? Trying to figure out if the right Jetpack Sync actions were called.

Yeah just wp-admin. That timing seems consistent with what I've observed. Thanks again for your help.