pantheon-systems / solr-power

A WordPress plugin to connect to Pantheon's Apache Solr search infrastructure, or your own!
https://wordpress.org/plugins/solr-power/
GNU General Public License v2.0

Native ACF Support #349

Open ataylorme opened 6 years ago

ataylorme commented 6 years ago

Not sure if this should be in Solr Power or another plugin to extend Solr Power. Let's discuss!

carl-alberto commented 4 years ago

Hi @danielbachhuber

Not sure if this is the correct issue to raise this in for the Gutenberg era. As far as we understand, the plugin indexes all text in the post_content column of the wp_posts table by default, but it does not seem to index text that sits inside the metadata of an ACF block implemented using this method: https://www.advancedcustomfields.com/blog/acf-5-8-introducing-acf-blocks-for-gutenberg/

We ran a sample search for the keyword "treatment". For a regular post, the search returns the record if post_content is:

<!-- wp:paragraph --> <p>this text has treatment in regular Gutenberg</p> <!-- /wp:paragraph -->

but not when inside an ACF block:

<!-- wp:acf/image-heading-description-module { "id": "block_5ed5e323607fb", "name": "acf\/image-heading-description-module", "data": { "image": "", "_image": "field_5ec33f4338407", "content": "This text has treatment keyword in ACF", "_content": "field_5ec33e5b17b45" }, "mode": "edit" } /-->

 

Do we have a known solution/workaround for this?

Thanks!

danielbachhuber commented 4 years ago

Hey @carl-alberto,

As it turns out, Solr Power runs the post_content through strip_tags() prior to indexing: https://github.com/pantheon-systems/solr-power/blob/458cd5bd889e38baf57cab6da9535b9ec4417a8d/includes/class-solrpower-sync.php#L273

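strip_tags(), unfortunately, also strips out HTML comments. Because an ACF block stores its field data inside the block's HTML comment, that text is removed before it reaches the index.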
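To make that concrete, here is a minimal standalone sketch (plain PHP, no WordPress required) that runs the two sample post_content values from the report through strip_tags(). The ACF block is abbreviated from the one above, and the variable names are only illustrative.

```php
<?php
// Sample post_content values based on the report above (ACF block abbreviated).
$paragraph = '<!-- wp:paragraph --> <p>this text has treatment in regular Gutenberg</p> <!-- /wp:paragraph -->';
$acf_block = '<!-- wp:acf/image-heading-description-module { "data": { "content": "This text has treatment keyword in ACF" }, "mode": "edit" } /-->';

// strip_tags() removes HTML tags and HTML comments alike.
$paragraph_indexed = strip_tags( $paragraph );
$acf_indexed       = strip_tags( $acf_block );

// The paragraph text sits outside the comments, so the keyword survives.
var_dump( false !== strpos( $paragraph_indexed, 'treatment' ) ); // bool(true)

// The ACF text sits inside the comment, so it is stripped with it.
var_dump( false !== strpos( $acf_indexed, 'treatment' ) ); // bool(false)
```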

If you'd like to only strip out HTML and not HTML comments, here's a code snippet you could apply on the site:

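// Re-set post_content on the Solr document so that HTML tags are removed
// but HTML comments (including ACF block data) are kept.
add_filter(
    'solr_build_document',
    function( $doc, $post_info ) {
        // wp_filter_nohtml_kses() strips HTML tags without dropping comments.
        $doc->setField( 'post_content', wp_filter_nohtml_kses( $post_info->post_content ) );
        return $doc;
    },
    10,
    2
);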

I've verified its behavior with #456. You'll need to re-index the site, however.
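One side effect to be aware of: keeping the comments presumably means the whole block JSON (ids, field keys, and so on) ends up in the indexed text. If only the readable values are wanted, something along these lines might work. This is a rough sketch, not part of Solr Power: `example_acf_block_text()` is a hypothetical helper, and the regex assumes self-closing ACF block comments shaped like the sample above. Inside WordPress, core's parse_blocks() is likely a more robust way to read block attributes.

```php
<?php
/**
 * Hypothetical helper (not part of Solr Power): collect the readable string
 * values stored in the "data" attribute of ACF block comments.
 */
function example_acf_block_text( string $post_content ): string {
    $texts = array();

    // Assumed shape: <!-- wp:acf/block-name {json} /-->
    if ( preg_match_all( '#<!--\s+wp:acf/[a-z0-9-]+\s+(\{.*?\})\s+/-->#s', $post_content, $matches ) ) {
        foreach ( $matches[1] as $json ) {
            $attrs = json_decode( $json, true );
            if ( ! is_array( $attrs ) || empty( $attrs['data'] ) || ! is_array( $attrs['data'] ) ) {
                continue;
            }
            foreach ( $attrs['data'] as $key => $value ) {
                // Keys starting with "_" hold ACF field references, not content.
                if ( is_string( $value ) && '' !== $value && '_' !== substr( (string) $key, 0, 1 ) ) {
                    $texts[] = $value;
                }
            }
        }
    }

    return implode( ' ', $texts );
}

// The ACF block from the report above.
$sample = <<<'HTML'
<!-- wp:acf/image-heading-description-module { "id": "block_5ed5e323607fb", "name": "acf\/image-heading-description-module", "data": { "image": "", "_image": "field_5ec33f4338407", "content": "This text has treatment keyword in ACF", "_content": "field_5ec33e5b17b45" }, "mode": "edit" } /-->
HTML;

$acf_text = example_acf_block_text( $sample );
echo $acf_text, "\n"; // This text has treatment keyword in ACF
```

The returned string could then be passed to `$doc->setField()` inside the `solr_build_document` filter. Which field to put it in (appended to post_content or a separate one) is left open here.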