WebDevStudios / wp-search-with-algolia

Improve search on your site. Autocomplete is included, along with full control over look, feel and relevance.
https://wordpress.org/plugins/wp-search-with-algolia/

Documentation: Splitting Records - Large Content Flexible Sections #393

Closed samfrank closed 4 months ago

samfrank commented 7 months ago

Hey all!

I have a large amount of content built using ACF Flexible Content that exceeds the record size limit (in bytes) set by Algolia, so I need to work out how I can split the data up into multiple records, as seen in the documentation here

I can see that this works perfectly when I add a large amount of text to the content field, which can be traced to this function here

I would like to be able to customise or replicate this function to work with flexible content, so that each item in the flexible content becomes its own record, which would optimise our search functionality

At the moment, I am using add_filter to manipulate the attributes on all posts and pages. However, is there a filter that will allow me to edit the way that the post/page is synced directly? I can see on the wiki that there is a list of all the actions and filters, but there is no documentation on what they actually do, just the params they accept

If anyone has any ideas on how to achieve this or discussions that relate to this then please let me know

Thank you in advance
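
For readers following along, the "one record per flexible content item" idea from the question could be sketched roughly as below. This is not the plugin's own splitting function (that lives at the link mentioned above). The helper name `example_rows_to_records`, the `"<post ID>-<row index>"` objectID scheme, and the `record_index` and `content` attribute names are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper: build one record per Flexible Content row.
 *
 * @param int   $post_id Post ID the rows belong to.
 * @param array $shared  Attributes copied onto every record (title, permalink, ...).
 * @param array $rows    One plain-text string per row.
 * @return array List of records, one per non-empty row.
 */
function example_rows_to_records( $post_id, array $shared, array $rows ) {
    $records = [];

    foreach ( array_values( $rows ) as $index => $text ) {
        $text = trim( (string) $text );
        if ( '' === $text ) {
            continue; // Skip rows with nothing searchable.
        }

        $record = $shared;
        // Assumed objectID scheme: "<post ID>-<row index>", so every record is unique.
        $record['objectID']     = $post_id . '-' . $index;
        $record['record_index'] = $index;
        $record['content']      = $text;
        $records[]              = $record;
    }

    return $records;
}

$records = example_rows_to_records(
    42,
    [ 'post_title' => 'About us' ],
    [ 'Hero heading and intro.', '', 'Accordion answers.' ]
);
```

Each record repeats the shared attributes, so a hit on any row can still be displayed with the post's title and link.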

tw2113 commented 7 months ago

How much are you using the standard post content for the post type in question, if at all? Curious if we could swap out the content for the standard "post content" with your flexible content spot, so that it's the part that determines splitting records, and store the "post_content" content into a separate property.

samfrank commented 7 months ago

I am not using the standard post content/classic editor at all on this site. I tried assigning the array of sections to the 'content' key, but it would not save. I assume it is being overwritten by an empty string when the plugin looks for the standard content data

tw2113 commented 7 months ago

Try using https://github.com/WebDevStudios/wp-search-with-algolia/blob/c458917fd4f1862adb11dc24da99da224f878267/includes/indices/class-algolia-searchable-posts-index.php#L144-L146 to replace the $post->post_content return value with your flexible content. The $post object passed as the 2nd parameter can be used to fetch it. It'll then be run through the the_content filters for standard parsing, and used for the split content afterwards.

samfrank commented 7 months ago

I don't think this would work, as my flexible content is not a string, so I think it will error if it goes through these filters.

tw2113 commented 7 months ago

Well, you have string content of some sort in them; it'd just be a matter of parsing them out into the actual content to be indexed first, before filtering into this spot. Likely very similar to how you're outputting them for your site visitors now: the usual ACF pattern of if have_rows(), while have_rows(), etc. Build up the final result for the sake of the Algolia index/object properties.
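
A minimal sketch of that suggestion, under stated assumptions: the filter at the linked lines is taken to be `algolia_searchable_post_content` (check the linked source before relying on the name), the ACF field is assumed to be called `flexible_content`, and `heading` and `text` are placeholder sub field names. The helper `example_flexible_rows_to_text` is hypothetical. It works on the array that `get_field()` returns instead of the `have_rows()` loop, so it can be tried outside WordPress.

```php
<?php
/**
 * Hypothetical helper: flatten Flexible Content rows into one string.
 *
 * @param array    $rows      Rows shaped like the array get_field() returns.
 * @param string[] $text_keys Sub field names that hold indexable text (assumed).
 * @return string
 */
function example_flexible_rows_to_text( array $rows, array $text_keys ) {
    $parts = [];

    foreach ( $rows as $row ) {
        if ( ! is_array( $row ) ) {
            continue;
        }
        foreach ( $text_keys as $key ) {
            if ( isset( $row[ $key ] ) && is_string( $row[ $key ] ) && '' !== trim( $row[ $key ] ) ) {
                $parts[] = trim( $row[ $key ] );
            }
        }
    }

    // Blank lines between parts keep the fields apart in the final content.
    return implode( "\n\n", $parts );
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter(
        'algolia_searchable_post_content', // Assumed filter name, see lead-in.
        function ( $content, $post ) {
            $rows = function_exists( 'get_field' ) ? get_field( 'flexible_content', $post->ID ) : null;

            return is_array( $rows )
                ? example_flexible_rows_to_text( $rows, [ 'heading', 'text' ] )
                : $content;
        },
        10,
        2
    );
}
```

Because the callback returns a plain string, the value can pass through the content filters and the plugin's existing record splitting like normal post content.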

tw2113 commented 7 months ago

Any thoughts or new developments here @samfrank ?

Bjornnyborg commented 4 months ago

@samfrank I had the same problem recently, and I decided to build a small custom formatter for the flexible content data. My ACF content contained a lot of data that I wasn't going to use in Algolia anyway, so I simply stripped all the unnecessary data and only indexed the relevant data.

This is my function:

<?php

add_filter( 'algolia_post_shared_attributes', 'custom_attributes', 10, 2 );
add_filter( 'algolia_searchable_post_shared_attributes', 'custom_attributes', 10, 2 );

function recursively_get_text_from_acf( $array , $props = []) {
    $text = '';

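    foreach ( $array as $key => $value ) {
        // Integer keys are list indexes (e.g. repeater rows), so always descend into them.
        if ( is_int( $key ) || in_array( $key, $props, true ) ) {
            if ( is_array( $value ) ) {
                // Pass $props along so nested levels are filtered by the same keys.
                $text .= recursively_get_text_from_acf( $value, $props );
            } elseif ( is_string( $value ) ) {
                // Only add text if $value is a string to ensure it's text content.
                $text .= strip_tags( $value );
            }
        }
    }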

    return $text;
}

function format_flexible_content( $blocks ) {
    $blockContent = [];

    $props = [
        "text", 
        "content", 
        "heading", 
        "name", 
        "title", 
        "quote", 
        "main_content", 
        "accordion_elements", 
        "indhold"
    ];

    foreach ( $blocks as $block ) {
        $data = "";

        foreach ( $block as $key => $value ) {
            if ( in_array( $key, $props ) ) {
                if ( isset( $block[ $key ] ) && ! empty( $block[ $key ] )) {
                    if ( is_string( $block[ $key ] ) ) {
                        $data .= strip_tags( $block[ $key ] );
                    } else if ( is_array( $block[ $key ] ) ) {
                        $data .= recursively_get_text_from_acf( $block[ $key ] , $props);
                    }
                }
            }
        }

        $blockContent[] = $data;
    }

    return implode("\n", $blockContent);
}

/**
 * @param array   $attributes
 * @param WP_Post $post
 *
 * @return array
 */
function custom_attributes( array $attributes, WP_Post $post ) {
    if ( 'page' === $post->post_type ) {
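        $blocks = get_field( 'flexible_content', $post->ID );
        // get_field() can return a non-array value when the field is empty, so guard before looping.
        $attributes['flexible_content'] = is_array( $blocks ) ? format_flexible_content( $blocks ) : '';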
    }

    return $attributes;
}

By only adding the necessary data, I went way below the record limit. 😊
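
One rough way to check how close a record is to the limit is to measure its JSON-encoded size. The sketch below is only an approximation of how Algolia counts bytes, the helper names are hypothetical, and the limit is passed in as a parameter because the actual value is not stated in this thread.

```php
<?php
/**
 * Hypothetical helper: approximate a record's size as the byte length of its JSON.
 *
 * @param array $record Attributes that would be sent to the index.
 * @return int Size in bytes, or 0 if the record cannot be encoded.
 */
function example_record_size_in_bytes( array $record ) {
    $json = json_encode( $record, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES );

    // strlen() counts bytes, not characters, which is what a byte limit needs.
    return false === $json ? 0 : strlen( $json );
}

/**
 * Hypothetical helper: does the record fit within a given byte budget?
 *
 * @param array $record    Attributes that would be sent to the index.
 * @param int   $max_bytes Limit to check against (assumed, supply your own).
 * @return bool
 */
function example_record_fits( array $record, $max_bytes ) {
    return example_record_size_in_bytes( $record ) <= $max_bytes;
}
```

Running this over a few of the largest pages before and after stripping unused fields gives a quick sense of how much headroom the formatter buys.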

tw2113 commented 4 months ago

Looks like a solid and viable solution @Bjornnyborg. Thanks for sharing.

samfrank commented 4 months ago

Thanks for sharing @Bjornnyborg - will definitely be using this approach in the future!

tw2113 commented 4 months ago

At least at this time, we're not going to make new edits to the core plugin, as this is a conditional need.

However, I think I'm going to convert this to a documentation task as a way to index and condense flexible content into something that can be worked with.

@Bjornnyborg To be certain, your props array is going to be the field names/keys for all the subitems in a Flexible Content field type? Wanting to make sure I reference things properly as we get whatever documentation we can out of this.

Bjornnyborg commented 4 months ago

@tw2113 Yes, exactly... The function recursively loops through all the properties of the JSON object, and only saves the data that has a key equal to one of the strings in the $props array.

A potential issue is if you want a property called "title" in level 2, but you also have a "title" in level 1 of the tree. In that case this function would save both to Algolia. :)
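
One possible way around that collision, sketched here as an assumption and not as part of the shared snippet, is to match on the full key path instead of the bare key. The helper `example_text_by_path` and the dotted-path convention are hypothetical. The `accordion_elements`, `title`, and `content` names are borrowed from the $props array above, and the sample row is made up.

```php
<?php
/**
 * Hypothetical variant: collect text only where the full key path is allowed.
 *
 * @param array    $data          Nested array such as one Flexible Content row.
 * @param string[] $allowed_paths Dotted paths; integer list indexes are left out of the path.
 * @param string   $prefix        Path of the current level (used by the recursion).
 * @return string[] Plain-text fragments in document order.
 */
function example_text_by_path( array $data, array $allowed_paths, $prefix = '' ) {
    $found = [];

    foreach ( $data as $key => $value ) {
        // List indexes (e.g. repeater rows) do not add a path segment.
        if ( is_int( $key ) ) {
            $path = $prefix;
        } else {
            $path = '' === $prefix ? $key : $prefix . '.' . $key;
        }

        if ( is_array( $value ) ) {
            $found = array_merge( $found, example_text_by_path( $value, $allowed_paths, $path ) );
        } elseif ( is_string( $value ) && in_array( $path, $allowed_paths, true ) ) {
            $found[] = trim( strip_tags( $value ) );
        }
    }

    return $found;
}

// Made-up row: a level 1 "title" that should be ignored, and level 2 titles that should be kept.
$row = [
    'title'              => 'Internal block label',
    'accordion_elements' => [
        [ 'title' => 'Shipping', 'content' => '<p>Ships in 2 days.</p>' ],
        [ 'title' => 'Returns', 'content' => '<p>30 day returns.</p>' ],
    ],
];

$fragments = example_text_by_path( $row, [ 'accordion_elements.title', 'accordion_elements.content' ] );
```

The trade-off is a longer allow-list, since every nested field has to be named by its full path.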

Hope this makes sense!

tw2113 commented 4 months ago

Not direct wiki/docs integration, but I have added the snippet to our Snippet Library repo at https://github.com/WebDevStudios/algolia-snippet-library/blob/main/indexing/index-acf-flexible-content.md