Closed PutinVladimir closed 2 years ago
Thanks for this. Let me run a few texts.
One thing I need to see is if this would work if I can search for title only and body only from crp_related so that the title and content can be weighted. But, it appears that this might be the case.
You have not option in settings for "body only"...
Hi, I've been running some checks. Although in the current status the extra indices are not needed, I'm planning on implementing a change in the algorithm to use weighting system for the title and the content. This is similar to what I'm done with my other plugin Better Search. In this case the individual indices are needed else the query fails.
You are wrong again.
And one more limit heavy index lenght is a key which can reduce size with acceptable results...
Better Search generates a query as below. This uses a weighting with a separate MATCH for the post_title and the post_content. Running this query without the content and title indices i.e. keeping only the (post_title,post_content) index created SQL queries of missing FULLTEXT indices. I haven't tested deleting the combined index. I will give that a shot as well to see what I can do.
SELECT SQL_CALC_FOUND_ROWS wp_posts.*, ID,
(MATCH(wp_posts.post_title) AGAINST ('sherlock' ) * 10 ) + (MATCH(wp_posts.post_content) AGAINST ('sherlock' ) * 1 ) AS score
FROM wp_posts
WHERE 1=1
AND MATCH (wp_posts.post_title,wp_posts.post_content) AGAINST ('sherlock' )
AND (wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'inherit')
AND wp_posts.post_type IN ('post', 'page', 'attachment')
AND wp_posts.post_password = ''
ORDER BY score DESC
LIMIT 0, 10
And one more limit heavy index lenght is a key which can reduce size with acceptable results...
Can you elaborate more on this? Is this an explicit setting I need to do?
In context of query maybe you can do a trick when you need content only search but use complex index title+content add to query title too but multiplay score of title match to same value to became it unimpotent and not affect the result of content...
About limit if you have a lot of big posts which is 5000-20000 chars length if limit index on content as 1000 chars (in other plugins this is usualy an option in setting where you can choose full or limit size index) you will get 100mb index against original 1000mb and and same quality for this plugin (not for search plugin where whole text matter).
In v3.0.0, I've dropped the need for the post_content database. https://github.com/WebberZone/contextual-related-posts/commit/bd92596263781ace255d810476fe2cabf6cdea27
The other two indices are used depending on whether a user enables the setting to search in content.
FULLTEXT indexes is duplicate size of wp_post. If you have 100000 posts is typical size of wp_posts is 200-300mb. FULLTEXT index on post_content will be 200-300mb. You create 3 indexes: crp_related (post_title, post_content); crp_related_title (post_title); crp_related_content (post_content);
and you have only one option in settings "Find related posts based on content as well as title" so i can guess you always select similar post by title or by title+content and never content only of posts. So for good working enough only one index crp_related. Index crp_related_content never used but is very heavy.
Also not needed heavy index is poor for db caching kyes it will clear it every time if size of key pool less then size of all indexes, but it should be good if it will be bigger then size of only one index crp_related and not exist other index.
I dropped crp_related_title and crp_related_content and nothing changed. So check query plan and delete unneeded idx.