OnTheGoSystems / wpml-elasticpress

WPML ElasticPress adds support for languages in ElasticPress.
GNU General Public License v3.0

"php wp-cli.phar wpml_elasticpress" does not sync secondary language #49

Closed danielmuehlbacher closed 2 months ago

danielmuehlbacher commented 3 months ago

Hi there,

I'm using the following command:

php wp-cli.phar wpml_elasticpress sync

Although the command reports that the DE and EN indexes are synced, this is not the case: new content is not displayed on the page. When I do a manual sync in the WordPress admin interface, the new content becomes available in EN.

Is there any difference between the "php wp-cli.phar wpml_elasticpress sync" command and the manual sync in wordpress admin backend?

Thanks!

danielmuehlbacher commented 3 months ago

Do I have to use the --post-lang=de,en param?

decodekult commented 3 months ago

Hi Daniel

Let me see if I can help you. No, you should not need to add the language parameter. The sync command should do the whole job, crawling and syncing all the content based on its language.

OK, let me first try to find out what the situation is. I believe that you have a site with German as the primary language and translations into English, right? And the problem that you are experiencing is that, while syncing from the admin backend properly analyzes and syncs your English content, running the command line does not index it at all.

Would you mind checking the output of the command? Also, after running it, it would help me if you could check the stats dashboard, focusing on the boxes about index health and their reports, and share with me what you see.

danielmuehlbacher commented 3 months ago

Hi,

thank you for your quick response!

Yes, the primary language is German and the secondary language English. All your assumptions are correct.

I just ran the CLI command; here is the output:

[screenshots: screen1, screen2]

This is the index health: [screenshot: indexhealth]

And here's the status report: elasticpress-report.txt

I hope that helps with analysing the problem.

In addition, I have this custom ElasticPress code in functions.php:


add_filter( 'ep_formatted_args', function ( $formatted_args ) {
    $formatted_args['track_total_hits'] = true;

    return $formatted_args;
} );

add_filter( 'ep_index_name', function ( $index ) {
    if ( ! is_admin() && defined( 'ICL_LANGUAGE_CODE' ) ) {
        if ( ICL_LANGUAGE_CODE == 'en' ) {
            $index = 'wpwifoacat-post-1-en';
        }
    }

    return $index;
} );

function custom_ep_total_field_limit( $limit ) {
    return 15000;
}

add_filter( 'ep_total_field_limit', 'custom_ep_total_field_limit', 10, 1 );

decodekult commented 3 months ago

Hi Daniel

I tried to reproduce your environment, but apparently my indexing runs properly. There must be something that I cannot see, so let me rewind to the start.

I see in your screenshot that your index in the original language (German) has around 25.530 entries, while your index in the secondary language (English) has around 25.340 entries. There is a gap between those two; I assume that you have some content that is not translated.

When you say that "new content is not displayed on the page", can you clarify what that means? I would love to hear the whole process, like "I add a new translation to English for some German content, then I sync on the command line, but the new translation is not available on the site when I perform a search". Please be as detailed as you can; any small piece of information may hold the key to finding out what is happening.

Thanks in advance.

danielmuehlbacher commented 3 months ago

Hi,

thank you for your reply. I guess there must be some difference between the manual sync in WP admin and the CLI command?

For example, there is a post which exists in both DE and EN. A custom meta query with 'ep_integrate' => true returns the post for DE, but not for EN. When I do a manual ElasticPress sync in WP admin, the EN post is returned afterwards. When I run the CLI command, it is not.

All of this occurs for content which is added through an API. So it is not content which is edited in WP admin; all posts are created from an API in DE and EN. When I edit an EN post in WP admin, the ElasticPress sync for this post works.

Thank you!

decodekult commented 3 months ago

Hi there, I feel that we are getting closer :-)

Yes, there are differences between doing a manual sync in the dashboard and a CLI sync, but the way that WPML ElasticPress interacts with the native syncing basically consists of looping over all the languages defined on the site and running the native ElasticPress synchronization, adjusting the indexes so data is stored in the right place.

As you see, when you do a CLI indexing, your secondary language index does get populated. It seems to be missing some entries, which are indeed picked up when doing a dashboard sync and when editing the translation in the backend. Those are the three main mechanisms to alter an index: dashboard syncing, CLI syncing, and backend post update.

The fact that your missing data is loaded using an API rings a bell; I do feel that it is important. There must be something happening to those elements when syncing from the dashboard that is missing when syncing from the CLI. At first I thought that it might be related to your filter on ep_index_name, given that dashboard syncing happens via AJAX (hence it is considered to happen on the admin side) while CLI syncing happens... well, on the command line, hence it is considered the backend. But I am not that sure about it, to be honest.
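To make the ep_index_name hypothesis concrete, here is a minimal sketch of a filter that leaves the index name alone during both sync paths. In WordPress core, is_admin() is true for admin-ajax.php requests and is normally false under WP-CLI, so the filter quoted earlier treats the two syncs differently. The helper name example_pick_index_name is hypothetical, the index name is copied from that earlier filter, and this is only an illustration of the idea, not a confirmed cause or fix.

```php
<?php
/**
 * Hypothetical helper: decide which index name a request should use.
 *
 * $is_admin mirrors is_admin(); $is_cli mirrors ( defined( 'WP_CLI' ) && WP_CLI ).
 */
function example_pick_index_name( string $index, ?string $lang, bool $is_admin, bool $is_cli ): string {
    // During dashboard (AJAX) and WP-CLI syncs, do not touch the name,
    // so the indexing code can choose the index for each language itself.
    if ( $is_admin || $is_cli ) {
        return $index;
    }

    // Front-end request in English: point queries at the English index.
    if ( 'en' === $lang ) {
        return 'wpwifoacat-post-1-en';
    }

    return $index;
}

// Wire the helper into WordPress only when WordPress is actually loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'ep_index_name', function ( $index ) {
        $lang   = defined( 'ICL_LANGUAGE_CODE' ) ? ICL_LANGUAGE_CODE : null;
        $is_cli = defined( 'WP_CLI' ) && WP_CLI;

        return example_pick_index_name( (string) $index, $lang, is_admin(), $is_cli );
    } );
}
```

Keeping the decision in a plain function makes it easy to check each context (front end, dashboard, CLI) without a running WordPress site.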

There could be many reasons why your API-generated content is not picked up by the routine that indexes content. On our side, it is enough that the item is marked as belonging to the right language. I am slightly out of ideas, so let me try and ask some more questions:

I am sure we will get to the bottom of this, we might just need time!

danielmuehlbacher commented 3 months ago

Hi there,

I've now tried different variations and checked a lot of things. This is what I found out:

I hope this helps with finding a solution. I guess it must have something to do with the post status.

Best regards, Daniel

decodekult commented 3 months ago

Hi Daniel

Wow, great information here! I think we are close, and I have some ideas. From what I see, we do act upon posts being trashed or deleted, and we also act upon a queue that ElasticPress keeps of posts that were modified in the current request... but it seems that the transition between post status values is managed differently: it is adjusted on the fly, so it escapes our integration.

So we are missing the synchronization between languages when a post is turned into a draft, but probably also into a status of needing revision, or even a private status.

Just to confirm the last pieces, could you please share which ElasticPress features you have enabled? I would also love to know if we have any other conflict with a (yet) unsupported feature.

danielmuehlbacher commented 3 months ago

Hi,

thank you very much :)

Regarding your question: I have disabled all features of ElasticPress because I'm only using the

'ep_integrate' => true,

integration in our custom queries. But of course "WPML integration" is activated on the "Features" page of ElasticPress.

If you need any other information please let me know!

Best regards!

danielmuehlbacher commented 3 months ago

@decodekult I just had another case on the live website. 5 new posts (which changed from draft -> publish) were not displayed on the EN page.

The CLI script wpml_elasticpress sync did not help. Then I started the sync (without "delete all data"), and now they are displayed correctly.

decodekult commented 3 months ago

Hi @danielmuehlbacher

We have a pull request that might solve at least part of your issues. The pull request covers switching a post from an indexable status (like published) to a non-indexable status (like draft or pending review), which should effectively remove that document from all the indices containing it - note that WPML does not sync by default all the post status changes from one post to its translations: if you turn a published post into a draft, its translations may stay published!
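The status logic described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the code in the pull request: the function name example_action_for_status_change is hypothetical, and the default list of indexable statuses (only 'publish') is an assumption that a real site may extend.

```php
<?php
/**
 * Hypothetical helper: decide what the indices should do with a post
 * when its status changes.
 *
 * @param string   $old_status Status before the change.
 * @param string   $new_status Status after the change.
 * @param string[] $indexable  Statuses allowed in the index (assumed list).
 * @return string 'index', 'remove' or 'none'.
 */
function example_action_for_status_change( string $old_status, string $new_status, array $indexable = array( 'publish' ) ): string {
    $was_indexable = in_array( $old_status, $indexable, true );
    $is_indexable  = in_array( $new_status, $indexable, true );

    if ( $is_indexable ) {
        // The post is visible after the change: (re)index it wherever it belongs.
        return 'index';
    }

    if ( $was_indexable ) {
        // Indexable -> non-indexable (for example publish -> draft):
        // the document has to be removed from the indices containing it.
        return 'remove';
    }

    // Neither status is indexable (for example draft -> pending): nothing to do.
    return 'none';
}
```

For example, example_action_for_status_change( 'publish', 'draft' ) yields 'remove', while the opposite transition yields 'index'.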

From what I read in your last comment, you also have problems with the opposite operation: turning drafts into published posts. This is somehow strange:

The only thing that comes to my mind right now is a problem in the first step: how the post is published. As far as I understand, your content is gathered with an API and then some or all of the posts are transitioned from draft to published. Could you please let me know how you execute that status transition?

Thanks in advance.

danielmuehlbacher commented 3 months ago

Hi there,

thank you! I publish the posts this way, via a CLI script:

wp_update_post( array(
    'ID'          => $publication->ID,
    'post_type'   => 'publication',
    'post_status' => 'publish',
) );

I noticed that when a post is published this way, the

wpml_elasticpress sync

command does not make the post appear on the EN page. Only after running the manual sync in the WP backend does the post appear on the EN page. Could that somehow be possible?

Best regards, Daniel

decodekult commented 3 months ago

Hey @danielmuehlbacher thanks for the information. We are getting close.

Let me check a couple of things and get back here - the wp_update_post function should be calling the ElasticPress mechanism to sync the post, which does also fire the WPML ElasticPress mechanism to include the post in all required indices.

As far as I know, ElasticPress collects all posts that need updating and then processes that queue at a later moment. We hook into that later moment and push the post to all the other indices that require it. Maybe we are missing the execution of that later moment somehow...

Just some more questions (sorry for this!):

Thanks in advance.

danielmuehlbacher commented 3 months ago

Hi,

the post is "translatable".

Before calling the wp_update_post function we are setting the current language via

do_action( 'wpml_switch_language', $lang );

The CLI script which publishes the posts is called separately for DE and EN. At the time the script runs, the two language versions of a post may not yet be connected in WPML. This is done at a later time by another script which uses this function:

$set_language_args = array(
    'element_id'           => $postEN->ID,
    'element_type'         => $wpml_element_type,
    'trid'                 => $original_post_language_info->trid,
    'language_code'        => 'en',
    'source_language_code' => $original_post_language_info->language_code,
);

do_action( 'wpml_set_element_language_details', $set_language_args );

As a result of the script, the post has the correct status in the WP backend. But somehow the post_status change does not sync to the EN ElasticPress index when called via wpml_elasticpress sync.

decodekult commented 2 months ago

Hi @danielmuehlbacher

I will get back to this as soon as I can. I need to focus on a couple of urgent tasks somewhere else, but I will not forget about this issue.

Thanks for your patience.

danielmuehlbacher commented 2 months ago

Hi @decodekult,

great, thank you!

Best regards, Daniel