WordPress / wp-movies-demo

Demo of the Interactivity API
https://wpmovies.dev
GNU General Public License v2.0

Populate WordPress DB with TMDB data on a daily basis #11

Closed SantosGuillamot closed 1 year ago

SantosGuillamot commented 1 year ago

I'll work tomorrow on creating a better description, sorry 🙏

The goal of this PR is to create a cron event that runs on a daily basis and populates the WordPress DB with data from TMDB.

Keep in mind that, to be able to fetch the data from the TMDB API, users will need an API Key.

Moreover, I am taking the opportunity to add more custom fields with new information about the movies and actors.

The way it works: the code is triggered for the first time when the plugin is activated, and it starts adding movies, images, and actors in the background (users can keep navigating through the Admin dashboard while the content is imported).

Right now, the code fetches a bunch of movies and follows this workflow:

For each movie

1. Insert or update the movie into WordPress DB with a bunch of fields.

In order to check if we have to create a new post (wp_insert_post) or update an existing one (wp_update_post), we check if a post with the guid already exists. The guid format is $site_url?tmdb_movie_id=TMDB_movie_id. Something like https://moviesdemo.com?tmdb_movie_id=315162.
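The create-or-update check described above could be sketched roughly as follows. This is a minimal sketch under assumptions, not the PR's actual code: the helper names `wpmovies_demo_movie_guid` and `wpmovies_demo_upsert_movie`, the `movies` post type name, and the `id`, `title`, and `overview` keys of the `$movie` array are all hypothetical, and looking the post up with a direct `$wpdb` query on the `guid` column is only one possible way to do it.

```php
<?php
/**
 * Build the guid used to recognise a movie that was already imported.
 * Hypothetical helper; the format follows the one described above.
 */
function wpmovies_demo_movie_guid( $site_url, $tmdb_movie_id ) {
    return $site_url . '?tmdb_movie_id=' . (int) $tmdb_movie_id;
}

/**
 * Insert the movie, or update it when a post with the same guid exists.
 * Must run inside WordPress (uses $wpdb, wp_insert_post, wp_update_post).
 */
function wpmovies_demo_upsert_movie( array $movie ) {
    global $wpdb;

    $guid = wpmovies_demo_movie_guid( get_site_url(), $movie['id'] );

    // Look for an existing post that already carries this guid.
    $post_id = $wpdb->get_var(
        $wpdb->prepare( "SELECT ID FROM {$wpdb->posts} WHERE guid = %s", $guid )
    );

    $post_data = array(
        'post_type'    => 'movies', // Assumed post type name.
        'post_status'  => 'publish',
        'post_title'   => $movie['title'],
        'post_content' => $movie['overview'],
        'guid'         => $guid,
    );

    if ( $post_id ) {
        // A post with this guid exists: update it in place.
        $post_data['ID'] = (int) $post_id;
        return wp_update_post( $post_data );
    }

    // No match: create a new post.
    return wp_insert_post( $post_data );
}
```

Keeping the guid builder separate from the database code means the same format can be reused when checking whether a recommended or similar movie already exists.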

These are the fields we are adding:

And a bunch of custom fields linked to the movie, that could be used in a future phase:

2. Upload Featured Image to the current movie

3. Upload backdrop Image to the current movie

It downloads, uploads, and attaches to the post the backdrop image, which is an image we could include in the background. We create a custom field named _wpmovies_backdrop_img_id for this.
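As an illustration of this step, the sketch below downloads an image, attaches it to the post, and stores the attachment ID in the custom field. The helper name `wpmovies_demo_add_backdrop` and the `$backdrop_url` parameter are hypothetical, and using `media_sideload_image()` is an assumption about how the download could be done, not necessarily what the PR does.

```php
<?php
/**
 * Download a backdrop image, attach it to the movie post and remember its ID.
 * Hypothetical helper; $backdrop_url is assumed to be a full image URL.
 */
function wpmovies_demo_add_backdrop( $post_id, $backdrop_url ) {
    // media_sideload_image() lives in admin-only files, so load them
    // explicitly when the code runs from cron instead of the dashboard.
    if ( ! function_exists( 'media_sideload_image' ) ) {
        require_once ABSPATH . 'wp-admin/includes/media.php';
        require_once ABSPATH . 'wp-admin/includes/file.php';
        require_once ABSPATH . 'wp-admin/includes/image.php';
    }

    // The 'id' return type makes the function return the attachment ID.
    $attachment_id = media_sideload_image( $backdrop_url, $post_id, null, 'id' );

    if ( is_wp_error( $attachment_id ) ) {
        return $attachment_id; // Let the caller decide how to report it.
    }

    // Store the attachment ID in the custom field named in this step.
    update_post_meta( $post_id, '_wpmovies_backdrop_img_id', $attachment_id );

    return $attachment_id;
}
```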

4. Upload more images related to the movie

It downloads, uploads, and attaches more images that can be used for the post. We create a custom field named _wpmovies_images that stores an array with the images ids.

5. Add some videos related to the movie

It creates a new custom field named _wpmovies_videos with a bunch of urls pointing to videos related to the movie.

6. Add recommended movies

It creates a new custom field named _wpmovies_recommended with an array of recommended movies ids. We could check the guid and, if it exists, show them in the movie post somehow.

7. Add similar movies

It creates a new custom field named _wpmovies_similar with an array of similar movies ids. We could check the guid and, if it exists, show them in the movie post somehow.

8. Add movie genres as categories

It checks whether the genres exist as categories, creates the ones that don't exist, and attaches all those categories to the movie.
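A possible shape for this step, as a sketch only: the helper name `wpmovies_demo_set_genres` and the `$genre_names` parameter (a plain list of genre names) are hypothetical, and the PR may organize this differently.

```php
<?php
/**
 * Make sure every genre exists as a category and attach them to the movie.
 * Hypothetical helper; $genre_names is assumed to be a list of genre names.
 */
function wpmovies_demo_set_genres( $post_id, array $genre_names ) {
    $category_ids = array();

    foreach ( $genre_names as $name ) {
        // With a taxonomy given, term_exists() returns an array that
        // contains 'term_id', or null when the term is not found.
        $term = term_exists( $name, 'category' );

        if ( ! $term ) {
            $term = wp_insert_term( $name, 'category' );
        }

        if ( is_wp_error( $term ) ) {
            continue; // Skip genres that could not be created.
        }

        $category_ids[] = (int) $term['term_id'];
    }

    // Replace the post's categories with the collected ones.
    wp_set_post_categories( $post_id, $category_ids );

    return $category_ids;
}
```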

9. Add the movie to the movies_tax term

As movies act as taxonomies for the actors, it needs to be created if it doesn't exist.

10. Add the actors of the movie

Each actor needs to follow a similar process to the movies because they are posts and taxonomies as well. It is explained below.

For each actor

1. Insert or update the actor into WordPress DB with a bunch of fields.

To know whether we should create or update the actor post, we check whether the guid already exists. The guid format for actors is $site_url?tmdb_actor_id=TMDB_actor_id. Something like https://moviesdemo.com?tmdb_actor_id=3131.

These are the fields we are adding:

And a bunch of custom fields linked to the actor:

2. Upload Featured Image to the current actor

The same way we are doing with the movies.

3. Add movie taxonomy to the actor

We have to add the current movie as a taxonomy (the one we created previously) for the actor.
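Linking an actor to the movie term could look roughly like the sketch below. The helper name `wpmovies_demo_link_actor_to_movie` and the choice to use the movie title as the term name are assumptions; `movies_tax` is the taxonomy named above, and it has to be registered before this code runs.

```php
<?php
/**
 * Ensure the movie exists as a term in movies_tax and assign it to the actor.
 * Hypothetical helper; using the movie title as the term name is an assumption.
 */
function wpmovies_demo_link_actor_to_movie( $actor_post_id, $movie_title ) {
    $taxonomy = 'movies_tax';

    // Reuse the term created for the movie, or create it if it is missing.
    $term = term_exists( $movie_title, $taxonomy );
    if ( ! $term ) {
        $term = wp_insert_term( $movie_title, $taxonomy );
    }

    if ( is_wp_error( $term ) ) {
        return $term;
    }

    // Pass the ID as an integer (strings are read as slugs or names) and
    // append, so movies already linked to the actor are kept.
    return wp_set_object_terms( $actor_post_id, (int) $term['term_id'], $taxonomy, true );
}
```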

4. Create actor taxonomy and link it to the current movie

michalczaplinski commented 1 year ago

This is great, Mario!

Could we actually run this script on WP initialization as well? Instead of constructing the XML and having the user load it manually in the database like we're doing right now.

SantosGuillamot commented 1 year ago

I've just updated the opening post with more information.

Could we actually run this script on WP initialization as well? Instead of constructing the XML and having the user load it manually in the database like we're doing right now.

Right now, it starts adding content whenever you activate the plugin, and it adds an event to do it once per day. However, keep in mind that if the TMDB API key is not valid, it won't work.

michalczaplinski commented 1 year ago

Right now, it starts adding content whenever you activate the plugin, and it adds an event to do it once per day. However, keep in mind that if the TMDB API key is not valid, it won't work.

@SantosGuillamot

Makes sense. It's still valuable to repopulate the DB on a daily basis for the live demo!

Is it ready to be reviewed now? 🙂

SantosGuillamot commented 1 year ago

Is it ready to be reviewed now? 🙂

I wanted to polish the code and maybe reuse some of the functions, but I guess it should be fine for a review, yes 🙂

Moreover, I wanted to add to the template some of the new data added to the DB, but I can do that in another PR.

michalczaplinski commented 1 year ago

I've been trying to run this branch but I ran into a couple of issues:

  1. If I'm not mistaken, the movie data is not updated when you activate the plugin - it currently just schedules the cron job, right? I had to add the function call after the cron job definition - it seems to work then 🙏.

    Screenshot 2023-02-17 at 15 10 41
  2. I've made this change but then ran into this bug: It seems that the custom taxonomies like movies_tax are created in the init action. But the movies are being added upon plugin activation. When that happens, the taxonomies are not yet registered. So, I've made this change to the code: ea6f543

  3. Now, I've also realized that this approach is probably not the best as we'll be calling this expensive wpmovies_add_movies function for every plugin activation.

SantosGuillamot commented 1 year ago

If I'm not mistaken, the movie data is not updated when you activate the plugin - it currently just schedules the cron job, right? I had to add the function call after the cron job definition - it seems to work then 🙏.

Mmm that's weird. I tested it using Local, and everything works fine. But you are right that testing it with wp-env doesn't seem to work.

In the wp_schedule_event function, we are passing the current time as the first argument, which is meant to be the UNIX timestamp of the first time this task should execute. This is also mentioned in the CRON documentation.

So, with this in mind and looking at different examples, I believe it should work without having to call the function. Maybe it is an issue with wp-env? As I mentioned, it seems to work fine with Local.

It seems that the custom taxonomies like movies_tax are created in the init action. But the movies are being added upon plugin activation.

From what I see, the wpmovies_register_taxes function was never called in the code, so they weren't registering at all. I guess we can call the function on plugin activation as you did or using an action like custom post types are doing.

Now, I've also realized that this approach is probably not the best as we'll be calling this expensive wpmovies_add_movies function for every plugin activation.

If I am not mistaken, using register_activation_hook( __FILE__, 'movies_demo_plugin_activation' ); as we are doing, the function is only called when the WP Movies Demo plugin is activated. Other plugins shouldn't trigger it. Moreover, we are checking if a cron event for the movies exists before calling it again, so this should also prevent it from running more times than it should. I mean this code: if ( ! wp_next_scheduled( 'cron_wpmovies_add_movies' ) )
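For reference, the activation-time scheduling discussed here follows the usual WordPress pattern, roughly like the sketch below. The hook and function names are the ones quoted in this thread; the `'daily'` recurrence is assumed from the goal of the PR, and the `function_exists` guard is an addition so that the file can also be loaded outside WordPress.

```php
<?php
/**
 * Activation callback: schedule the daily import only once.
 */
function movies_demo_plugin_activation() {
    // Skip scheduling when the event is already queued.
    if ( ! wp_next_scheduled( 'cron_wpmovies_add_movies' ) ) {
        // First argument: UNIX timestamp of the first run (now).
        // Second argument: built-in 'daily' recurrence (assumed here).
        wp_schedule_event( time(), 'daily', 'cron_wpmovies_add_movies' );
    }
}

// Guarded so the file can also be loaded outside WordPress, e.g. in a test.
if ( function_exists( 'add_action' ) ) {
    // Only fires when this plugin itself is activated.
    register_activation_hook( __FILE__, 'movies_demo_plugin_activation' );

    // Run the importer whenever the cron event fires.
    add_action( 'cron_wpmovies_add_movies', 'wpmovies_add_movies' );
}
```

Because the first run is scheduled for the current time, the import only starts once WP-Cron is actually triggered by a page load, which is why an environment where cron events don't fire shows no data after activation.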


Not related to this, but while testing it with wp-env I realized that we could remove these plugins right? We only need Gutenberg if I am not mistaken.

SantosGuillamot commented 1 year ago

So, with this in mind and looking at different examples, I believe it should work without having to call the function. Maybe it is an issue with wp-env?

Regarding this, I've been running a few more tests and it seems cron events aren't running while using wp-env indeed. It isn't something related to our plugin I believe. These are the cron events of a fresh site using Local and using wp-env (without our plugin installed).

Using Local

Screenshot 2023-02-20 at 14 42 15

Using wp-env

Screenshot 2023-02-20 at 14 44 14

The events with "now" should run whenever any page is loaded. But, as we can see, it doesn't happen.

SantosGuillamot commented 1 year ago

It definitely seems to be an issue with @wordpress/env, as discussed in this issue. I've added this workaround to make it work while using @wordpress/env, and I reverted the changes to add that function. Anyway, I wouldn't consider it 100% reliable while working with @wordpress/env, but I am not sure if that's too important as long as it works in the production site. Apart from that:

SantosGuillamot commented 1 year ago

Apart from that, I added some code to show the most popular movies first in the query loops.

michalczaplinski commented 1 year ago

checking if a cron event for the movies exists before calling it again, so this should also prevent it from running more times than it should. I mean this code: if ( ! wp_next_scheduled( 'cron_wpmovies_add_movies' ) )

Yes, that's right! 👍 I was only noting that this wouldn't work when just calling the function directly (like I did).


Not related to this, but while testing it with wp-env I realized that we could remove these plugins right? We only need Gutenberg if I am not mistaken.

True, we don't really need the custom-post-ui and the create-block-theme plugins, although they are nice to have for development - I've used create-block-theme a lot to iterate on the designs and save them back in the repo. We'll still need the wordpress-importer plugin in order to be able to import the XML files though.


From what I see, the wpmovies_register_taxes function was never called in the code, so they weren't registering at all. I guess we can call the function on plugin activation as you did or using an action like custom post types are doing.

It was being called in https://github.com/c4rl0sbr4v0/wp-movies-demo/blob/eaa514147cad0376c4d0c7d9381ba81d01baebd2/lib/custom-taxonomies.php#L75

but no worries - since you've fixed the cron jobs it works fine now.


In the wp-env.json file, I moved the "." after the Gutenberg plugin, as it is a dependency. This way, our plugin is activated by default.

Nice! I didn't know about this. Let's also remove "activate the plugin" as a required step from the README in that case 🙂.


All in all, I think we can merge this. Great job, Mario 🙂 I only have 2 comments:

  1. Let's add back the wordpress-importer plugin in the wp-env.json because it's needed for loading the XML files (already mentioned this above).
  2. One of us should temporarily (and only locally) modify the cron script to load data for a certain number of movies (not just the most popular ones). Then export this data to XML files replacing the contents of wp_sampledata_actors.xml and wp_sampledata_movies.xml. We can then also just remove the contents of lib/content-creator. But we can do all that in another PR.

michalczaplinski commented 1 year ago

I've just updated the README to remove the line where we tell people to manually activate the plugin.

SantosGuillamot commented 1 year ago

I updated the XML files with the data included in this PR, but they have to work a bit differently. I had to create three files instead of two:

WordPress importer is limited to 2MB (I think), so I only included 40 movies. Anyway, these files should only be used while working locally, so I assume that shouldn't be a problem.

The default exports by WordPress caused some issues with the urls, so I created a small script that takes care of getting the XML files ready. This way, if in the future we want to update the content, we just have to export the files, run the script, and they should be ready to work locally.

michalczaplinski commented 1 year ago

Great! Thanks Mario!

Although I notice that something's not quite right because I get some movie duplicates and also most of the movies now have like 300 actors associated with them 😅:

https://user-images.githubusercontent.com/5417266/221326949-71150f3f-5000-4879-96e1-aba516590d7e.mp4


I'm gonna investigate.

michalczaplinski commented 1 year ago

I didn't manage to debug the core issue but I can confirm that importing the data from the new XML files works fine 👍.

As the cron job is only really important for the live version of the demo, and I would not expect users to want it to run on their local machines, I think we can merge this and fix the problem with the cron job in another PR. The cron will not run locally anyway unless the user gets the TMDB API key and puts it in their local .env file.

I've also updated the README because I've found a way to import the movie data using the WP CLI.

SantosGuillamot commented 1 year ago

Although I notice that something's not quite right because I get some movie duplicates and also most of the movies now have like 300 actors associated with them 😅

I've tested it using Local by Flywheel and it seems to be working fine:

Screenshot 2023-02-27 at 10 17 44

Maybe it is related to wp-env not supporting cron events. Maybe the workaround we are using is causing issues. Anyway, as you say, users running it locally will use the XML files. So, as long as it works in the production environment, it should be fine.