pods-framework / pods

The Pods Framework is a Content Development Framework for WordPress - It lets you create and extend content types that can be used for any project. Add fields of various types we've built in, or add your own with custom inputs, you have total control.
https://pods.io/
GNU General Public License v2.0

Moving from Custom Post Types for Pods, Fields, and Groups to an Options Query API, avoiding the common pitfalls on large sites with huge post/meta tables #1880

Closed sc0ttkclark closed 2 years ago

sc0ttkclark commented 10 years ago

We currently store Pod, Field, and Group (Pods 3.0) configurations as Post Types with meta.

This has its benefits:

Cons:

Now, the cons list is kinda small, but both lists could be expanded with many more items. The big gruesome con is the first one in the list, though, and it carries more weight than all of the benefits combined. This is a big problem: it doesn't affect most sites, but it affects large sites and databases in a way that rules out using Pods for them.

We should explore the possibility of migrating our object data into the options table. It's never too large, and it's like using transients at the same time, since the data would be serialized. I'd opt to use JSON here though, but that's a separate decision if we go forward with this.

Pods 3.0 is in development, so if we're going to make this change, we should really consider doing it now rather than later.

Here's what we need to figure out:

sc0ttkclark commented 10 years ago

To clarify, we would store each Pod as an option row, each Field as an option row, and each Group as an option row, mirroring what we currently do with them in post data. The difference is that instead of storing meta in separate rows in a meta table, it would be stored as JSON or serialized data in the option_value.
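A minimal sketch of that layout, written in Python against an in-memory SQLite table rather than WordPress itself. The `options` table is only a stand-in for the real options table, and the `_pods:<kind>:<pod>:<name>` key convention and the `save_object`, `load_object`, and `load_fields` helpers are hypothetical names chosen for illustration, not anything Pods defines.

```python
import json
import sqlite3

# Stand-in for the options table (assumption: only these two columns matter here).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE options (option_name TEXT PRIMARY KEY, option_value TEXT)")


def save_object(kind, name, settings, pod=None):
    """Store one Pod, Field, or Group as a single row with a JSON value."""
    # Hypothetical key convention: _pods:<kind>:<name> or _pods:<kind>:<pod>:<name>
    parts = ["_pods", kind, name] if pod is None else ["_pods", kind, pod, name]
    key = ":".join(parts)
    db.execute(
        "INSERT OR REPLACE INTO options VALUES (?, ?)", (key, json.dumps(settings))
    )
    return key


def load_object(key):
    """Return the decoded settings for one key, or None if the row is missing."""
    row = db.execute(
        "SELECT option_value FROM options WHERE option_name = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def load_fields(pod):
    """Return {field_name: settings} for every field row that belongs to a pod."""
    prefix = f"_pods:field:{pod}:"
    rows = db.execute(
        "SELECT option_name, option_value FROM options "
        "WHERE substr(option_name, 1, ?) = ? ORDER BY option_name",
        (len(prefix), prefix),
    )
    return {name[len(prefix):]: json.loads(value) for name, value in rows}


save_object("pod", "book", {"label": "Book"})
save_object("field", "title", {"type": "text"}, pod="book")
save_object("field", "author", {"type": "pick"}, pod="book")
```

Each object is one row, so reading a Pod's own settings never touches its fields, and the fields of a pod can be fetched by key prefix without a meta join.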

sc0ttkclark commented 10 years ago

Another benefit of using options is that our pod data API could be adjusted to support network-wide Pods with ease, simply by changing the table/fields used, since there are network-wide options tables and site-specific options tables. This would be a pretty killer benefit!

mgibbs189 commented 10 years ago

Postmeta

Serialized options

DavidCramer commented 10 years ago

For me, I have always preferred the options system over custom post types. You can see this in DB-Toolkit and, more recently, caldera engine.

benfavre commented 10 years ago

Nice to see this brought up. The options table just makes sense in terms of the type of data that gets stored. Using a naming convention for the keys is probably the way to go, removing the need to query for the unique id (think that's how you'd do it). A naming convention would also make it pretty straightforward to add and modify values manually, or via other external tools.

mpeshev commented 10 years ago

I think it's worth mentioning (roughly) how much data (in terms of entries) is stored on average for a given number of Pods, Fields, Groups etc. This could also be broken down by site size, from small sites up to very large ones with tons of data. This would facilitate a better discussion overall.

I am personally against serialized data for storing multiple values in most cases - it is more CPU-intensive, and it's quite hard to analyze and update, mass replace values etc.

Network options are helpful for network-wide operations, but how useful are they in practice for multiple sites? It's also good to know that this would affect every single site in a multisite too, whenever it is browsing through any distributed options.

I would consider a more optimal way to store data in the posts/meta tables. That's what the tables are meant to do, they provide an API to fetch/read/query data, and this would be valid at least for the next several versions. The options table had some performance issues prior to 3.7 (when the core team decided to clean transients on update), so there are gotchas in that table as well. Implementing this in custom tables would increase the number of tables per site and require an additional API for CRUD operations.

LoreleiAurora commented 10 years ago

+1 for using the options table. I will make a big donation if you do this and enable network-wide content types.

mikevanwinkle commented 10 years ago

So I see a general conflict here between making sure Pods scales and trying to stay within WordPress core functionality. From a scaling perspective, a separate table with field ids that can be indexed is definitely the way to go IMAO ... but that takes you back in the direction you came from, adding back the wp_pod_fields table, which you may not want to do.

But think about it this way ... pods gives users the ability to extend content types by using custom tables. But isn't a "Pod" just an extended content type worthy of its own table when it needs to scale?

Just in my experience at WP Engine, large options are just as much of a problem as slow queries. I could see a serialized array getting very large on a pod with dozens of fields, and then you gotta keep all that in memory during the whole post edit page.

Plus serialized data is insecure in that it is very easy to corrupt.

I'd stick with the postmeta approach and try to find ways to improve the queries, for example by moving more info into the posts table so it can be retrieved quickly via post_parent = # queries.

sc0ttkclark commented 10 years ago

@mikevanwinkle just to avoid confusion, the idea isn't to store the whole Pod and its settings as one serialized array, but just the Pod's own settings, like labels, etc. Fields would be their own separate individual options, and so would Field Groups.

unknownnf commented 10 years ago

I'd like to first say that the wp_podsrel table is just as problematic as this issue. If you need to query wp_podsrel, even for two small related pods, the query ends up sifting through a huge table (because all relationships are stored in the same table). I opened an issue a year ago proposing that each pod have its own wp_podsrel table; this is much more memory efficient (you don't have to store the source pod id, and indexes will not be as large) and easier to scale. Since we are on the topic of 3.x, there is no better time than now to make that change. There is no real benefit in storing all relationships for all pods in the same table; it just uses more space and memory than individual tables would.

Now for the moving of field data to options, I think that's where it should be stored, but this is probably a per site or better said a per data structure issue so as an alternative option we could abstract the whole way the data of a pod is accessed and provide both options to be used and / or even the option to extend into other methods of storing pod/group/field data.

For the memory concerns, just keep in mind that the amount of data still will not change, not even the traffic between the site / database will, and neither will there be more memory needed to store the data while the interpreter is running, any difference will be negligible.

As for the security of the data, things can go wrong in both cases. Data in MySQL is quite easy to corrupt, and if something is corrupted we may not even know about it (if a field name changes in the database, for example, the pod will not find it). What I would suggest for 3.0 is a way to load the data, check that everything is OK, and then carry on. This could be done with some custom objects: hydrate them with our data and see if everything is OK. If not, there could be ways to fix the corruption automatically, or we can alert users that are running debug, or we can just revert to default values on the fields.
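A rough sketch of that load, check, and carry-on idea, in Python for brevity. The `FieldConfig` class, its three attributes, and the `hydrate` function are hypothetical and only illustrate the shape of the check; real field configs carry many more settings. Anything that fails validation falls back to its default, and the problems are returned so a debug mode could report them.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class FieldConfig:
    # Hypothetical minimal field config; defaults double as the fallback values.
    name: str = ""
    type: str = "text"
    label: str = ""


def hydrate(raw):
    """Return (config, problems), using defaults for anything missing or invalid."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return FieldConfig(), ["value is not valid JSON"]
    if not isinstance(data, dict):
        return FieldConfig(), ["value is not a JSON object"]

    config = FieldConfig()
    problems = []
    for f in fields(FieldConfig):
        if f.name not in data:
            problems.append(f"missing '{f.name}', using default")
        elif not isinstance(data[f.name], str):
            problems.append(f"'{f.name}' has the wrong type, using default")
        else:
            setattr(config, f.name, data[f.name])
    return config, problems
```

An empty `problems` list means the stored value was usable as-is; a non-empty one is the hook for an automatic repair or a debug notice.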

pglewis commented 10 years ago

"it's quite hard to analyze and update, mass replace values etc."

This is a very good point and I'm not sure off-hand how big a deal the mass replace issue could turn out to be (from my currently quite lofty high-level viewpoint). But keep in mind we're talking about internal pod/field meta here, so the inconvenience should be limited to the plugin core and not the general public.

sc0ttkclark commented 10 years ago

I would love to use the options table for our purposes here, I'm just not sure we can do this super efficiently, since option_name is a dang varchar(64)! Other areas of WP like usages of meta_key are much easier to work with at varchar(255). Either way, storing the full pod name and field name together in the option_name won't always be possible, so we'll have to figure out some smart way around it to make this work.
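One possible way around the length limit, sketched in Python: keep the readable pod-plus-field key when it fits, and otherwise truncate it and append a short hash so the result stays unique and within 64 characters. The `_pods_field_` prefix and the `option_key` helper are assumptions made for illustration, and the sketch assumes ASCII names so that characters and bytes are the same length.

```python
import hashlib

MAX_OPTION_NAME = 64  # the varchar(64) limit on option_name mentioned above


def option_key(pod, field, prefix="_pods_field_"):
    """Build an option name for a field that never exceeds MAX_OPTION_NAME."""
    readable = f"{prefix}{pod}_{field}"
    if len(readable) <= MAX_OPTION_NAME:
        return readable
    # Too long: keep a readable head and add a digest of the full name so that
    # two long names sharing the same head still map to different keys.
    digest = hashlib.sha1(readable.encode("utf-8")).hexdigest()[:16]
    keep = MAX_OPTION_NAME - len(digest) - 1
    return f"{readable[:keep]}_{digest}"
```

The trade-off is that hashed keys can no longer be parsed back into a pod and field name, so the names would also need to live inside the stored value.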

pglewis commented 10 years ago

@sc0ttkclark is this out of scope for 3.0?

sc0ttkclark commented 10 years ago

I'd really like to tackle it this release if possible. Needs some additional contributor eyes though, to see what's appropriate for a solution -- store in options?

benfavre commented 6 years ago

<3

sc0ttkclark commented 2 years ago

Moving forward with this in other ways now that Pods 2.8 is out and includes the big refactor on how Pods fetches configs. More work will need to be done there, but I'm not sold on options being the best place for this either.