nawahasan / wordpress-custom-content-type-manager

Automatically exported from code.google.com/p/wordpress-custom-content-type-manager
0 stars 0 forks source link

Storing relational data... right approach? #250

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Hi there, so the way you're currently storing relational data is through the 
use of square brackets... ie ["8","7"]

is this the right approach?

Would it be better to store the data in multiple rows as wordpress does 
nativly? That way we can query data... 

With the current approach how would we go about querying for IDs 8 or 7 without 
writing a custom query?

Thanks and really keep up the great work, pluign is getting better each version!

Original issue reported on code.google.com by b...@webcultured.com on 6 Dec 2011 at 4:55

GoogleCodeExporter commented 8 years ago
Storing data structures as serialized strings or JSON-encoded strings (as I'm 
doing) is fairly common.  Note that this happens ONLY when you are storing data 
from a "repeatable" field (e.g. a multi-select storing multiple values).  In 
version 0.9.5 (almost ready for public release), this same approach is used to 
store repeatable image, text, or relation fields. It's used for all fields that 
store an array of data (relational data or not).

I thought about the approach a long time, and maybe what I did isn't the best 
way, but I kept running into problems with WP's architecture here, so I 
consciously chose to avoid it.  The reason that I specifically avoided WP's 
approach is that it makes it nearly impossible to do certain database queries 
-- my approach makes some queries more difficult (as you outlined), but it 
becomes much easier to select an *entire post* including all of its custom 
fields in one fell swoop.  It's really hard to write MySQL queries that join 
tables on an indeterminate number of rows.  So if you have an array of custom 
field values stored as multiple rows in the wp_postmeta table, you checkmate 
yourself into requiring a separate database query for EACH custom field 
involved, whereas if you have a single row with JSON data, then you can grab 
all your custom field data via a single query that grabs the entire post data 
and all custom field data via a single query.  The approach became solidified 
as I developed the Summarize Posts plugin, which offers a more thorough 
database querying API than I could find in native WP functions.

The JSON format is actually queriable (whereas PHP serialized data is not).  
E.g.

SELECT * FROM wp_postmeta WHERE post_id=123 AND meta_key='my_custom_field' AND 
meta_value LIKE '%"7"%';

Note the quotes are included in the LIKE query.  Since the wp_postmeta table is 
not indexed, MySQL will be forced to do a table scan to find the rows (i.e. 
it's slower query) -- so in other words, there's no disadvantage to doing a 
"LIKE" query because the table needs to get scanned either way.  I'm building 
more functionality into the Summarize Posts plugin: 
http://code.google.com/p/wordpress-summarize-posts/wiki/get_posts  so you 
should be able to handle most queries you can think of.

So the other thing that made me go for the JSON approach was that WP offers no 
way of sorting the values.  When you use the get_post_meta() function to 
retrieve an array of values from the wp_postmeta table, you don't have any 
control over the order that they come back out of the array, whereas with 
approach, you can simply shuffle the input fields as they are stored, save the 
form, and back in your templates, the order of the values updates (see this in 
action in version 0.9.5). 

So yeah... I don't know.  I think I made the decision that was the lesser of 2 
evils.  Maybe it would have been better to stick to WP's approach and just suck 
up the fact that I'd need to make multiple database queries, but I've had to do 
that for some other projects (unrelated to WP), and the searching, pagination, 
etc. just got to be completely untenable if you couldn't grab all your data in 
a single query.   I'm not the only guy who's used this solution: MODX uses this 
for its custom fields, and I've spotted the same approach in parts of Drupal 
and Expression Engine, so *somebody* thinks it's viable. haha

If you're really concerned about architecture, you probably wouldn't even be 
using WordPress at all.  One huge limitation (to me) is that WP offers no 
built-in way of standardizing the custom fields, so those relationships are all 
stored in the CCTM configuration, and that makes it impossible to know which 
custom fields are available for any post or page, and this just makes the 
frustrating scenario of requiring one additional database query for each custom 
field that much worse.

Original comment by fireproofsocks on 6 Dec 2011 at 5:19

GoogleCodeExporter commented 8 years ago
Thanks for the in depth explanation, it makes sense... I'm just thinking a head 
a bit where WordPress community is pushing more into advanced queries that can 
be build through the standard API, meta_query etc those methods will become 
more powerful overtime so it's nice to conform to the way they see the standard 
going. 

I agree you lose the sorting which is a bummer, I'm not sure how if the sorting 
could be done just by insert ID... but if you look this long term, where people 
use a bigger query such as 

$args = array(
    'post_type' => 'product',
    'meta_query' => array(
        'relation' => 'OR',
        array(
            'key' => 'color',
            'value' => 'blue',
            'compare' => 'NOT LIKE'
        ),
        array(
            'key' => 'price',
            'value' => array( 20, 100 ),
            'type' => 'numeric',
            'compare' => 'BETWEEN'
        )
    )
  );

to non mysql people and even more advanced users it's just easier to use the 
API, where as if you have to write a new query every time you want to retrive 
the above data along with some additional like filtering for your plugin that's 
where it becomes more tedious to write those queries... 

Original comment by b...@webcultured.com on 6 Dec 2011 at 5:32

GoogleCodeExporter commented 8 years ago
BTW is there a function in the plugin that will pull related posts based on 
post id and meta key? That would be great, so you eliminate the need to build a 
query... rather we just say

get_related_posts($post_id,'key_name')

and we get back a standard wp_query object with those related posts....

Original comment by b...@webcultured.com on 6 Dec 2011 at 5:37

GoogleCodeExporter commented 8 years ago
I understand what you're saying: it's usually far better to follow the norms of 
any application, but in the case of WP, I feel it's fair to deviate from that.  
There are so many poorly implemented features in there, and the architecture in 
my opinion is on its last legs.  It's just a mess under the hood and if there 
isn't complete rewrite of WP in its entirety in the next few years, I would 
guess that its market share will slowly drain to systems that are easier to 
develop and maintain (just my predictions... time will tell).

Your sample query is a fair point: you'd have to do some of your filtering in 
PHP if you were filtering on the JSON encoded fields.

No, I don't have a function to get all related posts directly, but you could 
pass values to an instance of GetPostsQuery.  E.g. imagine a "repeatable" field 
that contains an array of values:

$post_ids = get_custom_field('my_relations:to_array');

$Q = new GetPostQuery();
$args['include'] = $post_ids;

$results = $Q->get_posts($args);

foreach ($results as $r) {
   // $r will contain the entire post including all custom fields in an associative array.
} 

Original comment by fireproofsocks on 6 Dec 2011 at 6:09

GoogleCodeExporter commented 8 years ago
Actually, I should write a "get_post" output filter that retrieves the entire 
post object so you could do something like this (not yet implemented):

$posts = get_custom_field('my_relations:to_array', 'get_post');

Or maybe 
$posts = get_custom_field('my_relations:to_array:get_post');

Still thinking through the api here, but that's a good feature request.

Original comment by fireproofsocks on 6 Dec 2011 at 6:14

GoogleCodeExporter commented 8 years ago
Sounds good thanks, and yes lets hope they clean things up :)

Original comment by b...@webcultured.com on 6 Dec 2011 at 6:14

GoogleCodeExporter commented 8 years ago
Perfect, yeah having something like that in place would be helpful.

Original comment by b...@webcultured.com on 6 Dec 2011 at 6:16

GoogleCodeExporter commented 8 years ago
Feature request: issue 251

Yeah, this is good conversation to have -- honestly, I wish I had had more 
input for other devs as I worked on this.  That's one thing I miss from working 
on a dev team.  

Thinking more broadly, I think it's ok to abstract the database a bit.  Some 
MVC purists argue that the developers shouldn't care how the data stored in the 
model, so all database access should come ONLY through the API.  E.g. MODX is a 
"database agnostic" platform -- that's pretty amazing that for most dev 
scenarios, you never write a single SQL query *ever*.  I don't see any 
advantage to taking WP to that extreme, but I think the Summarize Posts plugin 
and the GetPostsQuery class will allow more flexible searching over time and 
hopefully nobody will need to see the exact internals of how those custom field 
arrays are stored.  Anyway, it's an interesting conversation, and I suppose 
there's never a perfect solution.

Original comment by fireproofsocks on 6 Dec 2011 at 6:27

GoogleCodeExporter commented 8 years ago
And while I'm thinking about this (sorry I'm so OCD this morning), WP does this 
sort of thing with its image meta data: WP serializes a bunch of stuff (height, 
width, etc.) in a hidden custom field named _wp_attachment_metadata.  It's 
serialized data, so you can't search it at all, but that's much more convenient 
to have all your meta data in one spot rather than in umpteen different rows.

I guess it all comes down to searching: if you need to do complex searching, 
then you really need a database structure that supports it.  I think that's the 
approach that PodCMS takes: http://podscms.org/  -- it avoids this problem by 
adding a custom table for each post-type, but that in turn introduces other 
searching difficulties (see Drupal here): it's hard to search across multiple 
post-types.

Original comment by fireproofsocks on 6 Dec 2011 at 6:32

GoogleCodeExporter commented 8 years ago
Lol no worries, yeah it's all about sacrifices and compromises... I know that 
some of that image data is stored that way... but you typically wouldn't search 
images that often in that way... however with relational data it's usually 
there to be displayed or used in some way during rendering... :)

I'm sure we could go back and forth on this for ages but in the end not a huge 
deal just have to deal with it the way you intended :)

Original comment by b...@webcultured.com on 6 Dec 2011 at 6:40

GoogleCodeExporter commented 8 years ago
I've pasted this into a wiki page: 
http://code.google.com/p/wordpress-custom-content-type-manager/wiki/CustomFields
DataStructure

It needs some cleanup, but it's worth documenting in the wiki for future 
reference.

Original comment by ever...@fireproofsocks.com on 20 Dec 2011 at 5:42

GoogleCodeExporter commented 8 years ago

Original comment by ever...@fireproofsocks.com on 20 Dec 2011 at 5:42