wp-e-commerce / WP-e-Commerce

WP eCommerce - The most popular independent eCommerce platform for WordPress
https://wpecommerce.org
GNU General Public License v2.0
215 stars 216 forks source link

WPSC meta caching still needs implementing #400

Closed benhuson closed 11 years ago

benhuson commented 11 years ago

Just noticed this note in the code. https://github.com/wp-e-commerce/WP-e-Commerce/blob/master/wpsc-includes/meta.functions.php#L15

JustinSainton commented 11 years ago

Related, #100, #153.

JeffPyeBrook commented 11 years ago

@benhuson Ben would you mind if I jumped in to help on this? I have something that is covers the points in #100, #153. and is close to test.

JeffPyeBrook commented 11 years ago

@benhuson @JustinSainton I just quickly typed up an interface for the meta class object with caching that I am finishing up. If you are amenable I would appreciate a review to be sure it meets your expectations and the points documented in the related posts #100, #153.

The class was based on the purchase logs class,with the necessary adjustments to support the more arbitrary nature of what might someday be stored as meta by WPEC or plugins.

Please note the comments below the class definition related to the default configuration parameters. I firmly believe that default parameters should work well in 99.99% percent of apps. Every time I have said that in the past something comes up and they need to be changed. So they can be overridden. As far as the cache goes, I am testing using MEMCACHED and MEMCACHED+APC Process Persistent, Not Shared. That should cover much of the install base.

interface WPSC_Meta_Interface
{
    /**
     * Constructor for the meta object. If no object_type+$object_id is passed, this simply
     * create a new empty object. Otherwise, this will get the wpsc meta from the
     * DB / cache by using object type and object id
     *
     * Eg:
     *
     * // get wpsc cart meta collection with ID number 23
     * $log = new WPSC_Meta( 'wpsc_cart', 23 );
     *
     * @access public
     * @since 3.9
     *
     * @param string    Optional object_type object type of meta
     * @param int       Optional object_id object type's id
     */
    public function WPSC_Meta( $object_type = false, $object_id = false );

    /**
     * Deletes wpsc meta from the database and cache - must minimally specify
     * $object_type+$object_id or $meta_id.   When specifying $object_type+$object_id
     * without meta_key, all meta for $object_type+$object_id will be deleted.
     *
     * @access public
     * @static
     * @since 3.9
     *
     * @param string    Optional object_type object type of meta
     * @param int       Optional object_id object type's id
     * @param meta_key  Optional meta key of specific meta value
     * @param meta_id   Optional meta_id of specific meta item
     * @return void
     */
    public static function delete(  $object_type = false, $object_id = false, $meta_key = false, $meta_id = false );

    /**
     * Returns the value associated with specified meta key
     *
     * @access public
     * @since 3.9
     *
     * @param string $meta_key Name of the meta value
     * @return mixed, returns empty string when item does not exist
     */
    public function get( $meta_key );

    /**
     * Sets a meta value. This function accepts a meta key and a value
     * as arguments, or an associative array containing key value pairs.
     *
     * @access public
     * @since 3.9
     *
     * @param mixed $meta_key Name of the property (column), or an array containing key
     *                   value pairs
     * @param string|int $value Optional. Defaults to null. In case $key is a string,
     *                          this should be specified.
     * @return WPSC_MEta The current object (for method chaining)
     */
    public function set( $meta_key, $value = null );

    /**
     * Whether the value for this meta exists - if empty string is passed returns
     * whether any meta values exist
     *
     * @access public
     * @since 3.9
     *
     * @param string $meta_key Name of the meta value
     * @return bool True if it exists. Otherwise false.
     */
    public function exists($meta_key='');

}

/*
 * One shouldn't need to adjust these cache parameters under normal circumstances.  If an 
 * environment or WPEC application needs to change the cache behavior they can be set 
 * in a configuration file.  The parameters will change the behavior and performance of 
 * the WPSC_Meta cache to fit the capabilities of your environment.
 */

if ( !defined( 'WPSC_META_CACHE_TIMEOUT') )
    define ( 'WPSC_META_CACHE_TIMEOUT' , 3600 );

if ( !defined( 'WPSC_META_CACHE_MAX_SIZE') )
    define ( 'WPSC_META_CACHE_MAX_SIZE' , 16384 );

if ( !defined( 'WPSC_META_CACHE_PREFETCH_SIZE') )
    define ( 'WPSC_META_CACHE_PREFETCH_SIZE' , 512 );

/*
 * Because WordPress cache can come in many flavors and capacities two parameters are provided
 * that influence how the WPSC_Meta class utilizes the wordpress cache.  Adjusting the
 * parameters allow the developer to impact data consistency and performance.
 *
 * ==== CACHE PERSISTENCE ====
 * Is your cache shared across all processes that can service requests? If your cache is 
 * not shared, or only shared among some processes, as would be the case for some cluster 
 * installations? Set theWPSC_META_CACHE_TIMEOUT very low or even to 0 to avoid data 
 * consistency issues.
 *
 * When tuning these parameters consider your application's use of WPSC_Meta and your cache 
 * implementation. If your WPEC application doesn't do anything unusual with WPSC_Meta values, 
 * the values set are very long lived, and you are not using the meta features to store a state 
 * that changes frequently, long cache timeout should work well with a long time out value.
 *
 * Cache implementations can be long-lived persistent, persistent only as long as 
 * a process (like fast-cgi) is alive, or not persistent after the request is processed.  
 * Based on your platform configuration you may want to adjust the cache timeout and maximum 
 * size so avoid the overhead of caching meta that will never be accessed from the cache.
 *
 * ==== META CACHE PREFETCH ====
 * On a first request for a cache_object/cache_id pair, the meta cache will pre-fetch the 
 * requested and related cache values that are below the size defined by 
 * WPSC_META_CACHE_PREFETCH_SIZE. Meta values that are bigger than WPSC_META_CACHE_PREFETCH_SIZE 
 * bytes will be fetched only when requested. Depending on how your cache is 
 * accessed (mapped memory, tcp-ip/, file, etc) and the cache capacity, you may want to 
 * raise or lower the WPSC_META_CACHE_PREFETCH_SIZE parameter. 
 *
 * Adjusting WPSC_META_CACHE_PREFETCH_SIZE could also improve performance when there
 * are many WPSC_Meta cache items.  Meta for each object_type/object_id pair is stored into 
 * and retrieved from the meta cache as a group. Most standard meta values are short strings 
 * or numbers.  When the size of all of the meta values for an object_type /object_id pair 
 * fit into one request some cache implementations (MEMCACHED is an example) may be 
 * significantly faster.
 *  
 * ==== CACHING LARGE VALUES ====
 * It is possible for themes or plugins to store very long values as meta.  The meta values 
 * could be anything, files, pictures, html.serialized arrays, whatever.  Caching and retrieving 
 * very large values can be expensive, and may be more than retrieving the information from 
 * a database. Even if the storage or retrieval is not expensive, storing long values into 
 * a cache may significantly impact cache performance. 
 * 
 * The WPSC_META_CACHE_MAX_SIZE defines a threshold over with a value will not be cached.
 *
 */

'''
JeffPyeBrook commented 11 years ago

One additional question, Would people like a 'search' method implemented for this class? It might make the meta a useful and efficient and persistent place to store information like visitor cart information?

search ( $object_type, $meta_key, $meta_value ) would return a meta object.

Thoughts?

JustinSainton commented 11 years ago

I'm not entirely sure I'd want to see this implemented as an interface, rather than an abstract class. I understand the differences, but I'm not sure what the benefit would be here. Unless the purpose is just to quickly illustrate the proof of concept while you're working on the class - that's totally cool.

I don't particularly envision a great need for a search() method. I'd imagine that we'd follow the WP_Cache model in core fairly closely, where you'd have a *_cache_get() method to get the data.

Thanks for all the work so far Jeff, I'm really looking forward to seeing what you come up with!

garyc40 commented 11 years ago

Just want to add that WordPress meta API allows extension to use custom meta table, which I believe is the best way to go, rather than implementing our own standards and database tables. This means if we choose to go with WordPress meta API, we need to gradually deprecate our meta.functions.php.

Here's how the core WP meta API works.

The magic happens in function get_metadata().

The more familiar get_post_meta() function is just a wrapper for this core meta function, in which it passes post as the $meta_type.

The metadata will be fetched on line 285, with its results automatically cached so that subsequent calls won't query the database. So let's look at the update_meta_cache() function.

This function determines the column containing the object ID by using the $meta_type. Remember that for posts, $meta_type is post so the resulting column name would be post_id. Let's say if we want to use this core API for carts, and pass in the $meta_type as cart, we'll need to make sure our custom meta table has a column called cart_id.

The update_meta_cache() function also determines the database table to query by calling the _get_meta_table() function. That function only returns the table name as $wpdb->({$meta_type}meta). So for our cart meta example, we should assign $wpdb->cartmeta to whatever name we want to use for the cart meta table.

So what does all this mean? This means, we can write wrappers for our custom meta API:

function wpsc_get_cart_meta( $cart_id, $key = '', $single = false ) {
    return get_metadata( 'cart', $cart_id, $key, $single );
}

function wpsc_update_cart_meta( $cart_id, $key, $value, $prev_value ) {
    return update_metadata( 'cart', $cart_id, $key, $value, $prev_value );
}

Also need to do this upon as soon as WPEC is loaded:

global $wpdb;
$wpdb->cartmeta = 'cart_meta';

The next step is to write functions to create a table called "cart_meta" that mimics the structure in the "postmeta" table, except that the post_id column should be named cart_id instead.

You can see that by using the core meta API we only needed to write 2 wrapper functions, assign a property to $wpdb, and create the meta table. Everything from caching, fetching, updating is already taken care of by WordPress. This requires minimal effort when compared to writing our own meta class.

As a result I strongly believe we should deprecate our existing meta functions and gradually migrate to using the WordPress standard way.

JustinSainton commented 11 years ago

That's a good call - I think I thought through that option awhile back but got stuck on the idea of shoehorning our current meta table in for it, and if I recall correctly, I don't think that'd be feasible? I didn't go too far down that route though. Would your suggestion be that we add both cart and category meta tables to replace the current meta table?

The only difference (and what I got hung up on in the past with this route) is that we have the object_type field on top of the other 4 standard fields - not to mention the fact that we'd now have to worry about back compat between a deprecated table and a new table. Not something impossible to do (obviously, since we've done it) - but that's a slow process, not without it's bumps.

garyc40 commented 11 years ago

We just need to write a migration routine and be done with it:

Of course any database migration code needs to be written with great care and tested thoroughly. If we get through this hurdle the grass will be much greener on that side of the fence. It can all be done in one release cycle.

This won't break compat completely with plugins that are querying wpsc_meta directly. The reason being the wpsc_meta table will still be there, just that it won't be updated any more.

Because of the breaking-compat nature of this change, we're better off scheduling this in the same release (or later) as the theme engine in 3.9.

garyc40 commented 11 years ago

As an aside, this API would be a better foundation for our customer meta data than transients.

JustinSainton commented 11 years ago

I believe the only two active object types are category and cart (though there is some internal conflation between cart and cart_item - not sure whether or not there would be benefit to both)

garyc40 commented 11 years ago

I think I saw variation terms and downloadable products having meta stored there as well.

JustinSainton commented 11 years ago

Ah, right - seems to me that we should have a single table for terms (category, variations, whatever). Honestly - it's a big pill to swallow to add any additional custom tables at this point. It looks like there are a handful of object_types that only exist within the updating-functions.php file.

There are others used for sort_order meta on categories and variations, which is odd, and hopefully easy enough to mitigate by using term_order.

meloniq commented 11 years ago

...Also need to do this upon as soon as WPEC is loaded:

global $wpdb;
$wpdb->cartmeta = 'cart_meta';

Related: https://github.com/wp-e-commerce/WP-e-Commerce/issues/197#issuecomment-13329838

Separate ticket?

garyc40 commented 11 years ago

Yes a separate ticket would be great @meloniq :)

benhuson commented 11 years ago

+1 for using WordPress get_metadata() as Gary suggests.

Obviously this gives that advanced of the built-in WordPress object caching methods which in turn can be hooked into to use other caching implementations via plugins etc.

I guess that it is configurable enough that you could prevent certain type of meta from being cached if needed (particularly if using a caching plugin to extend cache expiry) - thinking possibly cart data in certain circumstances etc. I don;t really know much about this though.

garyc40 commented 11 years ago

@JustinSainton : Yes this is not easy. But I'm sure it's worth doing eventually rather than committing to our custom method and then having to write our own code to handle caching and sanitizing.

The meta tables are technically not custom tables because it's still a WordPress core data structure. It still follows WP core standards. Not to mention in some cases using custom tables is an absolute necessary evil.

Having separate meta tables might even help with performance as it acts as a way to "horizontally partition" the metadata. At least it would separate the numerous cart item meta entries from the handful of category meta entries.

garyc40 commented 11 years ago

@benhuson We can certainly expire meta values by adding a column to that particular meta table to store dates and then run cron that clears outdated entries.

Or we can get super-meta by having a meta table storing meta information of cart meta values themselves :)

JeffPyeBrook commented 11 years ago

@JustinSainton said:

I'm not entirely sure I'd want to see this implemented as an interface, rather than an abstract class... Unless the purpose is just to quickly illustrate the proof of concept while you're working on the class - that's totally cool.

Illustrating the api was exactly the intent, no need for the interface in the actual implementation

JeffPyeBrook commented 11 years ago

@benhuson said

We can certainly expire meta values by adding a column If we wanted to implement meta data that expires I would do it at the application level and let it apply to a collection of one or more meta items.

I'm thinking about wpsc_customer_meta as an example. Save the wpsc_customer_meta properties as meta instead of as a transient. This would be fast, efficient, cached when possible, and most importantly persistent. Application can add a meta property wpsc_customer_meta_timestamp and set it at the same time the application updates the meta data.

In the cron that runs to delete old customer meta you would be a function that looked something like:

$wpsc_expired_metas  = WPSC_Meta::search(  'wpsc_customer_meta','wpsc_customer_meta_timestamp < '. time()-3600 );

foreach ( wpsc_expired_metas  as wpsc_expired_meta  ) {
     wpsc_expired_meta->delete();
} 
JeffPyeBrook commented 11 years ago

Woke up this morning pumped to give a status report on being very close to done with the meta class and testing, then I read @garyc40 's really interesting suggestion about using the wordpress metadata infrastructure. Head is spinning a little.

The class with caching is working well, caching a works well, all appearances is that it is going to be very fast in implementation. I have the class in a very basic plugin with an admin screen that can be used to explore existing meta and add/change/delete items. I can make it available if anyone wants to play.

While testing I decided that the methods that access the meta need to be made a little more consistent. All set/get/delete operations should be able to be used in any of the three methods: 1: (object id, object type, meta key ) 2: (object id, object type) and an array of meta keys 3: a unique meta id

I am updating the class to do this.

After reviewing the data in the wpsc meta table I would also suggest that we make the index on object id, object type and meta key unique. It does improve performance. I did not identify any places in the current code that would store two different values with the same key.

I would also suggest that we index the meta value column on the first 20 characters of data. This would improve performance for queries against meta values. On our test site we have 2333 meta entries, the cardinality of this meta value index is 588. That's pretty good for unstructured data

JeffPyeBrook commented 11 years ago

@garyc40 's wordpress meta data suggestion is really interesting, and the advantages are well outlined above, so I won't reiterate them here.

The meta class in it's current state may also have some advantages that should be considered by the group before we change our approach, I have noted some that I think might be worth considering below.

1) The WPSC_Meta class gives the option of handling an objects meta attributes as a collection without creating wrappers at the application level. WordPress caches the options as a collection, but gives them back to the application one at a time.

2) The caching in the WPSC_Meta class is sensitive to the size of the meta value. The class caches all attribute information, but doesn't try to cache all values. This makes the use of very large meta values practical. I can envision use cases where digital assets like files, documents and pictures might be stored using meta values.

3) As noted above. we only need one table for all of our meta. This means small meta data sets won't need to have individual tables. We won't have to think about import/export routines for each meta type table.

4) Application developers can create their own meta object types without having to worry about underlying database structure.

5) Some less frequently used WPEC, plugin and theme options and state data could be easily stored as meta rather than the wordpress options table. This can have a really positive impact on site performance. Data stored in the options table is often marked as pre-load and can make makes wordpress startup slow.

JeffPyeBrook commented 11 years ago

@JustinSainton wrote:

I believe the only two active object types are category and cart (though there is some internal conflation between cart and cart_item - not sure whether or not there would be benefit to both)

In the meta table I have both cart and cart item, and also have wpsc_purchase_log. But I am not sure if the wpsc_purchase_log is there because I am a version behind.

If we were to go the distinct table route these are the items I think we would need to consider initially having meta tables for:

wpsc_categories wpsc_variations wpsc_cart wpsc_cart_item wpsc_customer and maybe wpsc_purchase_log ???

If we go this route, for consistency it might be worth considering wpsc_product, but I would have to believe that would be a backwards compatibility disaster.

JeffPyeBrook commented 11 years ago

I just thought of another possible consideration related to @garyc40 's suggestion about using the wordpress meta infrastructure. This is something I came across while testing the class and is touched on in my code comments.

As is noted, Wordpress takes care of caching the meta data. The presumption appears to be meta data is long lived. A problem occurs If we want to allow wpsc object meta data that changes more frequently than never.

Consider a common wordpress installation on a shared host that is using APC as a cache. Many hosts in shared and nonshared environments do not permit the cache to be shared across processes. On a small server there might be 4 or 5 PHP processes serving requests, each PHP process will have it's own cache mapped within it's memory space.

This gives very real performance improvements, but also means that it is possible for some data to be in one process that is not in another. This isn't a problem if you add new meta, because when a different process looks for the new meta it loads it into memory on-demand. Wordpress post meta is typically long-lived and static, so it's never a problem.

This is exactly why some folks (including me) had a problem on the previous point release that introduced wpsc_customer_meta being stored in transients. Depending on which PHP process served the request the user would see different data.

I don't think we have a choice with wordpress meta caching, it's either on or off. And I don't think we have easy influence over the timeout of cached meta values. I have tried to address the problem in the class by allowing meta caching behavior to be configured independently of whether or not a wordpress cache is present.

If we want to store non-static data as wpsc object meta we will have to do some additional work around the wordpress meta caching to make it work across all environments.

With wordpress especially, we always need to remember that cache doesn't imply shared or persistent.

1bigidea commented 11 years ago

+1 for purchase_log (meta table)

I am wrapping up a simple margins calculator and need to capture the "cost" for purchased items (at time of purchase).

On May 14, 2013, at 5:02 AM, SparkleGear notifications@github.com wrote:

@JustinSainton wrote:

I believe the only two active object types are category and cart (though there is some internal conflation between cart and cart_item - not sure whether or not there would be benefit to both)

In the meta table I have both cart and cart item, and also have wpsc_purchase_log. But I am not sure if the wpsc_purchase_log is there because I am a version behind.

If we were to go the distinct table route these are the items I think we would need to consider initially having meta tables for:

wpsc_categories wpsc_variations wpsc_cart wpsc_cart_item wpsc_customer and maybe wpsc_purchase_log ???

If we go this route, for consistency it might be worth considering wpsc_product, but I would have to believe that would be a backwards compatibility disaster.

— Reply to this email directly or view it on GitHub.

JeffPyeBrook commented 11 years ago

For my internal testing I packaged up the meta class and test harness in a standalone plugin.

The plugin has unit tests and a browser that lets you view and change the wpsc meta data using the api. It also has a page that let's you browser MEMCACHED to see what is changing as items are set and deleted.

If anyone wants to check it out for testing or for examples of how to use the class I will make it available at this public link.

https://dl.dropboxusercontent.com/u/7214376/WPSC_Meta_Test.zip

JustinSainton commented 11 years ago

Thanks Jeff!

I really am on the fence about it all - you've obviously put in lots of time and testing to this, and I'm super hesitant (like, SUPER hesitant) to add 4-6 new custom tables.

That said, I naturally appreciate the ability to inherit all of the internal Meta (and by proxy, object caching) API from WordPress. And either way we go - when WordPress adds a core term meta table, it might render our current (or potential 4-6 new) tables obsolete - and with a rather clean break, to boot. There may be some core taxonomy API enhancements in 3.7 - but I won't ever get my hopes up for a core term meta table until I see the commit :) So one of these options definitely needs to happen.

Frankly, I actually see the "easier" path being adding the extra tables, writing the core data migration routines and using the core WP APIs. I'm just not convinced it's the best option. (And I do find the argument that "because the new tables would follow core data structures, they aren't technically custom tables" to be a bit silly.) That said, I'll defer to the consensus on this one. And regardless of what path is taken, I agree with the assessment that this is the proper place for customer meta.

JeffPyeBrook commented 11 years ago

I'll get the pull request together so that regardless of the approach we take whatever work that has been done is memorialized somewhere useful.

There a few reasons why I think the WPSC_Meta class abstraction should be the preferred method, at least for the near future. First is how easy it is for a developer to create new object type inside their WPEC app or plugin. All without having to understand the core wordpress meta functions. I can think of so many places where this would be useful. Second is that the WPSC_Meta abstraction appears to be more respectful of overall performance and cache freshness when caching large meta values than the wordpress .

Without putting the meta class out there I can't say if a year from now we will find thousands of people making use of it, or just a few of us.

It could be that both approaches are correct. There is a possibility that the WPSC_Meta class abstraction is a good one, but where Wordpress is going we may want to eventually change the underlying implementation to distinct tables. It feels like it would be easy enough to do with the new WPSC_Meta class as the interface. That means that if we move the meta access though the class it can only make the eventual upgrade easier.

There is also no reason why the class can't become aware of certain object_types. That would mean if we had a cart item table and a purchase log table, requests for those tables could be serviced through the wordpress meta api, all other requests could go to the the wpsc meta catchall table. We could avoid creating tables that might have very few rows, or would not be efficient for the data that they might contain.

I also have the impression that the wpsc_customer_meta object storage issue is a serious one. I know we want to make more use of the class but storing the data as transient doesn't work well on a heavily traffic site unless object caching is enabled. And storing the data in an object cache that is not persistent, and not backed by a persistent store, is causing issues like loss of cart contents. Either approach would fix the user meta issue. But we probably need to do something.

My sites and plugins will work either way, the problem I have is my sites and my option pricing plugins that I want to offer for free and purchase have become a slave to getting something working for saving cart item meta. Whichever way we decide to go will ultimately work out, it would just be great to make a design decision and follow through on it so the functionality is ready in plenty of time for the targeted release.

instinct commented 11 years ago

Random thought. Could this stuff be something a user must activate under settings, i.e.

Caching Speed up your site something chronic. Be warned this feature adds additional tables to your site but don't worry because turning it off will remove them.

Turn Cache on


Adding more tables isn't something we want to do by default because things can go wrong. And for all the other reasons people have discussed.

What do you think Gary?

Also it should pass the glue plugin test - which is a philosophy that Gary has been following when developing new payment gateway and theme architecture. Gary and Justin can both talk more about this :))

Best, Dan

Sent from my iPhone

On 15/05/2013, at 7:41 AM, SparkleGear notifications@github.com wrote:

I'll get the pull request together so that regardless of the approach we take whatever work that has been done is memorialized somewhere useful.

There a few reasons why I think the WPSC_Meta class abstraction should be the preferred method, at least for the near future. First is how easy it is for a developer to create new object type inside their WPEC app or plugin. All without having to understand the core wordpress meta functions. I can think of so many places where this would be useful. Second is that the WPSC_Meta abstraction appears to be more respectful of overall performance and cache freshness when caching large meta values than the wordpress .

Without putting the meta class out there I can't say if a year from now we will find thousands of people making use of it, or just a few of us.

It could be that both approaches are correct. There is a possibility that the WPSC_Meta class abstraction is a good one, but where Wordpress is going we may want to eventually change the underlying implementation to distinct tables. It feels like it would be easy enough to do with the new WPSC_Meta class as the interface. That means that if we move the meta access though the class it can only make the eventual upgrade easier.

There is also no reason why the class can't become aware of certain object_types. That would mean if we had a cart item table and a purchase log table, requests for those tables could be serviced through the wordpress meta api, all other requests could go to the the wpsc meta catchall table. We could avoid creating tables that might have very few rows, or would not be efficient for the data that they might contain.

I also have the impression that the wpsc_customer_meta object storage issue is a serious one. I know we want to make more use of the class but storing the data as transient doesn't work well on a heavily traffic site unless object caching is enabled. And storing the data in an object cache that is not persistent, and not backed by a persistent store, is causing issues like loss of cart contents. Either approach would fix the user meta issue. But we probably need to do something.

My sites and plugins will work either way, the problem I have is my sites and my option pricing plugins that I want to offer for free and purchase have become a slave to getting something working for saving cart item meta. Even though there are only a few lines of code that have to change, the risk of updating my sites to use any of the great new features and bug fixes everyone has worked so hard on is too great until there is an answer to the cart item meta question.

— Reply to this email directly or view it on GitHub.

JeffPyeBrook commented 11 years ago

Interesting thought, but might presume that users know more than they really do. Many of the of the one-click wordpress installs probably use server default settings. I also have heard that an upcoming release of php will have APC enabled by default. So caching capability might be really nebulous.

Also, the caching issue isn't the same issue as multiple tables. The way I am thinking about the tradeoff is:

Choice 1: Use multiple tables and wordpress api means we don't have to maintain code and performance will be a known quantity, and the api is limited to what wordpress provides Both good things.

Choice 2: Managing the data ourselves means we have to maintain the code but we get to tweak the performance and get the flexible api we want.

Pretty much the tradeoff we have to make whenever we build anything new.

benhuson commented 11 years ago

Is there any reason why using the WordPress API means using multiple tables? I notice the get_metadata() function is filterable with get_{$meta_type}_metadata Does that not mean that that it's possible to store in a single table, or however you want?

JustinSainton commented 11 years ago

A single table without the object_type parameter would result in object_id collisions. (E.g. category meta having the same ID as a purchase log, or some such thing)

garyc40 commented 11 years ago

@SparkleGear : Thanks for your contribution. I do think it's best if we explore all the solutions, so I really appreciate your effort.

1) The WPSC_Meta class gives the option of handling an objects meta attributes as a collection without creating wrappers at the application level. WordPress caches the options as a collection, but gives them back to the application one at a time.

Caching the whole collection is the point. It means only one single SQL query is required. If you cache each meta one by one upon retrieval, it will be highly inefficient.

2) The caching in the WPSC_Meta class is sensitive to the size of the meta value. The class caches all attribute information, but doesn't try to cache all values. This makes the use of very large meta values practical. I can envision use cases where digital assets like files, documents and pictures might be stored using meta values.

It's usually not a good idea to store blobs into the database. I might need you to highlight the portion of the code in your pull request so that I could understand your point about meta value size more easily.

3) As noted above. we only need one table for all of our meta. This means small meta data sets won't need to have individual tables. We won't have to think about import/export routines for each meta type table.

It's not a bad thing for small data sets to have their own tables. This is a good thing for extensibility, because we have no idea how third party plugins will use metadata. They might add a lot of metadata so we can't say anything for sure about the size of data sets. Import/export routines are actually much easier since we won't have to JOIN between tables on the column object_type.

4) Application developers can create their own meta object types without having to worry about underlying database structure.

This is a good point. But we need to consider how often this happens. We need to consider the tradeoff between additional code (and potential bugs) vs. flexibility.

5) Some less frequently used WPEC, plugin and theme options and state data could be easily stored as meta rather than the wordpress options table. This can have a really positive impact on site performance. Data stored in the options table is often marked as pre-load and can make makes wordpress startup slow.

This is true for customer meta data. However I don't think autoloaded option makes WordPress startup slow. The autoload column is indexed so retrieval of the rows would take minimal amount of time. What it really affects is system memory since all of these options are cached.

As is noted, Wordpress takes care of caching the meta data. The presumption appears to be meta data is long lived. A problem occurs If we want to allow wpsc object meta data that changes more frequently than never.

We need to write our own code that invalidate metadata cache when certain things change. The same thing apply to your proposal as with WP metadata API.

With wordpress especially, we always need to remember that cache doesn't imply shared or persistent.

This is not a problem of WordPress. It is more an issue of server configuration. APC doesn't share cache between processes, this is a feature request.

Now, the issue with customer meta is strictly the use of transients, not the use of WP cache operations. When APC is enabled, customer meta is directly stored in persistent cache rather than going through the database layer. Since each process has a different persistent cache, this is not shared. I can see how that is a problem. However, if we have a separate table for customer meta, then this is no longer an issue, since we will only cache whatever is in the database. So all processes will share the same data, and the cache only reflects whatever there is in the database.

So in short, cache sharing is not an issue with normal WP cache operations. It is the use of transient that is problematic.

Frankly, I actually see the "easier" path being adding the extra tables, writing the core data migration routines and using the core WP APIs. I'm just not convinced it's the best option. (And I do find the argument that "because the new tables would follow core data structures, they aren't technically custom tables" to be a bit silly.)

@JustinSainton, I don't understand the irrational fear of adding database tables. For a complicated application like e-commerce, it is absolutely necessary. The only issue with custom data tables is the cleaning up after the plugin is uninstalled. I don't think WPEC currently cleans up after itself anyways. Please explain to me why you think additional tables are bad.

Now about the "custom table" argument, let me expand what I mean by "not technically custom table". Let's look at the purchase log database. All the fetching, caching, deleting, updating has to be done via our own SQL queries. Operations on the meta tables, on the other hand, needs no custom SQL queries, as it follows a convention set by get_metadata(). These are basically extensions of the postmeta or usermeta tables, just with different data types. So we can't lump them all in the same bags as the likes of purchase_log or submitted_form_data. Hence I said it's not technically custom tables. They're extension of core data structure, rather than a new data structure, hence, they're not custom in my book.

First is how easy it is for a developer to create new object type inside their WPEC app or plugin. All without having to understand the core wordpress meta functions.

I concede that this is a good point. However we need to find out first how often do developers create new object types?

Is there any reason why using the WordPress API means using multiple tables? I notice the getmetadata() function is filterable with get{$meta_type}_metadata Does that not mean that that it's possible to store in a single table, or however you want?

We can use that filter to fetch data from our existing wpsc_meta table. However, this bypass the default cache handling that WordPress does for us (which happens a few lines below), so using that filter defeats the purpose unfortunately.

JeffPyeBrook commented 11 years ago

@garyc40 Thanks for taking the time for the feedback. Whichever way we go i'll help get it done.

It's usually not a good idea to store blobs into the database. I might need you to highlight the portion of the your pull request so that I could understand your point about meta value size more easily.

I handle it here in the fetch function, all of the keys are returned, but the data is returned only for objects that are less than the threshold. If a value is requested and the cache has a key but no data the data is pulled from the db

    $sql = 'SELECT m.meta_key, m.meta_id,m.object_type,m.object_id,length(m.meta_value) as meta_length, d.meta_value'
        . ' FROM '.WPSC_TABLE_META.' m '
        . ' LEFT JOIN ( '
        . ' SELECT d.meta_id, d.meta_value FROM '. WPSC_TABLE_META .' d '
        . ' WHERE d.object_type = \'' . $this->object_type . '\''
        . ' AND d.object_id = ' . $this->object_id
        . ' AND length(d.meta_value) < '. WPSC_META_CACHE_PREFETCH_SIZE . ' ) as d '
        . ' ON (d.meta_id = m.meta_id) '
        . ' WHERE m.object_type =\''. $this->object_type . '\' AND ' . 'm.object_id = ' . $this->object_id;

        $data_from_db = $wpdb->get_results( $sql, OBJECT_K );

        if ( !empty( $data_from_db ) ) {
            $this->data = $data_from_db;
            $this->update_cache();
        }

This is true for customer meta data. However I don't think autoloaded option makes WordPress startup slow. The autoload column is indexed so retrieval of the rows would take minimal amount of time. What it really affects is system memory since all of these options are cached.

My experience with this is very different. Bigger amounts of memory and cache is an easy way to cover things up, until you hit some limit. First thing I do when a site starts having performance problems is look at options table size, and how much autoload is happening. Sometimes is bad plugins, sometimes it is transients, sometimes other stuff. Most recently I had manually reduced the autoloads from 1.5MB to under 100K. Saved 1.2 seconds on page load time.

Also cache isn't the answer. It's great when you have enough iron to cache the world locally in shared memory. What about MEMCACHED implementations that can be local or remote. Do you really want to put 100KB or more on the wire on every page view? What happens when your cache sull flushes? Do you really want an app that only runs well when the cache is populated? What happens to performance after that reset, does your database start to bog down as the cache is rebuilt?

This is not a problem of WordPress. It is more an issue of server configuration. APC doesn't share cache between processes, this is a feature request. This has been a feature request for a long time, I am not betting on if it will ever happen. But there are more caches than APC to worry about.

Now, the issue with customer meta is strictly the use of transients, not the use of WP cache operations. When APC is enabled, customer meta is directly stored in persistent cache rather than going through the database layer. Since each process has a different persistent cache, this is not shared. I can see how that is a problem. However, if we have a separate table for customer meta, then this is no longer an issue, since we will only cache whatever is in the database. So all processes will share the same data, and the cache only reflects whatever there is in the database. I think we are on the same page. Caching is only a problem if the environment does not have a shared cache and the data is changed on one instance without a way to invalidate it on the other instance. Neither method works better than the other in a non-shared environment.

It's the use of transients that is the core issue. IMHO this is a more serious problem than is generally recognized, and we should make a meta data decision and use that decision to address the customer data issue.

JustinSainton commented 11 years ago

I hope the fear of additional custom tables doesn't come off as irrational. I get that they aren't inherently evil - and certainly get perhaps a worse reputation than they deserve, especially in the WordPress community. That said, I do think that they need to pass fairly strict requirements when being considered.

My hesitance is as follows:

Hopefully that helps to provide some rationale to my hesitancy. I do believe that, ultimately, whatever path we take is a step in a much better direction - so I'm all in for whatever path we collectively choose.

JeffPyeBrook commented 11 years ago

In case we decide to go the WP meta route.... Just realized that the WPSC_Meta class has almost everything done to do the migration once the target tables are created. All that is missing is a select unique object ids/object type to get the array list, feed that list into a for loop with wpsc_meta->get() followed by the wordpress set meta call. easy peasy.

JeffPyeBrook commented 11 years ago

Sorry to be my impatient self, but how do we go about coming to a resolution?

Do we go to the wp meta class now? If not now when? If not now, do we use the WPSC_Meta class in the meantime so that people will have an easier time migrating in the future?

JustinSainton commented 11 years ago

I think this should go in early to 3.9, once we begin that in earnest. I don't think there's a big rush on this until then - I'd love to get more community feedback regarding the available options between then and now.

On Tuesday, May 14, 2013, SparkleGear wrote:

Sorry to be my impatient self, but how do we go about coming to a resolution?

Do we go to the wp meta class now? If not now when? If not now, do we use the WPSC_Meta class in the meantime so that people will have an easier time migrating in the future?

— Reply to this email directly or view it on GitHubhttps://github.com/wp-e-commerce/WP-e-Commerce/issues/400#issuecomment-17913038 .

Thanks,

M. Justin Sainton Zao Web Design, LLC Principal / Project Manager http://www.zaowebdesign.com (c) 971.222.6330 (f) 1.413.723.0396

garyc40 commented 11 years ago

I don't necessarily get it when we're keeping/deprecating a currently existing table that does the same exact thing, and adding "X" amount of additional tables - all of which are identical

I also don't get it when we're adding code that's duplicating a functionality that is already existing in core, which is already well tested and guaranteed to work across platform.

I get all of the rationale - but when we have three years of history (and press) of what has been fairly unanimously considered as "progress" in reducing our custom table footprint from 30 tables down to 12 - adding another 4-6 tables seems like a historical step in the wrong direction, unless it's absolutely necessary.

Then I dare say that what we considered as "progress" was wrong, it was not progress just by counting the number of tables. We were correct by converting the products into posts, there's no doubt about that. But judging quality or progress just by reduction of table count is naive and impractical. I just can't see any correlation there.

We've been down the DB schema migration path before. 3.8 was an ~insanely~ bumpy ride, as any who were there remember.

I was there to put out the fire in subsequent 5 or 6 releases after 3.8 was out. It was a bumpy ride because it was an ambitious migration path. As you said we were reducing from 30 to 12 tables. That's bound to happen.

We also can't guarantee that the caching mechanism you want to introduce with our custom meta data scheme would be flawless. It's a trade-off and it's a natural evolution process. We shouldn't and couldn't maintain a separate fork of a functionality already implemented in WP core unless it's explicitly superior. And with our current wpsc_meta table, I can hardly say that it's any superior than having multiple meta tables. Or else why would WordPress store metadata for users and posts in separate tables?

Data migration is a one off thing. We do it, make sure it works for the majority of users out there, and continue to improve it. If it breaks, people can just revert back to an older version, as we're not wiping out the old table (unlike what we did in 3.8 migration).

Continuing down the path of an ineffective and inefficient method, however, accumulates technical debt. Arguing against it is like arguing against the move to custom post type when 3.8 was planned. It was an ugly process, but it is necessary. We had the guts to break compat with 3.7, so no matter how frustrating and bumpy the ride was, I had the utmost respect for Jeff and Valentinas, you and whoever else was making that hard decision. We're breaking compat with the new theme engine and the new merchant API. We can do it with custom meta.

I believe at least the customer meta should be implemented using a separate table and the meta API that WP core provides. Note that we only use that table for anonymous customers who are not logged in. Those that are logged in, their meta is stored instead in the usermeta table.

Any other types of meta that we're already using, we can work out at least a compromise if we can't reach a concensus, by having the WPSC_Meta API to deal with whatever we currently are using it for. But for new object types going forward (purchase logs, customer meta etc.) I still feel very strongly that those should use WP core API. At least with this compromise you won't have to worry about migration, and I won't have to worry about duplicate functionality and regressions.

garyc40 commented 11 years ago

@SparkleGear : I understand the concern about the customer meta issue. I'll work on a pull request later this week that creates a separate table for customer meta and use WP core API. Then we can compare notes :) If it works out well, it should be scheduled in one of the 3.8.xx releases as it does not break compatibility.

benhuson commented 11 years ago

Would a solution be to use the core WordPress API for meta as @garyc40 suggests. To me this seems like a sensible route to avoid duplicating WordPress functionality and bloating the core WPEC code, using an inherent WordPress method that is hopefully tried and tested?

Then if there is a need to store/cache some sorts of meta differently this could be created by hooking into the get_{$meta_type}_metadata type functions and do something different with it, rather than using the WordPress caching methods. Such functionality could be added to core if required, or add-on plugin if more suitable.

That way we'd be using WordPress API calls for meta throughout which should hopefully be easy for developers to get to grips with, and just implement custom functionality behind this if required.

I agree with Gary on tables. Yes there were 30 tables and moving to post types for products was the right thing to do and eliminated lots of requirements there. However tables are not bad and I agree that adding tables for specific types of meta or logging (such as purchase logs) is fine and makes sense for anything that is not web site content and is more for recording/storing data. Take a look at something like Gravity Forms - yes it doesn't use post types for forms but it does have it's own meta table for forms, a separate table for leads/enquiries (ie our purchase logs), a separate table for lead details (ie purchase log details/meta) and one for logging views.

JeffPyeBrook commented 11 years ago

@garyc40 wrote

I understand the concern about the customer meta issue. I'll work on a pull request later this week that creates a separate table for customer meta and use WP core API. Then we can compare notes :) If it works out well, it should be scheduled in one of the 3.8.xx releases as it does not break compatibility.

Sounds great. I'll wait to see how you handle the table creation and use it a template for creating the meta infrastructure for the other types. The cart_item_meta type can also be added 3.8.xx because it doesn't break compatibility.

Before we start one question about the API we ultimately want to have access custom meta. Do we want the WPEC meta API to mirror the four wordpress API calls so that are most familiar to people? If so the existing api calls should be wrappers around the (get|set|update|delete)_{type}_meta calls we create by wrapping the function called in meta.php.

From what I found these are the types we will need to support, are there others that I missed? _cart_item _customer _category _purchase_log _variation

I do have one suggestion to consider when setting up the custom table structure. If you create an index on the meta_key field combined with the first 20 or so characters of the meta value field. It can help performance. There are times when the meta value is tested as part of a query. Creating an index on the entire field may not be a great idea because there can be arbitrarily large values stored. I have tested this on my meta heavy site and it does help reduce locks and table scans, most notably when using WP_Query with the meta query option. I came to use 20 characters as the character length by looking at the data in my meta table, the overwhelming majority of the values that were not serialized were less than 20 characters in length

JeffPyeBrook commented 11 years ago

@garyc40 @JustinSainton I have some time today so I thought I would jump on this so @garyc40 can do more any of the million other things on his plate

I will create the custom meta access functions and the database creation routines, put them inside a test plugin and get it out to the community so people can evaluation the implementation, and perhaps even use the functionality until it finds its way into an official WPEC release.

Here are the initial design decisions I am going with, comments are welcome:

1) Meta functions will mirror the common wordpress post meta functions. Seems to make sense because these are the functions which people will be most familiar.

2) I am creating a little code generation function that can be manually triggered to create the php meta functions with comments. If we add more meta object types you can use the code generation function to create the meta functions for the object type.

3) The types initially created will be 'cart_item', 'customer', 'category', 'purchase_log', 'variation'

4) The value column of the meta table will be indexed on the first 32 characters. Should make searches by meta value a little quicker, and not cause table scans, avoid some row locking, etc. Meta values are generally short strings or numbers so I anticipate a high cardinality.

5) I am going to try to add an additional column to meta table. It will be an an auto updated timestamp. There will be one additional function to retrieve the timestamp. The purpose of this is to support using the custom object meta for object types that may expire, specifically the wpsc_customer_meta. If my testing, or some other issue raised by the community, shows this to be impractical this will be abandoned.

6) After 1-5 are done, I'll put together a conversion routine. I'll need some guidance on the accepted way to have the conversion routine run because I haven't done that with a WPEC release before.

Any thoughts -Jeff

JustinSainton commented 11 years ago

Hey Jeff,

Awesome, I think that sounds like a great start. For a head start on some of this (1 and 3, primarily) - there is a very decent taxonomy meta plugin that would be a great base (and provide a great primer on making this functional for multisite).

JeffPyeBrook commented 11 years ago

Just in case there was any remaining doubt, @garyc40 knows his stuff.

All meta types have been implemented, with database creation routines, and the upgrade routine. All the items in the prior post are done. Not counting the code that is automatically spit out by my code template for each meta type, there is about half the code to maintain. Performance is as good as or better than the dedicated meta table for typical (not blob) meta items.

With the code template it makes it easy for a developer to add new custom meta object. That was a feature of the custom coded meta handler that I thought would be the most useful going forward. It works even better in this version.

I'm working on the test plugin so everyone can play. So far the test plugin seems like a real waste of time because the wordpress meta functions just works. I guess that was gary's main point ;)

I'll get a pull request together for the custom meta functions. After everyone has a look at the functionality, we should think about which core functions we should migrate to use the new functionality and when it should happen. My take is that it's a lot simpler (less risky and very much less disruptive) than what we were all thinking previously.

JustinSainton commented 11 years ago

Sounds good @JeffSparkleGear! Looking forward to it.

JeffPyeBrook commented 11 years ago

If anyone wants to play while I finish testing you can grab the zip from the link below. It has the files from the pull request and a plugin test harness that will let you browse through the meta. Unzip the file in your plugins directory and activate it. After activation if you take a look at your database you will find all of the new meta tables with the data imported from the wpsc_meta table. The wpsc meta table remains intact. Plugin UI is very spartan, i'll clean it up as part of the testing if there is interest in keeping it as a test bed.

https://dl.dropboxusercontent.com/u/7214376/wpsc_custom_meta_tables.zip

benhuson commented 11 years ago

@JustinSainton Regarding that taxonomy meta plugin, am I right in thinking it stores term meta in a 'taxonomymeta' table based on a term id?

That would mean if you had a term that was used by 2 different taxonomies then it would store the same meta data for the term in both taxonomies? eg a blog category called shirts and a product category called shirts.

Surely it would be better to either store 'taxonomy' meta based on 'term_taxonomy_id' or store each taxonomy in a separate table against term_id?

Just curious as it seems like an oversight.

JustinSainton commented 11 years ago

Not sure - looking through the code there, it seems that though technically it uses parameter names like "term_id", it would be more correct to use term_taxonomy_id, as you mention. And indeed, the meta API in the plugin could be utilized in that way without any code changes.

But that's neither here nor there, as that particular implementation issue with them isn't particularly relevant to why I found it helpful for this conversation.

JeffPyeBrook commented 11 years ago

Sorry if I'm being dense here, what if any change are we suggesting to the custom object type meta code?

I do understand the term id vs term taxonomy id implications. But the wordpress meta functions are implemented such that they expect the object name (cart,variation,...) to have an identifier that append '_id' to the object type.

If what we are discussing is that we want to be sure that the term_taxonomy_id is used I am pretty sure we can do it with one of the filters that already is applied by wordpress the when meta is stored.

benhuson commented 11 years ago

@JeffSparkleGear Sorry to confuse. I don't think my comment is relevant to your implementation, it was just regarding the example Justin posted.