XjSv / Diablo-3-API-PHP

Simple Diablo 3 Web API PHP Wrapper
GNU General Public License v2.0
14 stars 9 forks source link

Add Optional Cache Functionality #5

Closed XjSv closed 11 years ago

XjSv commented 12 years ago

Add option to cache results and trigger to refresh.

XjSv commented 11 years ago

Will not be implementing this.

XjSv commented 11 years ago

Opening this again because it was brought to my attention from Tom Green: https://bitbucket.org/BeingTomGreen, that it would make sense to have this functionality "Since we are limited to 10k or 50k " -- Tom Green. Sample code:

$file_time = 0;
if($this->use_cache) {
    // Check if we have a cached file
    //
    $url_md5 = md5($url);
    if(file_exists($_SERVER['DOCUMENT_ROOT'].$this->cache_loc.$url_md5)) {
        $file_time = file_get_contents($_SERVER['DOCUMENT_ROOT'].$this->cache_loc.$url_md5); // Override file_time
    } else {
        // Create the file
        //
        if(is_dir($_SERVER['DOCUMENT_ROOT'].$this->cache_loc) && is_writable($_SERVER['DOCUMENT_ROOT'].$this->cache_loc)) {
            $handle = fopen($_SERVER['DOCUMENT_ROOT'].$this->cache_loc.$url_md5, 'w');
            $time   = time();
            fwrite($handle, $time);
        } else {
            error_log("Cache Directory must be writtable");
            return false;
        }
    }
}
BeingTomGreen commented 11 years ago

I hope you don't mind my input, but your implementation seemed to be the furthest along so I based mine on this.

I looked at doing a file based cache but then decided to use Memcached instead, obviously it may be slightly overkill, but in high-traffic scenarios it should be faster.

Once I have done it I will be happy to add it to yours (perhaps have an option $cacheType so people can choose?) via a pull request if you want.

XjSv commented 11 years ago

Great idea I could use Memcached also. The thing with my implementation is that if we use IFMODSINCE we would need to store the URL requested time. Maybe the name cache for this is not 100% appropriate. Here I am really using the cache directory as a database for URL's and access times.

BeingTomGreen commented 11 years ago

Nothing to stop you adding both the access times/URLs and the json responses to Memcached? Or am I missing something?

I would store the last access times (although again probably in Memcached rather than on disk) then check the last mod before returning/refreshing the Memcached data.

Since I am still waiting on my API Key I will probably push ahead with Memcached tomorrow.

I don't know what the traffic/load is like for d3stats.tk (nice by the way) but if it is reasonable I would think Memcached would be much faster than Disk I/O.

XjSv commented 11 years ago

Yes I suppose if the URL has not been modified since the given time then you would return the cached data instead correct?

Also now that I think about it, I would like to keep the class as clean and simple as possible with few if none dependencies. I think I will go with disk cache.

Preformed a very quick search on Google and found this: http://surniaulula.com/2012/12/04/memcached-vs-disk-cache/

However, I will like to take a little time to think about it more.

BeingTomGreen commented 11 years ago

Yes exactly that.

I agree with keeping it clean.

I would agree local disk I/O is simple, but my concern is given the traffic levels I see my disks are going to get hammered. And Apache (which I am still using sadly) has issues with lots of files in a directory :(

I shall knock it up tomorrow and run some tests :)

XjSv commented 11 years ago

Hmm really? The only reason I can think of for Apache being slow when too many files is because of the AllowOverride All if you have it set to All.

One: Speed—the .htaccess page may slow down your server somewhat; for most servers this will probably be an imperceptible change. This is because of the location of the page: the .htaccess file affects the pages in its directory and all of the directories under it. Each time a page loads, the server scans its directory, and any above it until it reaches the highest directory or an .htaccess file. This process will occur as long as the AllowOverride allows the use of .htaccess files, whether or not the file the .htaccess files actually exists.

Source: https://www.digitalocean.com/community/articles/how-to-use-the-htaccess-file

Also you are correct. If you are using this class in a big project with high traffic, you are probably going to be looking into scaling in which case Memcached would all ready to go.

BeingTomGreen commented 11 years ago

Yeah AllowOverride can be a nightmare.

The issue I am talking about involves around 750k images all in one directory (legacy code I inherited). Apache really seems to struggle serving images from that directory.

Files are going to be fine for low traffic sites but you will (in my experience) start to see issues if you start having hundreds of concurrent users all trying to read/write. I get that in this case its pretty unlikely, but still.

I would rather over-engineer something now than have to refactor it later, especially as there isn't any real disadvantage to using Memcached from the off and it doesn't involved any extra work.

XjSv commented 11 years ago

Wow 750k images in 1 directory is just insane. I will be following your code changes on Bitbucket to see how you implement memcached.

BeingTomGreen commented 11 years ago

It is a little ridiculous, I am currently trying to work out the best way to fix it.

I have thought a lot about what you said, I think keeping caching out of the main class would be best. Even file cache adds a lot of extra code which should probably be handled outside of the class.

But given that I am using this code on a live site I will still want to use Memcache(d) so I will probably just add an example once I get it working.

BeingTomGreen commented 11 years ago

I have added Memcache to my implmenation.

I won't say I am 100% happy with it:

-The Memcache functions (rather than the Memcached) don't seem to have a reliable way of testing if you are actually connected to any server of the servers when using Pools. The 'recommended' way around this seems to be just to suppress errors, but that can cause huge load times if there is a poor connection. -There is currently no way to force the refreshing of cached data.

Sadly I don't have a deployed server running Memcached, but I assume changing $this->memcache = new Memcache; to $this->memcache = new Memcached; would do the trick. Although there would be few advantages since I don't make use of any Memcached specific calls.

XjSv commented 11 years ago

After some thinking I have decided that I will not implement Memcached. It seems to messy and it adds a dependency I don't want even though its optional. And as you mentioned I think results should be cached outside of the main class. Although I will continue to implement the IFMODSINCE approach.

BeingTomGreen commented 11 years ago

It's less code than your file method, and to my eyes, tidier too, but I do get your extra dependency point.

I still may drop it and add it outside the class as I do think it would make it tidier.

BeingTomGreen commented 11 years ago

I agreed.

Here is the separate Memcache example, again it should be pretty easy to swap out Memcache for Memcahced.

I have't really put much effort into checking we have a Memcached connection or that we have the required extensions since it is only a basic example.