joelcox / codeigniter-redis

A CodeIgniter library to interact with Redis
MIT License
240 stars 142 forks source link

Problem with spaces in strings #3

Closed ghost closed 12 years ago

ghost commented 12 years ago

I'm not able to use a function-call like $this->redis->hset('bucket', 'key', 'Value with spaces');

joelcox commented 12 years ago

Hi Lucas,

Sorry for your problem! I never store values with space, so I never ran into this problem. I've identified the bug and working on a fix now.

ghost commented 12 years ago

Thanks. I stumbled upon this while implementing a redis based job queue for CodeIgniter. I worked around the problem by replacing spaces with other characters. But still I thought to share the bug to improve your great work ;)

If you're interested in the job queue, I've made a gist of it here: https://gist.github.com/1482508

joelcox commented 12 years ago

Thanks Lucas. The job queue looks great.

joelcox commented 12 years ago

Small update. I'm planning on removing all PHP specific (like passing arrays) and library specific parts (like passing comma separated lists). Working out this bug while preserving these features is quite a hassle. I may add these back later, but as helper functions.

ssalihu commented 12 years ago

Great contribution. I encountered the same issue of having String with spaces as value. I am able to work around the problem for now.

tinkertim commented 12 years ago

I've just been encoding strings prior to saving them, which is a rather simple work around for now. A little hacky, but it's all tucked away in models where it's easy enough to yank out later.

This is a nice little class, it's easy to drop into just about darn near anything if you don't want / need everything that Predis gives you. I'll fork this later, I'm going to implement an actual multi / exec method (mile long lists of commands via overloading is a little wonky), I'll also have a closer look at the whitespace thing.

joelcox commented 12 years ago

Hey Tim, looking forward to your fork. I haven't done a lot work with PHP/CI + Redis lately, so that its hard for me to justify working on this with a gazillion other things screaming for attention (excuses, excuses, excuses ;-)).

tinkertim commented 12 years ago

@joelcox I'm going to spend a little time on it this weekend. I just ported the library to a mini framework I wrote for work to handle huge static sites that want modern features. I just realized, I forked it when I started using it in CI :) Are you going to push your pre whitespace changes to the main branch, or do you want patches against the dev branch?

joelcox commented 12 years ago

Please do you work on the develop branch. The master only contains stable releases :)

tinkertim commented 12 years ago

I'm working on this now, sorry for the delay. I'm hitting this in a few work projects, so I'm going to be spending most of today on the library.

joelcox commented 12 years ago

No problem at all. Any work is appreciated.

tinkertim commented 12 years ago

The problem here is definitely with _encode_request(), which explodes the request into an array by spaces. This results in each segment of what should be a single string being passed as an argument to the SET family of commands. There is no way to make _encode_request() reliably play nice with all of the various SET commands by simply detecting SET and chomping down the string differently. Consider HSET vs PSETEX, or any other SET, some want a TTL before the value.

I'm looking at possible ways to fix this without breaking the convenience of overloading, which is what makes the library small and attractive :)

One way is to just have _encode_request() take an optional second argument, $data and make concrete implementations for set() , hset(), psetx(), lset(), setnx() and others, as well as their multi variants (e.g.hmset() which would want an array and need a _multi_encode_request() implementation). Those methods could use the second argument so _encode_request() knows to treat it as one last argument, which is the value. There are only a hand full of methods that would have to be implemented, and these would basically just order their arguments and hand them off to the appropriate _encode_request().

This would leave the bulk of the command set up to the existing method overloading implementation, and not break existing installations, or require special quoting or magic characters to be sure values were tokenized properly.

Curious what others think before I sit down and actually do this. I don't see an easy and clean fix for this that wouldn't create problems for a few of the Redis commands, and I did try quite a bit of string fiddling in _encode_request() before resigning to just implement the methods that would be problematic via overloading.

joelcox commented 12 years ago

Adding an optional argument to _encoding_request() was also the approach that I took while looking into certain fixes and this seemed to be the most reasonable solution in my eyes, too.

tinkertim commented 12 years ago

The methods that will need to be implemented are:

I'm pretty sure that's all of the commands that would be fragile just using the magic __call() method, and those methods would only be a few lines long. I'll get to work on this and (and respective tests) and send a pull request once everything works.

If I forgot one, please poke me and let me know :)

joelcox commented 12 years ago

Yep, those seem right to me. Instead of declaring a method for each command, you might get away by adding an extra check to __call(), right?

if (in_array(strtolower($method), array('psetex', 'hset', etc.)
tinkertim commented 12 years ago

Well, psetex wants a TTL before the value (a few others too), so I'd need more than one array and search through both before ultimately passing it off to command(). Still, it would cut down on the number of actual methods that have to be defined. I'll give it a try and let you know.

joelcox commented 12 years ago

Just an FYI, I'm migrating the unit tests to PHPUnit.

tinkertim commented 12 years ago

Not a big deal, I'm using fakeIgniter to actually get this working before adding tests. It took less time to write that than bring up another CI app just to work on it :)

joelcox commented 12 years ago

Heya, I pushed the new unit tests to develop :)

See issue #3 for more

tinkertim commented 12 years ago

I've pretty much fixed _encode_request() , which now takes a second argument $data that can be either a string or an array. This works well for SET, MSET, HMSET and the like:

private function _encode_request($request, $data = NULL) {
    $slices = explode(' ', rtrim($request, ' '));
    $arguments = count($slices);
    if ($data !== NULL) {
        if (is_array($data)) {
            $arguments += (count($data) * 2);
        } else {
            $arguments ++;
        }
    }
    $request = "*" . $arguments . "\r\n";
    foreach ($slices as $slice) {
        $request .= "$" . strlen($slice) . "\r\n" . $slice ."\r\n";
    }
    if ($data !== NULL) {
        if (is_array($data)) {
            foreach ($data as $key => $value) {
                $request .= "$" . strlen($key) . "\r\n" . $key . "\r\n";
                $request .= "$" . strlen($value) . "\r\n" . $value . "\r\n";
            }
        } else {
            $request .= "$" . strlen($data) . "\r\n" . $data . "\r\n";
        }
    }
    return $request;
}

I trimmed the right side of the request because stray spaces result in an empty member of the $slices array when the request ends in a space and $data is set.

The trick here is tickling it appropriately depending on the command you want to issue. What I was thinking is rather than have __call() do a bunch of searches just to figure out how to order stuff, we just provide a secondary, or even ternary (optional) arguments to command(), have it check for them and then call _encode_request() appropriately.

So where once you did this:

   $this->redis->command('SET value Some really long string with lots of spaces')

You could just do this:

   $this->redis->command('SET value', $string)

Or perhaps this:

   $this->redis->command('MSET', $array);

Or even:

  $this->redis->command('HMSET myhash', $array);

While we could poke through a series of arrays that covers most of the 135+ commands that Redis has in order to automagically figure out how to talk to _encode_request(), I think most people would rather we just assume they know how to use Redis and provide one simple facility. If using the magic overload method doesn't work, call command() directly and use the optional arguments. I'm (personally) using it because I don't want the overhead of Predis when all I need to do are a few simple commands, without having my hand held.

If someone really wanted to go ahead and implement concrete methods for some of the more tricky calls, they could very easily do that.

That would basically change this to a minor bugfix instead of more drastic surgery, and we're not adding methods or smelly lookups on top of it :)

Let me know what you think and I'll finish it up.

joelcox commented 12 years ago

Heya, an extra argument seems the most reasonable to me and this would also allow for $this->redis->set('key', 'value'); if I'm not mistaken. Lean and mean is certainly the way to go :)

tinkertim commented 12 years ago

You are correct, the following works well to support the following scenarios:

$this->redis->mset($array);
$this->redis->hmset('hash_key', $array);
$this->redis->set('foo', 'string with lots of spaces');

Or the 'old school'

$this->redis->set('foo bar');

But, of course,

$this->redis->set('foo string with lots of spaces');

... is still going to give the old behavior, which can't be helped. People would just need to use the second argument in existing code, but it wouldn't break any existing code.

If you have more complex needs, just call command() directly, or implement a concrete method if you really need it.

This was actually pretty simple to do by a slight modification to command() and __call() (taken right from the copy I'm working on (actually ported away from CI for another project):

public function command($cmd, $value = NULL) {
    return $this->_write_request($this->_encode_request($cmd, $value));
}

public function __call($method, $arguments) {
    if (! isset($arguments[0])) {
        // flushdb etc
        return $this->command(strtoupper($method));
    }
    if (is_array($arguments[0])) {
       // mset, etc
        return $this->command(strtoupper($method), $arguments[0]);
    }
    if (isset($arguments[1])) {
        // hmset, etc
        return $this->command(strtoupper($method) . ' ' . $arguments[0], $arguments[1]);            
    }
    // old style 
    return $this->command(strtoupper($method) . ' ' . $arguments[0]);
}

I need to finish what I'm working on (now that I have redis behaving), I'll clean that up a little bit and push it this weekend. Or feel free to just grab it and commit it after reviewing.

joelcox commented 12 years ago

This was fixed in 0c523477632f95204780cf00cd9a5e4dcac139b0 :)