Closed joelcox closed 11 years ago
The best fix here is to just implement the list related commands. It could be done by storing an array of commands that tells us when we need to treat input differently, but that's .. fragile.
Correct. Another idea I had was checking whether the keys in the arrayare numerical sequential or not, but that isn't robust either.
Well, we're going to have the same problem with sets and sorted sets, as well as pubsub. I'm looking at _encode_request()
again now, and it would not be hard to treat certain methods differently. There's about ~15 or so that would need this treatment.
So adding an array with these edge cases would be the way to go, I guess. I'll take a look at this on monday and add a bunch of unit tests for specific commands. Thanks for your input.
I should have a chance to do it tomorrow or Monday if you don't beat me to it.
Was this ever resolved? I'm wanting to use redis with CI and this looks like a nice clean implementation overall.
Hey Joseph, @tinkertim has a fix in the works, but I'm not sure if he made any progress due to the holidays.
I should have this pushed in the next few days. Sorry about the delays, my job ended just prior to the holidays, and then there were the holidays, so things have been rather frantic.
No problem Tim!
I think the changes I made in my pull request (Issue #20) have fixed the second half of this issue - that is, the list range commands being parsed incorrectly. That should leave just the first half, where indexes need to be stripped in certain cases. That sound right?
That is correct Dan. I just pushed some new unit tests to confirm this.
On Jan 3, 2013, at 4:56 PM, Daniel Hunsaker notifications@github.com wrote:
I think the changes I made in my pull request (Issue #20) have fixed the second half of this issue - that is, the list range commands being parsed incorrectly. That should leave just the first half, where indexes need to be stripped in certain cases. That sound right?
— Reply to this email directly or view it on GitHub.
Is there a decent sanity check we can use to determine when to strip indexes (as opposed to passing keys through)? An approach that doesn't require a list of commands that need stripping performed?
(Looking at _encode_request()
, it looks like passing these values directly as opposed to in an array would work around this issue in the meantime...)
One thing that was discussed is treating arrays which are indexed from 0..n as lists, while others are treated as dictionaries (associative arrays). This would be the only way to determine it from the passed in data structure. I'm just worried that people shoot themselves in the foot if we implement it this way.
On Jan 3, 2013, at 10:20 PM, Daniel Hunsaker notifications@github.com wrote:
(Looking at _encode_request(), it looks like passing these values directly as opposed to in an array would work around this issue in the meantime...)
— Reply to this email directly or view it on GitHub.
I see essentially four options, then. Another would be to do nothing, but I'm sure that has already been ruled out, so I'll ignore that one. :smile_cat:
$config['redis_dictionaries'] = false;
$config['redis_dictionaries'] = true;
$config['redis_dictionaries'] = array('hmset', 'mset', 'msetnx', ...);
Thoughts?
I prefer to keep the configurable options down, but I can see its value in this situation. One thing I'm wondering is how many people would actually pass in a numeric (0..n) indexed array into a Redis hash data structure. You could argue that 99% percent are actually using a wrong data structure for their data. What are valid use cases for this?
Another (crazy) option would be to introduce two new classes to represent the different data structures (RedisHash
and RedisList
, which we can inspect when they are passed in), but that wouldn't be viable, IMHO.
I can't say that I'm aware of any, but I can do some poking around to see if I can find one...
Huh. Well, apparently there's at least one: http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs
I guess that wouldn't count. Only arrays which start at index 0 and have as many elements as the highest index - 1 should be treated as sets.
Joël Sent from my phone
On Jan 5, 2013, at 2:31 PM, Daniel Hunsaker notifications@github.com wrote:
Huh. Well, apparently there's at least one: http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs
— Reply to this email directly or view it on GitHub.
Hrm. So
$strip_keys = false;
reset($args);
if (key($args) == 0)
{
end($args);
if (key($args) == count($args) - 1)
{
$strip_keys = true;
}
}
which relies on the sort order being unchanged, or even
$strip_keys = false;
$keys = array_keys($args);
if (min($keys) == 0 && max($keys) == count($args) - 1)
{
$strip_keys = true;
}
which is sort-tolerant? I can see that. I was, in my mind, treating all numeric keys as indices, regardless of order or value distance. I'd still prefer that people use something like "_155315"
instead, though.
On the other hand, would it be more robust to check whether the keys were set as integers versus strings? As opposed to simply checking whether they evaluate numerically? Maybe something crazy like
$strip_keys = false;
if(is_int(key($args)))
{
$strip_keys = true;
}
or similar (maybe add a reset($args);
at the top...)? That ought to differentiate between "155315"
and 155315
, which would help weed out the intentional uses of numeric keys from the automatic assignment of numeric indices, and also have the benefit of being sort-tolerant.
If we're really paranoid about this, we could even combine approaches:
$strip_keys = false;
$keys = array_keys($args);
if (is_int(key($args)) && min($keys) == 0 && max($keys) == count($args) - 1)
{
$strip_keys = true;
}
Of course, I could just be over-thinking this. :smiley:
(Regarding that last code fragment, I realized after hitting Comment [I meant to hit Preview!] that
$strip_keys = false;
$keys = array_keys($args);
if (min($keys) === 0 && max($keys) === count($args) - 1)
{
$strip_keys = true;
}
would perform the same checks with one less condition to process...)
The last implementation would have my preference. Checking all the keys wouldn't really be feasible, so checking the first and last would be OK.
I suggest holding off on making a decision just now and wait for @tinkertim $0.02. (No rush Tim!)
I haven't heard from @tinkertim, so I'm proposing to go with solution 3 as suggested by @danhunsaker. If anyone disagrees, please speak up now, or forever hold your peace. :)
I'll let this sit for a few days before implementing.
As far as my $0.02, after stepping away from it for a couple of weeks, I like option 3 a lot, and would vote option 4 as a close second, even if only to keep option 3 around.
Hey everyone. I pushed the changes we discussed in this thread to the fix-18 branch and it now passes all the units tests again.
I ended up using @danhunsaker last suggestion to check whether we should consider something an associative array or not. I'm not too pleased with how the _encode_request
function looks right now and I haven't updated the README yet, but the code is functional.
Let me know if you find any edge cases so the test suite can be expanded if necessary.
Merged. Thanks all. 81d2a03993dced5a7a5692b0b101523fbbf107c4
There are a few inconsistencies when performing operation on lists:
Originally reported by Benjamin MA.