bwlewis / rredis

R client for Redis
http://illposed.net/

database bloat using rredis #36

Closed merl-dev closed 8 years ago

merl-dev commented 8 years ago

Consider the following: set a key from 3 different clients, each running SET from:<client> "abcd".

  1. from redis-cli
     127.0.0.1:6379> debug object from:cli
     Value at:00007FE24DB730A0 refcount:1 encoding:embstr serializedlength:5 lru:6561621 lru_seconds_idle:39
  2. from redis-other (almost any other client: Predis.php, Redis.jl, Ioredis.js...)
     127.0.0.1:6379> debug object from:other
     Value at:00007FE24DB73120 refcount:1 encoding:embstr serializedlength:5 lru:6561636 lru_seconds_idle:31
  3. from redis-rredis
     127.0.0.1:6379> debug object from:rredis
     Value at:00007FE24780D4C0 refcount:1 encoding:embstr serializedlength:35 lru:6561646 lru_seconds_idle:26

Is this 7x bloat necessary? And is there a workaround in the form of an option/setting?

merl-dev commented 8 years ago

OK, just went through the closed issues and found the problem: charToRaw()... But in the interest of performance, would it not be better to send raw data to Redis in the first place, without the translation, or is this an R thing? (I apologize for my ignorance on this subject, I generally don't use R.) In addition, we have another line of code where we use str_pad() on the value to be sent to Redis, and there the translation is not required, so there seems to be an inconsistency in the protocol.

bwlewis commented 8 years ago

sorry for the latency, I've been camping in the woods.

as you discovered, the default behavior serializes R objects. this works great if you want to store an arbitrary R thing like a data frame or matrix in Redis, but I agree that this introduces overhead in the special cases of strings and integers.
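
To make that overhead concrete, here is a small base-R sketch (no Redis connection needed) comparing the bytes of the bare string with the bytes of the serialized R object. It assumes the package serializes values with something like `serialize(value, NULL)`; the exact serialized size depends on the R serialization format version, so only the relative difference matters here.

```r
value <- "abcd"

# Bytes of the bare string, as a plain client would send them
plain <- charToRaw(value)

# Bytes of the serialized R object: format header, type and length
# fields, and finally the 4-byte payload
serialized <- serialize(value, connection = NULL)

length(plain)       # 4
length(serialized)  # several times larger; exact size depends on the R version

# What the overhead buys: the original R object round-trips unchanged
identical(unserialize(serialized), value)  # TRUE
```

The fixed header is the same for any object, so the relative cost is highest for tiny values like this one and fades for large data frames or matrices.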

rather than treat those special cases differently, the package chooses to treat everything the same way, while letting users opt out for efficiency by using the raw arguments or by explicitly passing raw objects.
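
A minimal sketch of that policy, in base R only. `encode_value` is a hypothetical helper written for illustration, not a function from the package; it assumes raw vectors are passed through untouched and everything else is serialized.

```r
# Hypothetical helper, not part of rredis: raw vectors are sent as-is,
# any other R object is serialized first.
encode_value <- function(value) {
  if (is.raw(value)) value else serialize(value, connection = NULL)
}

as_is   <- encode_value(charToRaw("abcd"))  # raw input passes through untouched
wrapped <- encode_value("abcd")             # everything else is serialized

rawToChar(as_is)      # "abcd"
unserialize(wrapped)  # "abcd"

# With a live connection, the corresponding calls would look roughly like
# this (assumed usage, check the package documentation):
#   redisSet("from:rredis", charToRaw("abcd"))
#   redisGet("from:rredis", raw = TRUE)
```

Storing the raw bytes keeps the value readable from other clients, at the cost of having to convert it back with `rawToChar()` on the R side.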

maybe it's not the right choice, but it's a choice. i think arguments can be made either way.

best,

bryan