upstash / redis-js

HTTP based Redis Client for Serverless and Edge Functions
https://docs.upstash.com/redis
MIT License
666 stars 51 forks source link

Unicode encoding issues and sorted sets #400

Closed kldavis4 closed 9 months ago

kldavis4 commented 1 year ago

I'm using a sorted set to order some keys and running into issues that I think are similar / related to #299 and #120.

I am using encodeURIComponent / decodeURIComponent to prevent mangling of keys - something like:

zadd<string>(this.zKey, { score: 0, member: encodeURIComponent(key) })

Then I use zrange to get the ordered keys. The problem is that the encoding changes the lexical order of the keys. For example, if I put these keys in:

[ '', '\t', 'πŸ„', '12', '2', 'a', 'aπŸ„', 'b', 'd', 'e', 'f', 'ff', '~']

The correct lexical ordering should be:

[ '', '\t', '12', '2', 'a', 'aπŸ„', 'b', 'd', 'e', 'f', 'ff', '~', 'πŸ„']

But if I encode the keys, the order is:

[ '', '\t', 'πŸ„', '12', '2', 'a', 'aπŸ„', 'b', 'd', 'e', 'f', 'ff', '~']

If I don't encode the values, the keys get mangled:

[ '', '\t', '12', '2', 'a', 'a=\x04', 'b', 'd', 'e', 'f', 'ff', '~', '=\x04']

I've tried setting automaticDeserialization to false and that provides no benefit. I'm not clear on why this is even the responsibility of the library caller. It seems like this should be all handled between the library and the upstash redis api (ensuring that the value which the library caller passes to zadd is the value that gets sent to redis).

kldavis4 commented 1 year ago

Also one other point of clarification, the results above are using SRH in case there might be some difference between that and upstash redis

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 30 days.

kldavis4 commented 9 months ago

Not stale

ogzhanolguncu commented 9 months ago
const redis = redisClient();
const zKey = ["", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~"];

zKey.map(
  async (key) =>
    await redis.zadd<string>("test1", {
      score: 0,
      member: encodeURIComponent(key),
    })
);

const res1 = await redis.zrange<string[]>("test1", 0, 20);
console.log({ res1: res1.map((key) => decodeURIComponent(key))}); 
// [ "", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~" ]

I've used our SDK and an instance from Upstash and everything worked fine. Also I did the same thing without encoding and it still worked out.

kldavis4 commented 9 months ago
const redis = redisClient();
const zKey = ["", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~"];

zKey.map(
  async (key) =>
    await redis.zadd<string>("test1", {
      score: 0,
      member: encodeURIComponent(key),
    })
);

const res1 = await redis.zrange<string[]>("test1", 0, 20);
console.log({ res1: res1.map((key) => decodeURIComponent(key))}); 
// [ "", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~" ]

I've used our SDK and an instance from Upstash and everything worked fine. Also I did the same thing without encoding and it still worked out.

Thanks @ogzhanolguncu for looking into this. ~This looks like it must be an issue with SRH~

kldavis4 commented 9 months ago
  const redis = new Redis({
    url: '',
    token: ''
  })
  const zKey = ["", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~"]
  zKey.map(
    async (key) =>
      await redis.zadd<string>("test1", {
        score: 0,
        member: key,
      })
  )
  const orderedKeys = zKey.map(pair => pair).sort()
  console.log('Expected', orderedKeys)
  const res1 = await redis.zrange<string[]>("test1", '-', '+', { byLex: true });
  console.log('Actual', res1.map((key) => key));

Output

    Expected [
      '',   '\t', '12',
      '2',  'a',  'aπŸ„',
      'b',  'd',  'e',
      'f',  'ff', '~',
      'πŸ„'
    ]
    Actual [
      '',   '\t',  12,
      'a',  'aπŸ„', 'b',
      'd',  'e',   'ff',
      'πŸ„'
    ]

So if I don't encode this almost works ^ - something is happening to the ~ character though

kldavis4 commented 9 months ago

Looks like there was a bug in the code that writes the keys. When I change it to:

      const redis = new Redis({
        url: '',
        token: ''
      })
      const theKey = 'test6' + Date.now()
      const zKey = ["", "\t", "πŸ„", "12", "2", "a", "aπŸ„", "b", "d", "e", "f", "ff", "~"]
      for (const key of zKey) {
        await redis.zadd<string>(theKey, {
          score: 0,
          member: key,
        })
      }
      const orderedKeys = zKey.map(pair => pair).sort()
      console.log('Expected', orderedKeys)
      const res1 = await redis.zrange<string[]>(theKey, '-', '+', { byLex: true });
      console.log('Actual', res1.map((key) => key));

I get consistent results. So it looks like not encoding the keys is the correct solution.

ogzhanolguncu commented 9 months ago

I don't know I get the same results either way πŸ˜„. Also please don't share your keys man or delete that db. Glad you solved it.

kldavis4 commented 9 months ago

@ogzhanolguncu any idea why the string numeric values are getting converted to numbers? for example '12' -> 12

ogzhanolguncu commented 9 months ago

This is the default deserializer we run if you don't pass anything parser. It's either that or server is doing that. Pass a custom deserializer and see if result changes. Then, lemme know if your problem persists.