valkey-io / valkey

A flexible distributed key-value datastore that is optimized for caching and other realtime workloads.
https://valkey.io
Other
17.77k stars 676 forks source link

Initial version compatibility notes #43

Closed madolson closed 5 months ago

madolson commented 8 months ago

For our initial compatibility release, I think we want to have a very limited and well defined set of changes. I am proposing we launch a 7.2.4 version, which is clearly compatible with the OSS Redis version 7.2.4. We plan to launch the changes in two tranches RC1 and RC2.

Release candidate 1. 7.2.4-RC1

  1. We will build with the following binaries:
    1. valkey-server
    2. valkey-cli
    3. valkey-benchmark
    4. valkey-check-aof
    5. valkey-check-rdb
    6. valkey-sentinel
    7. For compatibility, we also create symlinks to these files with redis-*names in make install (for the installed binaries only), so valkey can be used as a drop-in replacement for Redis.
    8. :ballot_box_with_check: REDIS_FLAGS will be updated to SERVER_FLAGS. Maybe we should also allow REDIS_FLAGS -> SERVER_FLAGS as well, for an extra layer of compatibility. Done in #66.
  2. We will introduce 2 new info fields.
    1. valkey_version:7.2.4: This will indicate the valkey_version compatibility.
    2. redis_version:7.2.4: this isn't new, just called out that we will leave it for compatibility.
    3. server_name:valkey: This will provide a way for external clients to determine that this is valkey, if they want.
  3. In RDB format, we will add one new aux field and remove one aux field:
    1. valkey-ver:7.2.4: This will indicate that valkey produced the RDB and what version it was generated from. This is only used for compatibility and logging, so it should be safe to add.
    2. redis-ver:7.2.4: This field will be removed. Since it's only used for logging, it should be safe to remove this.
  4. Logging and other misc wording.
    1. In order to maximize compatibility, we will leave all logging unchanged, as we expect users may have scripts looking for specific log lines.
    2. We will remove the Redis logo from the startup and also include our website address (when we have one).
  5. We need the following LUA compatibility.
    1. We will create a new high level object that maps to the redis object, so that you can do server.call(), etc and other stuff as well as redis.call().
    2. We will update the top level lua version to include the valkey version, server name, while still keeping the redis version for compatibility.
  6. Module API compatibility
    1. Continue to fully support REDISMODULE_* for API compatibility
    2. Introduce a second VALKEYMODULE_* API that is find/replace compatible. This will be a secondary namespace, with the intention of all new APIS going here.
madolson commented 8 months ago

For some components, like check-aof, we might have a problem where they expect the name to look a certain way. We should probably rewrite them so that they are agnostic of what the name of the file is.

parthpatel commented 8 months ago

Couple of comments.

  1. We should consistently use placeholder vs placeholderkv across these.

    placeholderkv-server
    placeholderkv-cli
    placeholder-benchmark
    placeholder-check-aof
    placeholder-check-rdb
    placeholder-sentinel
  2. There are 3 flows that encounter "redis" names as per my understanding:
    a) end user facing naming - Clones repo, reads README, builds and runs the server/client Or installs via rpm. They may modify the flags in Makefile or use utils scripts. In Makefile, REDIS_FLAGS was the only reference. In utils directory, I see scripts like redis-sha1 or redis-copy etc which are searching for redis executable. Do we want to change them? b) developer facing naming - reads CONTRIBUTING file and opens a pull request OR writes a module to publish against the repo OR connects to Redis endpoint using one of the many supported clients (info and redis.call commands look appropriate changes, not so sure about RDB file changes). I would say this is a very wide scoped item. We should not touch this for 7.2 just yet. c) legal naming in license etc. - this is a given so not commenting on it.

  3. For RDB format change, are we sure that any custom parsers always know to skip unrecognized aux fields? Or do we want to simply break the compatibility there to remove trademark.

  4. I agree with logging remaining same. For Module and LUA scripting calls, how do we force folks to migrate to new calls in future? Also, which version would that happen in? On that note, what will be the changes in next major version assuming that we only do these changes right now?

  5. When I grep the codebase for redis, I see a lot of references in hiredis library. Is this client only internally consumed or do we want to rename it?

madolson commented 8 months ago

@placeholderkv/core-team Will you also comment here when you get a chance

hwware commented 8 months ago

I have 2 notes:

  1. If we could add new aux field in later version, so in RC1 we could keep the maximum compatibility with Redis 7.2
  2. some commands, such as LOLWUT, we need rewrite. Need to investigate more commands like this.
zuiderkwast commented 8 months ago

Very good. Agree on most of it.

Binaries: For the full compatibility release, it's best to keep the redis-server and redis-cli, at least as symlinks created by make. It seems to be fair use.

For INFO and RDB fields, let's call the new field server_name instead of server_type. Name and version go together.

For Lua, let's add server.call(), but let's also keep redis.call() indefinitely, for compatibility. It is a very important part of the command API. We don't want websites to break when the sysadmin switches to placeholderkv.

For logging and error messages, let's keep them for now but let's change them in the next major release. Let's assume they don't have to be identical to claim compatibility. They are not documented in detail.

daniel-house commented 8 months ago

It seems to me that when "redis" is part of a protocol/api, it is not being used as a trademark. When it appears in documentation or on a web-site refering to a product, then it is being used as a trademark. In this interpretation (note, I am not a lawyer) the response from HELLO or the names of module-api entry points like REDISMODULE_*, the contents and extension of rdb files, etc, are not trademarks and should not require changing. These are the things we are told we can use from 7.2.4 and are essential to being able to use 7.2.4. They are also the things that are required for drop-in compatibility. On the other hand, the names of the server or cli executables do look like trademarks. Except for statements like "the Redis protocol specifies that ...", nearly any mention of Redis in the documentation is probably a trademark violation. The Redis protocol has become a de-facto industry standard and it seems to me that RedisLabs has lost control of the "Redis protocol" trademark via common usage in the same way as the words "kleenex" and "vise grips".

PingXie commented 8 months ago

for 7.2.4 RC1

We will introduce 2 new info fields

I would suggest we move these two new info fields (server_version/server_name) to RC2, where we will close on the compat strategy (your strict/loose/none proposal). I am concerned about some client apps counting lines in the info output. let's be consistent with our compat story in RC1, which is no breaking change/same look-n-feel.

In RDB format, we will add two new aux fields

I am less concerned with these two fields but from the consistency's perspective, I think we can avoid one-offs and wait until RC2 to resolve all compat issues holistically. Then we just need to set a timeline to drop the compat, which hopefully will give enough time for 80+% of clients to add the native support. thoughts?

madolson commented 8 months ago

I think we can avoid one-offs and wait until RC2 to resolve all compat issues holistically.

This seems like a larger scope than we need. I think the original focus on compatibility just should be so we can remove all references to Redis. Additional work like deprecating master/slave, or other major changes, I think we should consider to the first major release.

PingXie commented 8 months ago

Let me be explicit and make sure we are on the same page

1.We will build with the following symlink that map to corresponding redis artifacts.

Agreed. Here is my understanding of the next level actions:

2.We will introduce 2 new info fields.

This is a NO for me

  1. In RDB format, we will add two new aux fields

Sure. I am OK with adding them in 7.2. Like I mentioned earlier, I am not concerned about compat with these two new aux fields. I see your point of making smaller changes along the way. Make sense to me.

4.Logging and other misc wording

Agreed with no change in all 7.2 releases

7.2 RC2

Agreed. Dual-system works for me (server.call() vs redis.call(), and RedisModuleAPI vs ValkeyModuleAPI)

hwware commented 8 months ago

Some active prs already invloves the redis keyword changes, I think we need to discuss if we need to replace all of these in RC1 as the following categories?

@valkey-io/core-team

PingXie commented 8 months ago

I think the following should be out of RC1. Others look great to me (file names will be symlinked)

hwware commented 8 months ago

Based on @PingXie words, I just draft the RC plan:

  1. RC1: Source code, comments, and any words except those on the "user face side" will be changed. "User face side" for example, the log, client message, and valkey(redis) config parameters will not be changed.

  2. RC2: Here are some items I suggest adding:

    • Server_version:7.2.4 and server_name:valkey are two new information fields
    • Deprecate master/slave, use primary/replica, and update all related information
    • (To be discussed) Whether to update the LOG message or not
    • The User client message should be updated or added with some fields
  3. RC3:

    • Compatibility with LUA
    • Compatibility with module APIs
PingXie commented 8 months ago

I think the two RC3 items can be pulled into RC2 since we will be aliasing them.

All of your RC2 proposals though sound like a major release thing. For instance, switching to primary and replica in CLUSTER NODES will definitely break every cluster client out there. The two new fields (server_version and server_name) and client output updates (like returning a different string in the server field) are of median risk IMO. Logging could have impact too.

madolson commented 8 months ago

[adding two new info fields] is a NO for me

I need to understand why this is a blocker. The reason given (counting number of the lines) doesn't seem like a real use case. What is our versioning strategy moving forward?

PingXie commented 8 months ago

[adding two new info fields] is a NO for me

I need to understand why this is a blocker. The reason given (counting number of the lines) doesn't seem like a real use case.

I need to clarify my statement a bit. I meant to say not adding these two fields in RC1 as, depending on where we add them, it has the risk to confuse some clients (I don't have a concrete example). If the new fields are added to the end of the INFO output, I think that is a lot safer but then it is ideal when we get to theh first major release - see below

What is our versioning strategy moving forward?

The first major release would be the ideal place to swap these two Redis strings out. Customers who want the compat can choose to keep the Redis strings. Then at some point, we drop the compat support, and no one gets to see any Redis strings afterwards.

madolson commented 8 months ago

Oh, I see, you are concerned because people might be explicitly looking for the 2nd and 3rd line. Would it alleviate the concern if we placed them at the bottom of the server info field?

PingXie commented 8 months ago

Oh, I see, you are concerned because people might be explicitly looking for the 2nd and 3rd line. Would it alleviate the concern if we placed them at the bottom of the server info field?

Yes but that is not the ideal place for these two fields. Ideally, we would like to swap out the two Redis fields with the new fields. Once we place these two new fields at the end of the output, I worry that they will get stuck there.

zuiderkwast commented 8 months ago

I don't really see why we need different release candidates. The actual releases are more important.

I think we can do Lua server object as an alias to redis at any time. Separate PR though.

I think we can postpone valkeymodule.h indefinitely. Can we just keep redismodule.h with existing APIs until we have some new APIs we want to add?

The 7.2.4 is compatibility only. We can add alias but we can't break anything. This is based on an existing tag and stuff are backported from unstable. The next release (minor or major) is created from unstable. There are a lot of unreleased stuff in unstable that we got when forking. Perhaps we should consider another 7.x release with those things?

zuiderkwast commented 8 months ago

Pidfile.

Default is "/var/run/redis.pid". Keep or change?

In the config file, it's suggested to configure it to "/var/run/redis_6379.pid". This we can change to valkey_6379 right? It's just a template config file to use as inspiration.

daniel-house commented 8 months ago

I find the template config, .../src/redis.conf, to be the best (sometimes only) documentation for the configuration parameters. It is supposed to also agree with the default settings that you get when you don't specify a config file. Changing its name to valkey.conf is almost certain to break someone's deployment scripts.

madolson commented 8 months ago

I find the template config, .../src/redis.conf, to be the best (sometimes only) documentation for the configuration parameters. It is supposed to also agree with the default settings that you get when you don't specify a config file. Changing its name to valkey.conf is almost certain to break someone's deployment scripts.

This is the thing I'm actually struggling to believe is true. I agree it's the best documentation for redis, but I think people will be willing to transition to valkey.conf. One thing I can say, we do plan on doing release candidates, so if someone is using it, hopefully it will come up?

PingXie commented 8 months ago

Yeah I think we should get the feedback from the release candidate.

I agree with "redis.conf being the best documentation for the config". I see little/no operational value in redis.conf as I have not encountered a case where redis.conf is used as-is in production. On the other hand, I can imagine an admin/operator developing a workflow that depends on the pid file path, though I haven't seen one either.

daniel-house commented 8 months ago

Oh, I don't object to renaming it. In fact, I think it is disgusting to find the best documentation is in an example config file. I just fear losing such vital information or making it harder to find than it already is.

madolson commented 8 months ago

Based on offline conversations, updated the current RC1 plan with the following changes:

  1. server_version -> valkey_version in server info. Since we would like to advertise specific compatibility, we are making the version specific to valkey. servername will remain as an optional indicator, and other valkey compatible stores might choose to advertise something else.
  2. We dropped redis-ver from the API. This isn't related to API compatibility, but we didn't want to "fake" that valkey was creating an rdb from a Redis version.
  3. Renamed server-ver -> valkey_version in rdb info. Same as point one, we want to explicitly indicate this was created by a valkey server.
zuiderkwast commented 8 months ago

Even though we've decided not to change logging entries, we may want to make some exceptions to a few of them, apart from the logo? Like "Redis is starting", "Redis is shutting down, bye bye" etc. I feel we need to decide where draw the line.

The rest of them can be guarded by a compat switch that we add in the next major version.

madolson commented 5 months ago

I believe this is closed.