nicolasff / webdis

A Redis HTTP interface with JSON output
https://webd.is
BSD 2-Clause "Simplified" License
2.82k stars 307 forks source link

Inconsistent behavior between GET / POST #236

Closed VagyokC4 closed 10 months ago

VagyokC4 commented 1 year ago

Hi @nicolasff,

Hope all is well. I've been making progress on my project over here and I've started to run into some issues.

Sepcifically with inconsistencies with GET / POST and some of the args I'm trying to pass.

Issue 1

This GET works image

The same POST does not. image

I assume both of these requests should produce the same results?

Issue 2

While attempting to use this redis command with a GET: http://127.0.0.1:7379/JSON.SET/My.DataModel.Redis.TmDb.TmResult%3A01H0QTRN3VY3PBN5C1FJG4H75H/./%7B%22Id%22%3A%2201H0QTRN3VY3PBN5C1FJG4H75H%22%2C%22RiskScore%22%3A0%2C%22RiskScoreMin%22%3A0%2C%22TestScoreMin%22%3A0%2C%22TestScoreMax%22%3A0%7D

Postman has no issue, but inside my dotNet environment, something is interperting the './' as a root or something and removing it before the requests goes out so I get {"JSON.SET":[false,"ERR wrong number of arguments for 'JSON.SET' command"]}

However I can POST it no problem and get back "{\"JSON.SET\":[true,\"OK\"]}"

So here's my issue. Some commands only GET works, and others only POST works.

I don't really want to use different verbs based on the command, as the commands could change going forward and I really want this to be pass-through as much as possible, so I would prefer to use a single verb POST to handle all requests.

Hopefully this is an easy fix?

Thoughts?

VagyokC4 commented 11 months ago

Hi @nicolasff,

Any updates on this issue? I am looking to start implementing and this is still an unresolved issue for me.

nicolasff commented 11 months ago

Hi @VagyokC4,

Sorry about the delay, I thought I had responded to this. When you first opened this issue I spent some time adding extra debug logging to src/cmd.c where Webdis parses the command before sending it to Redis, and I couldn't reproduce what you were seeing.

I remembered from a previous issue your mention of .NET and your Postman screenshots, and I don't use either so I felt I was somewhat limited in my ability to send a request in the same conditions. So with your reminder today I installed Postman and tried to send a similar command to your FT.SEARCH. I looked carefully at the screenshot to make sure I'd use the same options, and this made me realize that something didn't look right in your POST example.

Looking at this part, this is what I noticed: Screenshot 2023-07-30 at 21 38 26

  1. You used "raw" as the option for the body, which is correct (as far as I understand)
  2. The part starting with /* looks different from the rest, almost as if it was… a comment. Like a /* comment */ in code, but missing the closing */.
  3. This third option is set to JSON, and the values I see available on my instance of Postman are Text, JavaScript, JSON, HTML, and XML. The UI doesn't explain exactly what this is for, but when I type hello/*world and then select JSON I see /*world as being "commented out", while this does not happen when the dropdown is set to Text.

This made me wonder if maybe Postman is considering the /* and whatever comes after it as a comment, and just drops it. This is easy enough to validate, and to do this I used curl since this is what I pretty much always use to send requests manually.

First, let's delete a key and add 5 values to it as a list:

$ curl -s http://localhost:7379/DEL/foo && echo
{"DEL":1}

$ for val in a b c d e; do curl -s "http://localhost:7379/RPUSH/foo/${val}" ; echo; done
{"RPUSH":1}
{"RPUSH":2}
{"RPUSH":3}
{"RPUSH":4}
{"RPUSH":5}

Ok, we should have a list with 5 keys, let's fetch elements [0 .. 3] with a GET:

$ curl -s http://localhost:7379/LRANGE/foo/0/3
{"LRANGE":["a","b","c","d"]}

and with a POST:

$ curl -s -X POST http://localhost:7379/ -d 'LRANGE/foo/0/3'
{"LRANGE":["a","b","c","d"]}

All good. Let's try this same POST with Postman and the same options as yours:

image

Notice how LRANGE is underlined in red, and a mouseover shows that the format is unexpected:

image

Now LRANGE doesn't take any parameters that would let us send a valid command containing a /*, but we can still try to send it and see how Redis responds. Let's send the Webdis command LRANGE/foo/0/3/*something, which translates to the Redis command LRANGE foo 0 3 *something since the slashes are just parameter separators for Webdis.

$ curl -s -X POST http://localhost:7379/ -d 'LRANGE/foo/0/3/*something'
{"LRANGE":[false,"ERR wrong number of arguments for 'lrange' command"]}

We get an error as expected, since LRANGE takes 3 parameters, not 4.

And now with Postman… (here I actually changed it from [0 .. 3] to [0 .. 2] to show that it's a different response compared to the screenshot above):

image

So this invalid command… actually returned data. As a last resort we can try the same with redis-cli:

$ redis-cli LRANGE foo 0 3 '*something'
(error) ERR wrong number of arguments for 'lrange' command

and of course we get an error, as expected: we did pass in the wrong number of arguments.

So this really explains it. When you select JSON in this drop-down, Postman will interpret the body and it's flexible enough with its JSON support that it even allows comments in there, and highlights them. If we change it to Text, this is the response:

image

Note that the red underline has disappeared, and the text starting with /* is no longer colored differently.

This is the problem with tools like Postman where you can't always be sure what it sends and what's being added or removed. In your case, it's silently trimming off part of your request body and the only way you're apparently supposed to tell is that the color is a bit different.

This behavior is documented, though. The section titled "Raw data" under "Sending body data" says:

For JSON raw body data, you can add comments, and they will be stripped out when the request is sent. Single-line comments delimited with // and multi-line comments delimited with / / will be removed in the request.

Note that the section above also mentions that a Content-Type header will be added:

If you use raw mode for your body data, Postman will set a header based on the type you select (such as text or json).

This is actually the wrong content-type for a payload like LRANGE/foo/0/2, which is not valid JSON (same for FT.SEARCH/… of course. In your case you should select Text instead of JSON from the drop-down, since you're not sending JSON data.

Alright, sorry that was a bit long but it reflects my steps to reproduce the issue and the gradual realization that this was what was causing it.

I hope this helps!

VagyokC4 commented 11 months ago

Hi @nicolasff,

Thank-you for the detailed response. I think the problem I'm having is that it's not just Postman where I get the error. While I use postman to "verify" different results, my application is in c#, so I'm using the dotNet tools available to reach out.

While I was waiting for a response I worked out a little work-around that appears to be working. (I'll know more as I get into implementing more, but tests show it working somewhat better)

public async Task<RedisReply> ExecuteAsync( string command, params string[] args )
{
    var postReq = $"{command}/{string.Join("/", args.Select(Uri.EscapeDataString))}";
    var getReq = $"{httpEndpoint}/{postReq}";

    var resJson = command.ContainsIgnoreCase("set")
                      ? await httpEndpoint.PostToUrlAsync(postReq).ConfigureAwait(false)
                      : await GetJsonFromUrlAsync(getReq).ConfigureAwait(false);

// Rest of the code
}

It seems my mileage varies based on if I'm doing a "SET" or something else. So when my command is a set, I POST, otherwise I GET.

Let me go through your response again and update as needed, but maybe we are focusing too much on Postman, which has it's own caveats. For a more 1:1 test, have you heard of LinqPad.NET? It's a free to use C# IDE that makes testing code super easy. If you download that app, I'll setup a script that shows the issues, then I can share it with you and you can run it on your side as well.

VagyokC4 commented 11 months ago

@nicolasff Here is the linqpad script URL... Once you have linqpad installed, it will consume this linq and download the script into the ide.

http://share.linqpad.net/co7tsv.linq

string WebdisEndpoint;
string RedisEndpoint;

async Task Main()
{
    // Hi Nicolas.  For this demo to work you will need the redis-stack-server image as your redis db, and then webdis wired up to proxy it.
    // docker run -d --name Redis-Stack-7-AOF-EDGE -v c:/redis-data/:/data/ --restart=unless-stopped -e REDIS_ARGS="--save --appendonly yes" -e REDISEARCH_ARGS="MAXSEARCHRESULTS -1 MAXAGGREGATERESULTS -1 MAXDOCTABLESIZE 3000000 TIMEOUT 0" -p 6379:6379 redis/redis-stack-server:edge
    // Having some issues with webdis correctly pointing at the instace created above.  Once it's correctly pointing at a redis-stack instance, this script should point out what I am seeing.

    // Setup Redis Search
    RedisEndpoint = "redis://localhost:6379";
    var redisOm = new RedisConnectionProvider(RedisEndpoint);
    redisOm.Connection.CreateIndex(typeof(Customer));
    var redisCollection = redisOm.RedisCollection<Customer>(false);

    // Setup Webdis
    WebdisEndpoint = "http://127.0.0.1:7379";

    // verify webdis
    var verifyWebdisEnpoints = GetWebdisCommand("PING").Dump();
    (await WebdisEndpoint.PostToUrlAsync(verifyWebdisEnpoints.PostBody)).Dump("Verify Webdis POST");
    (await verifyWebdisEnpoints.GetUrl.GetJsonFromUrlAsync()).Dump("Verify Webdis GET");

    //Set the randomizer seed if you wish to generate repeatable data sets.
    Randomizer.Seed = new Random(8675309);

    // Insert a record    
    await redisCollection.InsertAsync(new Customer
    {
        FirstName = new Bogus.DataSets.Name().FirstName(),
        LastName = new Bogus.DataSets.Name().LastName(),
        Age = new Bogus.Randomizer().Int(18, 88),
    });

    // looking at redis-cli monitor we can see it translates into something like:
    // "JSON.SET" "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48" "." "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}"
    var insertRecordWebdis = GetWebdisCommand("JSON.SET", "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48", ".", "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}").Dump("Webdis Command");

    // This one fails
    (await insertRecordWebdis.GetUrl.GetJsonFromUrlAsync()).Dump("Webdis GET -> Customers");

    // This one works
    (await WebdisEndpoint.PostJsonToUrlAsync(insertRecordWebdis.PostBody)).Dump("Webdis POST -> Customers");
}

public WebdisCommand GetWebdisCommand(string command, params string[] args)
{
    var getReq = $"{WebdisEndpoint}/{command}/{string.Join("/", args.Select(Uri.EscapeDataString))}".TrimEnd('/');
    var postReq = $"{command}/{string.Join("/", args)}".TrimEnd('/');
    return new (postReq, getReq);
}

public record WebdisCommand(string PostBody, string GetUrl);

[Document(StorageType = StorageType.Json)]
public class Customer
{
    [RedisIdField][Indexed] public Ulid Id { get; set; }
    [Indexed] public string FirstName { get; set; }
    [Indexed] public string LastName { get; set; }
    public string Email { get; set; }
    [Indexed(Sortable = true)] public int Age { get; set; }
    [Indexed] public string[] NickNames { get; set; }
}
nicolasff commented 11 months ago

Hey @VagyokC4,

I just wanted to say that I did see your last messages here, I am going to try LINQPad but it'll be another few days before I'm able to do this since it requires a Windows machine. All the computers I have here run macOS. I read that Xamarin Workbooks was usable as an alternative for Mac but it crashes soon after starting, which seems to be a common issue reported back in 2020. The project seems abandoned now, the repo is archived so I doubt it'll be fixed.

I do have Mono though, and as far as I know it's mostly compatible with .NET on Windows. For a command-line application I would expect it to have almost full compatibility. I could try that, if there's a way to convert this LINQ script into a .cs file that I can compile and run.

I see you used redis-cli monitor to track the commands being sent, that's a good idea and I should have used this for the other thread about Postman. From what I can tell of the repro script, it looks like you're first validating that GET and POST requests are even sent with the "verify" part, so I think this means that at least the to {Get,Post}JsonFromUrlAsync should work. It would be interesting to see what redis-cli monitor reports when you then get to the "customers" requests. Do you have the monitor output for these two requests? Just the requests of course and in what way they fail; the actual bogus user data doesn't really matter.

You wrote that "this one fails" but it's not clear to me what this means. Seeing what Redis received would go a long way in debugging this issue.

In the meantime and before I'm able to try LINQPad, I'll add some extra logging on a branch so that you can try a version of Webdis that describes step by step what it receives, what it does with it, what it sends to Redis, and receives back. This is definitely not a version that can be run under any kind of load as the heavy logging will likely slow it down significantly, in addition to making its log file grow quickly. But if you're able to run these two queries with this detailed output, it would likely help pinpoint the source of the issue.

VagyokC4 commented 11 months ago

Hi @nicolasff,

So the linq file can be opened in a text editor and you'll see its just a glorified .cs file with some project header information image

This will tell you what nuget packages to install, and what namespace usings to use.

Everything else is the code posted form above

string WebdisEndpoint;
string RedisEndpoint;

async Task Main()
{
  // Hi Nicolas.  For this demo to work you will need the redis-stack-server image as your redis db, and then webdis wired up to proxy it.
  // docker run -d --name Redis-Stack-7-AOF-EDGE -v c:/redis-data/:/data/ --restart=unless-stopped -e REDIS_ARGS="--save --appendonly yes" -e REDISEARCH_ARGS="MAXSEARCHRESULTS -1 MAXAGGREGATERESULTS -1 MAXDOCTABLESIZE 3000000 TIMEOUT 0" -p 6379:6379 redis/redis-stack-server:edge
  // Having some issues with webdis correctly pointing at the instace created above.  Once it's correctly pointing at a redis-stack instance, this script should point out what I am seeing.

  // Setup Redis Search
  RedisEndpoint = "redis://localhost:6379";
  var redisOm = new RedisConnectionProvider(RedisEndpoint);
  redisOm.Connection.CreateIndex(typeof(Customer));
  var redisCollection = redisOm.RedisCollection<Customer>(false);

  // Setup Webdis
  WebdisEndpoint = "http://127.0.0.1:7379";

  // verify webdis
  var verifyWebdisEnpoints = GetWebdisCommand("PING").Dump();
  (await WebdisEndpoint.PostToUrlAsync(verifyWebdisEnpoints.PostBody)).Dump("Verify Webdis POST");
  (await verifyWebdisEnpoints.GetUrl.GetJsonFromUrlAsync()).Dump("Verify Webdis GET");

  //Set the randomizer seed if you wish to generate repeatable data sets.
  Randomizer.Seed = new Random(8675309);

  // Insert a record    
  await redisCollection.InsertAsync(new Customer
  {
      FirstName = new Bogus.DataSets.Name().FirstName(),
      LastName = new Bogus.DataSets.Name().LastName(),
      Age = new Bogus.Randomizer().Int(18, 88),
  });

  // looking at redis-cli monitor we can see it translates into something like:
  // "JSON.SET" "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48" "." "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}"
  var insertRecordWebdis = GetWebdisCommand("JSON.SET", "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48", ".", "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}").Dump("Webdis Command");

  // This one fails
  (await insertRecordWebdis.GetUrl.GetJsonFromUrlAsync()).Dump("Webdis GET -> Customers");

  // This one works
  (await WebdisEndpoint.PostJsonToUrlAsync(insertRecordWebdis.PostBody)).Dump("Webdis POST -> Customers");
}

public WebdisCommand GetWebdisCommand(string command, params string[] args)
{
  var getReq = $"{WebdisEndpoint}/{command}/{string.Join("/", args.Select(Uri.EscapeDataString))}".TrimEnd('/');
  var postReq = $"{command}/{string.Join("/", args)}".TrimEnd('/');
  return new (postReq, getReq);
}

public record WebdisCommand(string PostBody, string GetUrl);

[Document(StorageType = StorageType.Json)]
public class Customer
{
  [RedisIdField][Indexed] public Ulid Id { get; set; }
  [Indexed] public string FirstName { get; set; }
  [Indexed] public string LastName { get; set; }
  public string Email { get; set; }
  [Indexed(Sortable = true)] public int Age { get; set; }
  [Indexed] public string[] NickNames { get; set; }
}

This is all dotNet7, so it should run on mono just fine.

My testing problem stems from the internal Redis packaged with webdis. 1) were you able to reproduce it on your side, and 2) are you able to create an image without the Redis bits? I think this will resolve my issing trying to target redis stack on my localhost.

VagyokC4 commented 11 months ago

Hi @nicolasff,

So the linq file can be opened in a text editor and you'll see its just a glorified .cs file with some project header information image

This will tell you what nuget packages to install, and what namespace usings to use.

Everything else is the code posted form above

string WebdisEndpoint;
string RedisEndpoint;

async Task Main()
{
    // Hi Nicolas.  For this demo to work you will need the redis-stack-server image as your redis db, and then webdis wired up to proxy it.
    // docker run -d --name Redis-Stack-7-AOF-EDGE -v c:/redis-data/:/data/ --restart=unless-stopped -e REDIS_ARGS="--save --appendonly yes" -e REDISEARCH_ARGS="MAXSEARCHRESULTS -1 MAXAGGREGATERESULTS -1 MAXDOCTABLESIZE 3000000 TIMEOUT 0" -p 6379:6379 redis/redis-stack-server:edge
    // Having some issues with webdis correctly pointing at the instace created above.  Once it's correctly pointing at a redis-stack instance, this script should point out what I am seeing.

    // Setup Redis Search
    RedisEndpoint = "redis://localhost:6379";
    var redisOm = new RedisConnectionProvider(RedisEndpoint);
    redisOm.Connection.CreateIndex(typeof(Customer));
    var redisCollection = redisOm.RedisCollection<Customer>(false);

    // Setup Webdis
    WebdisEndpoint = "http://127.0.0.1:7379";

    // verify webdis
    var verifyWebdisEnpoints = GetWebdisCommand("PING").Dump();
    (await WebdisEndpoint.PostToUrlAsync(verifyWebdisEnpoints.PostBody)).Dump("Verify Webdis POST");
    (await verifyWebdisEnpoints.GetUrl.GetJsonFromUrlAsync()).Dump("Verify Webdis GET");

    //Set the randomizer seed if you wish to generate repeatable data sets.
    Randomizer.Seed = new Random(8675309);

    // Insert a record    
    await redisCollection.InsertAsync(new Customer
    {
        FirstName = new Bogus.DataSets.Name().FirstName(),
        LastName = new Bogus.DataSets.Name().LastName(),
        Age = new Bogus.Randomizer().Int(18, 88),
    });

    // looking at redis-cli monitor we can see it translates into something like:
    // "JSON.SET" "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48" "." "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}"
    var insertRecordWebdis = GetWebdisCommand("JSON.SET", "UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48", ".", "{\"Id\":\"01H6S1JBG01BSF2X4AWQKJJZ48\",\"FirstName\":\"Nigel\",\"LastName\":\"Zboncak\",\"Age\":58}").Dump("Webdis Command");

    // This one fails
    (await insertRecordWebdis.GetUrl.GetJsonFromUrlAsync()).Dump("Webdis GET -> Customers");

    // This one works
    (await WebdisEndpoint.PostJsonToUrlAsync(insertRecordWebdis.PostBody)).Dump("Webdis POST -> Customers");
}

public WebdisCommand GetWebdisCommand(string command, params string[] args)
{
    var getReq = $"{WebdisEndpoint}/{command}/{string.Join("/", args.Select(Uri.EscapeDataString))}".TrimEnd('/');
    var postReq = $"{command}/{string.Join("/", args)}".TrimEnd('/');
    return new (postReq, getReq);
}

public record WebdisCommand(string PostBody, string GetUrl);

[Document(StorageType = StorageType.Json)]
public class Customer
{
    [RedisIdField][Indexed] public Ulid Id { get; set; }
    [Indexed] public string FirstName { get; set; }
    [Indexed] public string LastName { get; set; }
    public string Email { get; set; }
    [Indexed(Sortable = true)] public int Age { get; set; }
    [Indexed] public string[] NickNames { get; set; }
}

This is all dotNet7, so it should run on mono just fine.

My testing problem stems from the internal Redis packaged with webdis. 1) were you able to reproduce it on your side, and 2) are you able to create an image without the Redis bits? I think this will resolve my issing trying to target redis stack on my localhost.

ALSO: If you are not using linqPad, you can remove the .Dump("") method calls, as they just dump data to the screen in linqpad; which when setup correctly would give you this output. image

There is something with the way the GET and the POST behave differently with the same inputs.

nicolasff commented 10 months ago

Thanks for help, I'll try to integrate it as a .cs file. In the meantime if you want to try running with detailed logging, you can check out the extra-logging branch: https://github.com/nicolasff/webdis/tree/extra-logging

git clone https://github.com/nicolasff/webdis.git
cd webdis
git fetch origin
git checkout extra-logging

Make sure to change verbosity to 8 ("trace") in webdis.json, and then build the docker image from this checkout.

Here's what it looks like when Webdis parses the command /LRANGE/hello/0/-1 and responds with JSON. Before sending it to Redis it will log what it decoded from the command – here that it has 4 parts and the length and value of each one. This argv array is literally what is being passed to the Redis client.

When Redis responds, it will dump the raw Redis reply, and mention the same command pointer for cross-referencing.

D /LRANGE/hello/0/-1
T cmd=0x600000ae8a80 argc=4
T   cmd argv[0] (sz=6): val=[LRANGE]
T   cmd argv[1] (sz=5): val=[hello]
T   cmd argv[2] (sz=1): val=[0]
T   cmd argv[3] (sz=2): val=[-1]
T cmd=0x600000ae8a80 resp_raw_sz=22
T   response(raw) sz=22 val=[*2
$3
abc
$3
def
]

cmd=0x600000ae8a80 identifies the command object, this is so that we can relate a response to the parsed command. Here Webdis received a reply from Redis which was serialized as a 22-byte string. It looks shorter but that's because Redis uses \r\n (0x0d 0x0a) for new lines, so here it's *2\r\n$3\r\nabc\r\n$3\r\ndef\r\n.

If you can see exactly what Webdis understood of the request and what it received back from Redis, this should give us the answer.

nicolasff commented 10 months ago

Okay! It took a while but I think I figured it out. I downloaded Rider and created a new console app, importing most of your code. It wasn't too difficult, although things have changed a lot since I last used .NET 15 years ago 😄.

I ran Webdis with the extra logging enabled, and this was the output.

First, startup and the two PING commands:

[75079] 08 Aug 08:06:24 I Webdis listening on port 7379
[75079] 08 Aug 08:06:24 I Webdis 0.1.22-dev up and running
[75079] 08 Aug 08:08:22 D /
[75079] 08 Aug 08:08:22 T cmd=0x600003c34900 resp_raw_sz=7
[75079] 08 Aug 08:08:22 T   response(raw) sz=7 val=[+PONG
]
[75079] 08 Aug 08:08:26 D /PING
[75079] 08 Aug 08:08:26 T cmd=0x600003c30600 resp_raw_sz=7
[75079] 08 Aug 08:08:26 T   response(raw) sz=7 val=[+PONG
]

then the JSON.SET via a GET request:

[75079] 08 Aug 08:08:33 D /JSON.SET/UserQuery%2BCustomer%3A01H6S1JBG01BSF2X4AWQKJJZ48/%7B%22Id%22%3A%2201H6S1JBG01BSF2X4AWQKJJZ48%22%2C%22FirstName%22
[75079] 08 Aug 08:08:33 T cmd=0x600003c38180 argc=3
[75079] 08 Aug 08:08:33 T   cmd argv[0] (sz=8): val=[JSON.SET]
[75079] 08 Aug 08:08:33 T   cmd argv[1] (sz=45): val=[UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48]
[75079] 08 Aug 08:08:33 T   cmd argv[2] (sz=85): val=[{"Id":"01H6S1JBG01BSF2X4AWQKJJZ48","FirstName":"Nigel","LastName":"Zboncak","Age":58}]
[75079] 08 Aug 08:08:33 T cmd=0x600003c38180 resp_raw_sz=192
[75079] 08 Aug 08:08:33 T   response(raw) sz=192 val=[-ERR unknown command 'JSON.SET', with args beginning with: 'UserQuery+Customer:01H6S1JBG01BSF2X4

and finally the same command via POST:

[75079] 08 Aug 08:08:37 D /
[75079] 08 Aug 08:08:37 T cmd=0x600003c1c000 argc=4
[75079] 08 Aug 08:08:37 T   cmd argv[0] (sz=8): val=[JSON.SET]
[75079] 08 Aug 08:08:37 T   cmd argv[1] (sz=45): val=[UserQuery Customer:01H6S1JBG01BSF2X4AWQKJJZ48]
[75079] 08 Aug 08:08:37 T   cmd argv[2] (sz=1): val=[.]
[75079] 08 Aug 08:08:37 T   cmd argv[3] (sz=85): val=[{"Id":"01H6S1JBG01BSF2X4AWQKJJZ48","FirstName":"Nigel","LastName":"Zboncak","Age":58}]
[75079] 08 Aug 08:08:37 T cmd=0x600003c1c000 resp_raw_sz=192
[75079] 08 Aug 08:08:37 T   response(raw) sz=192 val=[-ERR unknown command 'JSON.SET', with args beginning with: 'UserQuery Customer:01H6S1JBG01BSF2X4

The fact that it's an unknown command doesn't really matter here, it's clear from the output that the two commands are different anyway.

The first thing I noticed was the + in UserQuery+Customer, it's not present in the POST request where it's a space instead. The second – and really the major issue here – is that the GET command was parsed into 3 parts while the POST command was parsed into 4. Now that's guaranteed to mess things up.

Looking at it closer, we can see that it's the . that's missing. Instead of /JSON.SET/UserQuery-etc/./some-json, it's like Webdis thinks it was /JSON.SET/UserQuery-etc/some-json. But was it?

We do have the full GET request on the first line of the GET logs:

[75079] 08 Aug 08:08:33 D /JSON.SET/UserQuery%2BCustomer%3A01H6S1JBG01BSF2X4AWQKJJZ48/%7B%22Id%22%3A%2201H6S1JBG01BSF2X4AWQKJJZ48%22%2C%22FirstName%22

Decoding it so that it's more readable, it's equivalent to:

'/JSON.SET/UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48/{"Id":"01H6S1JBG01BSF2X4AWQKJJZ48","FirstName"'

Don't worry about the fact that it's cut off, this is Webdis reaching the upper limit of its log messages. But look at what's after the customer ULID! → QKJJZ48/{"Id":01… no dot??

I added some Console.WriteLine in GetWebdisCommand, thinking that maybe Uri.EscapeDataString was stripping it somehow. It showed me that this wasn't the issue, the /./ was preserved.

So the request looks right in the code, but once it makes it to Webdis, Webdis doesn't see the dot. It's as if the request was transformed somehow, before it reaches Webdis. And you can see how in some cases this would make sense: if you were traversing a directory tree for example, it would be legitimate to remove the /./ and change them to a simple /. What's the point of saying "go to the same directory you're in", after all?

But these aren't paths, and this isn't a file system. So transforming the request this way is definitely the wrong thing to do.

To prove that this was what was happening, I set a single key foo to the value bar and then tried to read it in two different ways. First with a /GET/foo, and then with a /GET/././././foo. The latter is clearly incorrect, and Webdis will return an error. We can tell how it will process it by first trying it in redis-cli:

$ redis-cli GET . . . . foo
(error) ERR wrong number of arguments for 'get' command

Testing with curl I also needed --path-as-is to stop it from normalizing the URL, leading to the expected error message:

$ curl -s --path-as-is  http://127.0.0.1:7379/GET/././././foo
{"GET":[false,"ERR wrong number of arguments for 'get' command"]}

On the Webdis side, I observed the command be split into 6 parts, as expected:

[75079] 08 Aug 08:49:05 D /GET/././././foo
[75079] 08 Aug 08:49:05 T cmd=0x600003c38100 argc=6
[75079] 08 Aug 08:49:05 T   cmd argv[0] (sz=3): val=[GET]
[75079] 08 Aug 08:49:05 T   cmd argv[1] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[2] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[3] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[4] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[5] (sz=3): val=[foo]
[75079] 08 Aug 08:49:05 T cmd=0x600003c38100 resp_raw_sz=50
[75079] 08 Aug 08:49:05 T   response(raw) sz=50 val=[-ERR wrong number of arguments for 'get' command

So finally I tried the same thing in C#, adding:

            var simpleGetUrl = $"{WebdisEndpoint}/GET/foo";
            var simpleGetResponse = await simpleGetUrl.GetJsonFromUrlAsync();
            Console.WriteLine($"'GET foo' returned: {simpleGetResponse}");

            var getWithDotsUrl = $"{WebdisEndpoint}/GET/././././foo";
            var getWithDotsResponse = await getWithDotsUrl.GetJsonFromUrlAsync();
            Console.WriteLine($"'GET . . . . foo' returned: {getWithDotsResponse}");

which logged:

'GET foo' returned: {"GET":"bar"}
'GET . . . . foo' returned: {"GET":"bar"}

So that's your issue. The URL is being "normalized" and the /./ is removed. There's also a discrepancy with the + that you'll need to address.

VagyokC4 commented 10 months ago

Hi @nicolasff. Yes! Rider is the way to go for this :)

I added some Console.WriteLine in GetWebdisCommand, thinking that maybe Uri.EscapeDataString was stripping it somehow. It showed me that this wasn't the issue, the /./ was preserved.

Yes, after more testing I found an issue with this method as well, specifically with the + in the name. Here is the updated method that preserves the text properly.

public WebdisCommand GetWebdisCommand(string command, params string[] args)
{
    var postBody = $"{command}/{string.Join("/", args.Select(Uri.EscapeDataString))}".TrimEnd('/');
    var getUrl   = $"{WebdisEndpoint}/{postBody}";
    return new WebdisCommand(postBody, getUrl);
}

The URL is being "normalized" and the /./ is removed.

Awesome! Yeah, that sounds about right. Glad we were able to identify it. When you push this fix, are you able to introduce a nicolas/webdis-server image that only contains webdis? I'm curious if that fixes my issues with testing locally on localhost.

Also, I don't know if you want to incorporate this in the image, but if you remember the timeout issues I reported earlier, the fix was to put in a healthcheck that basically ran the /PING endppoint. This keeps the redis instances from shutting down and can report back when there is an issue as a healthcheck.

#!/bin/bash
echo -n "Performing Health Check:"
# Make a GET request to the /PING route of your container
response=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:7379/PING)

# Check the response status code
if [ "$response" = "200" ]; then
  echo " OK"
  exit 0  # Success
else
  echo " FAIL"
  exit 1  # Failure
fi

which I incorporated this way

FROM nicolas/webdis

RUN apk update && apk add curl

COPY healthcheck.sh /usr/src/app/healthcheck.sh
RUN chmod +x /usr/src/app/healthcheck.sh
HEALTHCHECK --interval=1m --timeout=3s CMD sh /usr/src/app/healthcheck.sh || exit 1
nicolasff commented 10 months ago

Glad we were able to identify it. When you push this fix,

Sorry if this wasn't clear, but there's nothing to fix on the Webdis side. You saw above what Webdis received using your own code, which was a request with the /./ stripped out even though it was originally present in the string you had prepared:

'/JSON.SET/UserQuery+Customer:01H6S1JBG01BSF2X4AWQKJJZ48/{"Id":"01H6S1JBG01BSF2X4AWQKJJZ48","FirstName"'

I also showed further down that Webdis is perfectly capable of processing a request containing /././ without ever normalizing it or changing it in any way. This is not a feature of Webdis. Webdis does not normalize URLs. Again, see above how it processed a request to /GET/././././foo:

I observed the command be split into 6 parts, as expected:

[75079] 08 Aug 08:49:05 D /GET/././././foo
[75079] 08 Aug 08:49:05 T cmd=0x600003c38100 argc=6
[75079] 08 Aug 08:49:05 T   cmd argv[0] (sz=3): val=[GET]
[75079] 08 Aug 08:49:05 T   cmd argv[1] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[2] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[3] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[4] (sz=1): val=[.]
[75079] 08 Aug 08:49:05 T   cmd argv[5] (sz=3): val=[foo]

It received /GET/././././foo and processed it as 6 parts. When it receives your request, the /./ is already missing.

The issue is in the HTTP client you are using, which is transforming what you are giving to it – presumably to be "helpful" – and that's how you end up sending a malformed request to Webdis.


As for the healthcheck, I think it's a good idea but it doesn't really make sense for the embedded Redis.

This keeps the redis instances from shutting down The Redis process is not going to stop on its own. What happened with your setup was the connection to a remote Redis instance was closed from the Redis side, as the error message indicated: Error disconnecting: connection reset by peer The "peer" in this case is Redis.

I had talked to a past contributor about sending a patch to add keep_alive back in May, I'll ask him for an update.

As for this point:

are you able to introduce a nicolas/webdis-server image that only contains webdis? I'm curious if that fixes my issues with testing locally on localhost.

It takes quite a while to publish an image, due to the complex build process to generate both ARM and x86 versions and sign each one, the process is well over 20 steps including at least 4 different passwords. If you want to test an image without Redis, it's easy to build one locally for yourself from the repo's Dockerfile. You can chose to not embed Redis, or leave it in and not run it, it's not like it will save much anyway since the image is already only ~8 MB or so.

VagyokC4 commented 10 months ago

The issue is in the HTTP client you are using, which is transforming what you are giving to it – presumably to be "helpful" – and that's how you end up sending a malformed request to Webdis.

Ahh I C. I thought it was your side normalizing the url and stripping the initial ./ Let me look into this side some more. Thanks again for all your help.

nicolasff commented 10 months ago

Okay, I think I have a solution for you. Please read carefully.

There are many threads online about how to prevent the .NET URI parser from simplifying URIs in this manner. This is sometimes referred to as "normalizing", sometimes "canonicalizing". It is done by the UriParser class in .NET, based on behavior flags provided to it in its constructor.

If you go to System.UriSyntaxFlags, you'll find a list of these flags. They are "OR'd" together as binary flags, and separate instances of UriParser are available by default for various protocols. If you go to the code for System.UriParser, you'll find them:

    public abstract partial class UriParser
    {
        internal static readonly UriParser HttpUri = new BuiltInUriParser("http", 80, HttpSyntaxFlags);
        internal static readonly UriParser HttpsUri = new BuiltInUriParser("https", 443, HttpUri._flags);
        internal static readonly UriParser WsUri = new BuiltInUriParser("ws", 80, HttpSyntaxFlags);
...

There are many more. Note how HttpSyntaxFlags is passed to the "http" parser. It is defined as:

        private const UriSyntaxFlags HttpSyntaxFlags =
                                        // [...] truncated for brevity, there are many more values
                                        UriSyntaxFlags.ConvertPathSlashes |
                                        UriSyntaxFlags.CompressPath |
                                        UriSyntaxFlags.CanonicalizeAsFilePath |
                                        UriSyntaxFlags.AllowIdn |
                                        UriSyntaxFlags.AllowIriParsing;

These flags – and especially CompressPath – control the behavior of the UriParser instance for HTTP URIs. They are themselves part of an enum, once again we can look it up and find the values (truncated here for brevity):

  internal enum UriSyntaxFlags
  {
    None = 0,
    MustHaveAuthority = 1,
    OptionalAuthority = 2,
// ... many more ...
    CompressPath = 8388608, // 0x00800000
    CanonicalizeAsFilePath = 16777216, // 0x01000000
    UnEscapeDotsAndSlashes = 33554432, // 0x02000000
// ... again some more ...
  }

All of this is private and not externally configurable, except… if we use reflection. I wrote that there are lots of threads about it online, and they all bring up some version of this technique. For example:

The problem with all of these is that they are only valid for old versions of .NET, and they no longer work. None of them do. Look at the dates: 2010, 2011, 2012… this is code that modifies the internal fields of UriParser by name, so if they were renamed at some point or if the flags were changed, it's not going to work. And that's exactly what happened over time.

What we need to do is this:

  1. For both "http" and "https", get the dedicated UriParser object
  2. Get a reference to the _flags field of the UriParser class (it's not longer m_Flags like in 2010)
  3. Extract the value for that field from the UriParser object we got above
  4. Convert it to an int
  5. Unset whatever flags we don't want by AND'ing them with NOT(the value) (see here for details)
  6. Saving the new value to the _flags field of our current UriParser object

I tried doing all this and it seems to work well. I do see requests being sent to Webdis with all their /./././ still present when I first make these changes.

Here's a screenshot of my Rider debugger showing the structure or uriParser (the UriParser instance for `"http" in this case):

Screenshot 2023-08-20 at 19 56 16

There are more values than fit on the screen, this is the full list:

MustHaveAuthority | MayHaveUserInfo | MayHavePort | MayHavePath | MayHaveQuery | MayHaveFragment
 | AllowUncHost | AllowDnsHost | AllowIPv4Host | AllowIPv6Host | AllowAnInternetHost | SimpleUserSyntax
 | BuiltInSyntax | PathIsRooted | ConvertPathSlashes | CompressPath | CanonicalizeAsFilePath | AllowIdn
 | AllowIriParsing

Here is the method to make these changes, call it at the start of your program:

    private static void DoNotSimplifyUris()
    {
        const int CanonicalizeAsFilePath = 0x01000000;
        const int ConvertPathSlashes = 0x00400000;
        const int CompressPath = 0x00800000; // only this one seems strictly necessary for the /././…  conversion.

        var getSyntaxMethod = typeof (UriParser).GetMethod("GetSyntax", BindingFlags.Static | BindingFlags.NonPublic);
        if (getSyntaxMethod == null)
        {
            throw new MissingMethodException("UriParser", "GetSyntax");
        }

        foreach (var scheme in new[] { "http", "https" })
        {
            // call with "http" and "https" to update both UriParser objects (see UriParser class for all instances)
            var uriParser = getSyntaxMethod.Invoke(null, new object[] { scheme });
            if (uriParser == null)
            {
                throw new ArgumentNullException($"Unexpected: UriParser.getSyntax({scheme}) returned null");
            }

            // get reference to UriParser._flags field
            var flagsFieldInfo = typeof(UriParser).GetField("_flags",
                BindingFlags.NonPublic | BindingFlags.GetField | BindingFlags.SetField | BindingFlags.Instance);
            if (flagsFieldInfo == null)
            {
                throw new MissingFieldException("UriParser", "_flags");
            }

            // get value of that field on the UriParser object we're looking at (either the http or https instance)
            var flagsValue = flagsFieldInfo.GetValue(uriParser);
            if (flagsValue == null)
            {
                throw new Exception($"Could not extract the value of UriParser._flags for the {scheme} instance");
            }

            // convert to the underlying int representation to unset some flags
            var flags = (int) flagsValue;
            flags &= ~CanonicalizeAsFilePath;
            flags &= ~ConvertPathSlashes;
            flags &= ~CompressPath;

            // save the modified value on the UriParser instance
            flagsFieldInfo.SetValue(uriParser, flags);
        }
    }

I've also written a test program that does this:

  1. Calls DoNotSimplifyUris()
  2. Sends a request to Webdis on /SET/foo/bar
  3. Sends a request to Webdis on /GET/foo/bar
  4. Sends a request to Webdis on /GET/././././foo – I see it unchanged in the Webdis logs
  5. Sends a request to Webdis on /GET/././././foo, but this time using a Uri object instead of a string – again, I see it unchanged in the Webdis logs

Here is the full program: https://gist.github.com/nicolasff/5de66eda4264a27637aebe2e767e4abf

This is its output:

'SET foo bar' returned: {"SET":[true,"OK"]}
'GET foo' returned: {"GET":"bar"}
'GET . . . . foo' returned: {"GET":[false,"ERR wrong number of arguments for 'get' command"]}
(manual) 'GET . . . . foo' (using Uri instead of String) returned: HTTP 200 OK
(manual) response body: {"GET":[false,"ERR wrong number of arguments for 'get' command"]}

and if I comment the initial call to DoNotSimplifyUris():

'SET foo bar' returned: {"SET":[true,"OK"]}
'GET foo' returned: {"GET":"bar"}
'GET . . . . foo' returned: {"GET":"bar"}
(manual) 'GET . . . . foo' (using Uri instead of String) returned: HTTP 200 OK
(manual) response body: {"GET":"bar"}

^ clearly the /GET/././././foo get simplified to /GET/foo.

It would be nice if Microsoft made this configurable, but as you can see it's been at least 12-13 years and they themselves have suggested this reflection hack on .NET support forums so it doesn't look like a cleaner solution is coming any time soon. Note that this is a .NET 7 version, I don't know if it would work with other versions and there's no guarantee it'll continue to work. I thought it was worth posting here since I couldn't find a version of it for recent releases of .NET.

Let me know what you think!

VagyokC4 commented 10 months ago

Let me know what you think!

Above and beyond bro! I was hitting dead ends because it was so embedded into the .NET, and as you pointed out .NET 7 is a different beast. I have tested this and it works well. Thank you!

nicolasff commented 10 months ago

Great! I'm glad it worked for you too, I was worried the internal field names would be different on .NET for macOS for whatever reason – as long as they're not public we can't really assume much.

It was interesting to have a quick look at modern .NET and see how much it has changed since I last used it seriously in 2006-2007. I had actually tried to do this around the time of my earlier message identifying the issue as coming from the HTTP client, and went down the wrong path flipping the bit for a different flag which made no difference for the /./ use case. Good thing I gave it a second try :-)

By the way, I learned that even my tool of choice for crafting HTTP requests – the ubiquitous curl – simplifies request URIs by default unless it's run with --path-as-is.

I think we can close this issue now, the subject has drifted quite a bit but as far as I can tell all the weird behaviors with Postman and the .NET HTTP client have been explained and workarounds were found for both.

VagyokC4 commented 10 months ago

Yup! Awesome. Thank you again for all your help with this. I'll keep you posted on my progress with using Webdis in our solution. So far it's been a heaven sent. Redis is nice with it's persistent connection, until the connection is unavailable and it brings down the whole app. Being able to use the speed of redis in a request/response scenario eliminates all of that.

nicolasff commented 10 months ago

Thanks for the kind words, that's great to hear :-)