microsoft / dev-proxy

Dev Proxy is an API simulator that helps you effortlessly test your app beyond the happy path.
https://aka.ms/devproxy
MIT License

Add support for filtering URLs to watch by request header #740

Closed waldekmastykarz closed 2 months ago

waldekmastykarz commented 3 months ago

From @jimmywim

Extend Dev Proxy with the ability to filter URLs to watch by request header. This is needed to filter out irrelevant requests in cases like SPFx, where the app is part of a page that might be issuing similar requests which should be ignored. This feature also applies in cases where you want to watch only requests from a specific component that's part of a larger app.

Extend devproxyrc.json with a new property named filterByHeaders. The property contains a collection of headers to filter by. If the value is specified, compare it with the header's value using a contains operator. If the value is empty, ignore the value and only check if the header is present. If multiple headers are specified in the config, Dev Proxy will check for the presence of all headers (and) on the request.

Headers specified in this property are applied after requests have been matched with urlsToWatch as an additional filter.
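
The matching rules proposed above can be sketched in a few lines of Python. This is only an illustration of the proposal as worded at this point in the thread (contains comparison, empty value means presence only, all entries must match); it is not Dev Proxy's code. The function name `matches_all` is hypothetical, and treating header names as case-insensitive while comparing values case-sensitively is an assumption.

```python
from typing import Mapping, Sequence


def matches_all(
    request_headers: Mapping[str, str],
    filters: Sequence[Mapping[str, str]],
) -> bool:
    """Return True when every filter entry is satisfied (AND semantics)."""
    # Assumption: header names are compared case-insensitively, as in HTTP.
    normalized = {name.lower(): value for name, value in request_headers.items()}
    for entry in filters:
        actual = normalized.get(entry["name"].lower())
        if actual is None:
            return False  # the header is not present on the request
        expected = entry.get("value", "")
        # An empty value means "presence only"; otherwise use a contains check.
        if expected and expected not in actual:
            return False
    return True


FILTERS = [
    {"name": "x-custom", "value": "myvalue"},
    {"name": "x-custom-app", "value": ""},
]

print(matches_all({"x-custom": "myvalue-1", "x-custom-app": "any"}, FILTERS))
print(matches_all({"x-custom": "myvalue"}, FILTERS))
```

The second call lacks `x-custom-app`, so under AND semantics it would be filtered out even though `x-custom` matches.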

waldekmastykarz commented 3 months ago

@garrytrinder, @jimmywim, anything else we should consider?

garrytrinder commented 3 months ago

Sounds good.

What is the proposed structure of the filterByHeaders property? I assume that it will follow the same convention we use for response headers?

{
  "filterByHeaders": [
    {
      "name": "x-custom",
      "value": "myvalue"
    },
    {
      "name": "x-custom-app",
      "value": ""
    }
  ]
}

MChez commented 3 months ago

It makes good sense to keep to the same pattern. You may want to consider filtering on the presence of the header alone, i.e. filter when the header exists and you don't care what the value is.

garrytrinder commented 3 months ago

> You may want to consider filtering on the presence of the header alone, i.e. filter when the header exists and you don't care what the value is.

I think the idea is to only check for the presence of the header if the value property is empty.

waldekmastykarz commented 3 months ago

I'm on the fence about whether multiple headers should be matched with an and or an or. An or would theoretically give you more flexibility across different parts of your app. But I wonder if that's intuitive from just looking at the config. Thoughts?

waldekmastykarz commented 3 months ago

I was thinking about a dictionary, but making it an array that follows the response headers convention makes sense. It's consistent, and as an array it's aligned with URLs to watch and allows you to specify the same header multiple times to match different values. Thanks folks!

waldekmastykarz commented 3 months ago

What do you think of filterByHeaders as the property name? Does it sound ok and clearly convey its purpose?

MChez commented 3 months ago

I've been trying to think this through but when I start to think about how to implement the filtering in code it starts to get more complicated.

As a starter, I think this should work together with --urls-to-watch, --watch-pids, and/or --watch-process-names as a second level filter. The header filter would produce a subset of calls returned by the existing filters.

filterByHeaders sounds good.

MChez commented 3 months ago

The following isn't complete. I run into trouble when I think about what is meant by the entries in the filterByHeaders array. First you would need to check that the header exists, then if the entry has a value, check the header has that value. It gets more complicated when you then have to combine those two checks with AND or OR. I'm posting this just to see if it sparks any ideas for others.

I think if you state that the list of headers in the new filter uses AND, or OR, you will always be wrong in someone's view. But if you try to make it too flexible, no one will understand it and testing it will become a nightmare.

Instead of making filterByHeaders an array, you could make it an object that contains an array. Something like the following will increase the flexibility of the filtering without going too mad:

{
  "filterByHeaders": {
    "operator": "and", // or "or"
    "headers": [
      {
        "name": "x-custom",
        "value": "myvalue",
        "include": true // or false
      },
      {
        "name": "x-custom-app",
        "value": ""
      }
    ]
  }
}

filterByHeaders.operator would be optional and default to AND. filterByHeaders.headers[:n].include would be optional and default to true.

The simplest form would be this:

{
  "filterByHeaders": {
    "headers": [
      {
        "name": "x-custom",
        "value": "myvalue"
      },
      {
        "name": "x-custom-app",
        "value": ""
      }
    ]
  }
}
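
To make the object form above more concrete, here is a rough Python sketch of how such a config might be evaluated. It is one reading of the idea rather than an agreed design: the names `evaluate` and `entry_matches` are hypothetical, and interpreting `include: false` as "the request must not carry a matching header" is an assumption, since the thread does not define it precisely.

```python
from typing import Any, Mapping


def entry_matches(headers: Mapping[str, str], entry: Mapping[str, Any]) -> bool:
    """Check one filter entry against lower-cased request headers."""
    actual = headers.get(entry["name"].lower())
    # An empty value is a substring of any string, so it acts as a presence check.
    matched = actual is not None and entry.get("value", "") in actual
    # Assumption: include defaults to true, and include=false inverts the check.
    return matched if entry.get("include", True) else not matched


def evaluate(request_headers: Mapping[str, str], config: Mapping[str, Any]) -> bool:
    """Combine all entries with the configured operator (default: and)."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    results = [entry_matches(headers, entry) for entry in config["headers"]]
    combine = any if str(config.get("operator", "and")).lower() == "or" else all
    return combine(results)


SIMPLE = {
    "headers": [
        {"name": "x-custom", "value": "myvalue"},
        {"name": "x-custom-app", "value": ""},
    ]
}

print(evaluate({"x-custom": "myvalue", "x-custom-app": "a"}, SIMPLE))
print(evaluate({"x-custom": "myvalue"}, {**SIMPLE, "operator": "or"}))
```

Even in this small form, the interaction between `operator` and `include` needs careful reading, which illustrates the complexity concern raised in the comment.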

PS. I have a habit of making things too complicated so I've tried to balance functionality with simplicity, but I may have failed. For example, filterByHeaders.headers[:n].include may be a step too far. I'm just throwing out some ideas to see if they help.

An alternative may be a second config option:

{
  "filterByHeadersOperator": "and", // or "or"
  "filterByHeaders": [
    {
      "name": "x-custom",
      "value": "myvalue"
    },
    {
      "name": "x-custom-app",
      "value": ""
    }
  ]
}

waldekmastykarz commented 3 months ago

> As a starter, I think this should work together with --urls-to-watch, --watch-pids, and/or --watch-process-names as a second level filter. The header filter would produce a subset of calls returned by the existing filters.

That's what we've been thinking too, thanks for confirming.

When I think about including configurable operators, I can't help but think about building good old CAML queries and nesting the different parts of the query. I think that it's too complex for what we need. I suggest we start with a simple or operator across all values. That way, you get the flexibility of specifying different headers for the different pieces of your app that you want to monitor without the configuration being too complex to set up and debug. The only scenario we don't support is specifying multiple headers that a request should have in order to be included, but I wonder if that's truly necessary.
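
As a rough model of the two-level filtering with a simple or operator, consider the Python sketch below. It is not how Dev Proxy is implemented. The function name `is_watched` is hypothetical, matching `urlsToWatch` wildcards with `fnmatch` is an assumption, and so is treating an empty header filter list as "no header filtering". The header names and values are illustrative.

```python
from fnmatch import fnmatchcase
from typing import Mapping, Sequence

URLS_TO_WATCH = ["https://jsonplaceholder.typicode.com/*"]
FILTER_BY_HEADERS = [
    {"name": "x-app", "value": "contoso-intranet"},
    {"name": "x-contoso", "value": ""},
]


def is_watched(
    url: str,
    request_headers: Mapping[str, str],
    urls_to_watch: Sequence[str] = URLS_TO_WATCH,
    filter_by_headers: Sequence[Mapping[str, str]] = FILTER_BY_HEADERS,
) -> bool:
    """First match the URL, then require at least one header filter to match."""
    # Level 1: the URL must match one of the watched patterns.
    if not any(fnmatchcase(url, pattern) for pattern in urls_to_watch):
        return False
    # Assumption: without header filters, every watched URL is included.
    if not filter_by_headers:
        return True
    headers = {name.lower(): value for name, value in request_headers.items()}
    # Level 2: OR across entries; an empty value only checks presence.
    return any(
        entry["name"].lower() in headers
        and entry.get("value", "") in headers[entry["name"].lower()]
        for entry in filter_by_headers
    )


url = "https://jsonplaceholder.typicode.com/posts"
print(is_watched(url, {}))
print(is_watched(url, {"x-app": "contoso-intranet"}))
print(is_watched(url, {"x-contoso": "bar"}))
```

With or semantics, each entry can describe a different part of the app, and a request only needs to satisfy one of them.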

waldekmastykarz commented 2 months ago

All right, I've got an alpha version available which you can grab from here: https://github.com/waldekmastykarz/dev-proxy/releases/tag/v0.0.0-beta.20 (it's a pre-release version, so after installing, start it with devproxy-beta).

Here's how you'd use it in a config:

{
  "$schema": "https://raw.githubusercontent.com/microsoft/dev-proxy/main/schemas/v0.18.0/rc.schema.json",
  "plugins": [
    {
      "name": "RetryAfterPlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/dev-proxy-plugins.dll"
    },
    {
      "name": "GenericRandomErrorPlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/dev-proxy-plugins.dll",
      "configSection": "genericRandomErrorPlugin"
    }
  ],
  "urlsToWatch": [
    "https://jsonplaceholder.typicode.com/*"
  ],
  "genericRandomErrorPlugin": {
    "errorsFile": "devproxy-errors.json"
  },
  "filterByHeaders": [
    // request must contain the x-app: contoso-intranet header
    {
      "name": "x-app",
      "value": "contoso-intranet"
    },
    // or the x-contoso header with any value
    {
      "name": "x-contoso",
      "value": ""
    }
  ],
  "rate": 50,
  "logLevel": "debug",
  "newVersionNotification": "stable"
}

Here are some tests:

# ignored, no headers
curl -ix http://127.0.0.1:8000 https://jsonplaceholder.typicode.com/posts

# included, matching header
curl -ix http://127.0.0.1:8000 -H 'x-app:contoso-intranet' https://jsonplaceholder.typicode.com/posts

# ignored, correct header but incorrect value
curl -ix http://127.0.0.1:8000 -H 'x-app:contoso-foo' https://jsonplaceholder.typicode.com/posts

# included, matching header with any value
curl -ix http://127.0.0.1:8000 -H 'x-contoso:bar' https://jsonplaceholder.typicode.com/posts

Could you please confirm that this setup is working for you?

MChez commented 2 months ago

Hi @waldekmastykarz

Sorry for the slow response, I've been snowed under with work. I will try to give this a test as soon as I can.

Regards, Mark

MChez commented 2 months ago

Hi @waldekmastykarz, I've had a chance to do some testing and I can confirm your test results using your config via PowerShell:

# ignored, no headers
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts

# included, matching header
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet'; }

# ignored, correct header but incorrect value
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-foo'; }

# included, matching header with any value
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = 'bar'; }

I expanded the tests and there was one test that returned an unexpected result:

# This fails. Should it?
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = $null; }

I didn't check to see if Invoke-WebRequest included the header in the call. It might have dropped it since it had a null value.

I then changed the config slightly and ran some more tests. I did find something a little strange.

The config:

{
  "$schema": "https://raw.githubusercontent.com/microsoft/dev-proxy/main/schemas/v0.18.0/rc.schema.json",
  "plugins": [
    {
      "name": "RetryAfterPlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/dev-proxy-plugins.dll"
    },
    {
      "name": "GenericRandomErrorPlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/dev-proxy-plugins.dll",
      "configSection": "genericRandomErrorPlugin"
    }
  ],
  "urlsToWatch": [
    "https://jsonplaceholder.typicode.com/*",
    "https://v2.jokeapi.dev/joke/Any*"
  ],
  "genericRandomErrorPlugin": {
    "errorsFile": "devproxy-errors.json"
  },
  "filterByHeaders": [
    // request must contain the x-app: contoso-intranet header
    {
      "name": "x-app",
      "value": "contoso-intranet"
    },  
    // or the x-contoso header with any value
    {
      "name": "x-contoso",
      "value": ""
    },
    // or the x-characters header with all valid header characters
    {
      "name": "x-characters",
      "value": "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~ %20,%21,%22,%23,%24,%25,%26,%27,%28,%29,%2A,%2B,%2C,%2F,%3A,%3B,%3D,%3F,%40,%5B,%5D"
    },
    // or x-app: contoso-intranet-2 header
    {
      "name": "x-app",
      "value": "contoso-intranet-2"
    }
  ],
  "rate": 50,
  "logLevel": "debug",
  "newVersionNotification": "stable"
}

Note: Two URLs captured and two x-app header values captured.

I ran the following tests in PowerShell:

# ignored, no headers
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts?v2

# included, matching header
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet '; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = ' contoso-intranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet-2'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet '; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = ' contoso-intranet'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet-2'; }

# included when two of the same headers are included in the config, but not a matching header. 
# I didn't expect these to be included. Is this correct?
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet-'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet-21'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intranet-2 abcd'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet-'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet-2'; }
Invoke-WebRequest -Uri https://v2.jokeapi.dev/joke/Any?safe-mode -Headers @{ 'x-app' = 'contoso-intranet-2 abcd'; }

# ignored, correct header but incorrect value
# I'm not sure if these should be case-insensitive!
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-foo'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-intrane'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'ontoso-intranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contosointranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'Contoso-intranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'contoso-Intranet'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-app' = 'CONTOSO-INTRANET'; }

# included, matching header with any value
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = 'bar'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = 'BAR'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts?v1 -Headers @{ 'x-contoso' = 'Barbara Ann'; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = ''; }
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts?v1 -Headers @{ 'x-contoso' = '-'; }
# RFC 3986 section 2.3 Unreserved Characters (January 2005): ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~ 
# RFC 3986 section 2.2 Reserved Characters (January 2005): !#$&'()*+,/:;=?@[]
# Common ASCII Hex values: %20,%21,%22,%23,%24,%25,%26,%27,%28,%29,%2A,%2B,%2C,%2F,%3A,%3B,%3D,%3F,%40,%5B,%5D
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts?v1 -Headers @{ 'x-contoso' = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~ %20,%21,%22,%23,%24,%25,%26,%27,%28,%29,%2A,%2B,%2C,%2F,%3A,%3B,%3D,%3F,%40,%5B,%5D'; }

# This fails. Should it?
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-contoso' = $null; }

# not tested for headers, wrong URL
Invoke-WebRequest -Uri https://dog.ceo/api/breeds/image/random -Headers @{ 'x-contoso' = 'bar'; }
Invoke-WebRequest -Uri https://api.chucknorris.io/jokes/random -Headers @{ 'x-contoso' = 'bar'; }

# included, matching header
Invoke-WebRequest -Uri https://jsonplaceholder.typicode.com/posts -Headers @{ 'x-characters' = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~ %20,%21,%22,%23,%24,%25,%26,%27,%28,%29,%2A,%2B,%2C,%2F,%3A,%3B,%3D,%3F,%40,%5B,%5D'; }

Everything was consistent with your tests, but the strange results are caused by the inclusion of two filters for the same header: x-app. When I added the second filter I thought it would only capture calls to either URL with either x-app:contoso-intranet or x-app:contoso-intranet-2, but it started capturing any x-app header that starts with contoso-intranet, e.g. contoso-intranet-, contoso-intranet-2 abcd.

I don't have a specific test case in mind for the multiple x-app filters, I just thought it might be something that someone needs.

I hope this lot helps.

Regards, Mark

waldekmastykarz commented 2 months ago

Thank you for the extensive tests!

> Everything was consistent with your tests, but the strange results are caused by the inclusion of two filters for the same header: x-app. When I added the second filter I thought it would only capture calls to either URL with either x-app:contoso-intranet or x-app:contoso-intranet-2, but it started capturing any x-app header that starts with contoso-intranet, e.g. contoso-intranet-, contoso-intranet-2 abcd.

This is by design, because right now we're checking if the header contains the specified value.

Are you saying that an exact match would be more useful? We've been thinking that a contains match would give you more flexibility in cases where you can't tell the exact value or the value changes (e.g. SDK version).
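
The difference between the two options can be shown with a short Python comparison. This is a sketch of the two matching strategies, not the project's code; the helper names are hypothetical. The case-sensitive comparison follows the test results reported earlier in the thread, where Contoso-intranet was ignored.

```python
from typing import Sequence

FILTER_VALUES = ["contoso-intranet", "contoso-intranet-2"]


def contains_match(header_value: str, filter_values: Sequence[str]) -> bool:
    """True when any configured value is a substring of the header value."""
    return any(value in header_value for value in filter_values)


def exact_match(header_value: str, filter_values: Sequence[str]) -> bool:
    """True only when the header value equals a configured value."""
    return header_value in filter_values


for candidate in ["contoso-intranet-21", " contoso-intranet", "Contoso-intranet"]:
    print(repr(candidate), contains_match(candidate, FILTER_VALUES),
          exact_match(candidate, FILTER_VALUES))
```

Because contoso-intranet is a substring of contoso-intranet-21, the first filter entry alone already matches it under a contains check, so the second x-app entry adds nothing in that test.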

Regarding the $null value, I think the failure is expected because there's no way to include a $null value on a header, so my gut feeling tells me that PowerShell is dropping the header from the request before sending it to the API.

MChez commented 2 months ago

That all makes sense to me; I agree with you on both counts.

contains is the right way to go. My tests would have been better if I completely changed the second x-app filter entry. Something like this:

  "filterByHeaders": [
    // request must contain the x-app: contoso-intranet header
    {
      "name": "x-app",
      "value": "contoso-intranet"
    },  
    ...
    // or x-app: tailspin-api header
    {
      "name": "x-app",
      "value": "tailspin-api"
    }
  ],

waldekmastykarz commented 2 months ago

Thank you for confirming. We'll ship this in public preview in the next few days. Thank you for working with us!