solo-io / envoy-gloo

Apache License 2.0
25 stars 12 forks source link

Extraction may not work correctly if the backing data is also modified during the transformation #367

Open andy-fong opened 3 days ago

andy-fong commented 3 days ago

Example gloo control plane config config that could trigger this issue:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: dev-vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - foo.com
    routes:
    - matchers:
      - regex: /ab/c.*
      options:
        stagedTransformations:
          early:
            requestTransforms:
            - requestTransformation:
                logRequestResponseInfo: true
                transformationTemplate:
                  advancedTemplates: true
                  extractors:
                    mypath:
                      header: ':path'
                      regex: ^\/ab\/c\/([^/]+)(.*)$
                      subgroup: 1
                    mypath2:
                      header: ':path'
                      regex: ^\/ab\/c\/([^/]+)(.*)$
                      subgroup: 2
                  headers:
                    host-header:
                      text: '{{ extraction("mypath") }}'
                    :path:
                      text: '{{ extraction("mypath2") }}'
        timeout: 31s

Explanation: Here we are trying to extract part of the path and put that into a header but also remove that part in the path. The extraction in this mode is stored in a string_veiw that requires the backing data (in this case, the :path header string) to be unchanged.

Workaround: 1) extracting information and modifying the same header within the same stage should be discourage. The order of what happens first may not be 100% guarantee in the headers modification phase. In this case, the evidence shows that :path get modified first before host-header is created even host-header comes first in the config. So, to work around this, it's best to extract the host information in the early stage and modify :path in the regular stage:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: dev-vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - foo.com
    routes:
    - matchers:
      - regex: /ab/c.*
      options:
        stagedTransformations:
          early:
            requestTransforms:
            - requestTransformation:
                logRequestResponseInfo: true
                transformationTemplate:
                  advancedTemplates: true
                  extractors:
                    mypath:
                      header: ':path'
                      regex: ^\/ab\/c\/([^/]+)(.*)$
                      subgroup: 1
                  headers:
                    host-header:
                      text: '{{ extraction("mypath") }}'
          regular:
            requestTransforms:
            - requestTransformation:
                logRequestResponseInfo: true
                transformationTemplate:
                  advancedTemplates: true
                  extractors:
                    mypath2:
                      header: ':path'
                      regex: ^\/ab\/c\/([^/]+)(.*)$
                      subgroup: 2
                  headers:
                    :path:
                      text: '{{ extraction("mypath2") }}'
        timeout: 31s

2) If the extraction has to be in the same stage, using REPLACE_ALL mode and replacing with subgroup might be able to work around this. The following is untested:

                    mipath:
                      header: :path
                      mode: REPLACE_ALL
                      regex: ^\/gw\/h\/([^/]+)(.*)$
                      replacementText: '$1'
                    mipath2:
                      header: :path
                      mode: REPLACE_ALL
                      regex: ^\/gw\/h\/([^/]+)(.*)$
                      replacementText: '$2'

Internal slack context: https://solo-io-corp.slack.com/archives/CEDCS8TAP/p1727447571096119

nfuden commented 3 days ago

Note that this affects all header and metadata transformers