kubernetes-sigs / azuredisk-csi-driver

Azure Disk CSI Driver
Apache License 2.0
147 stars 190 forks source link

Allow tag values containing commas and equal sign #2323

Closed dominiquehunziker closed 21 hours ago

dominiquehunziker commented 6 months ago

Currently, it is not possible to create tags with a value that contains , or = as this breaks the parsing logic for the tags parameter.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-retain
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_ZRS
  tags: 'tag-1="aGVsbG8=",tag-2="value-2, value-3"'
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

It would be great, if support for values with , and = can be added. For the previous (non functioning example) the desired result would be

{
  "tag-1": "aGVsbG8=",
  "tag-2": "value-2, value-3"
}

where tag-1 could be a base64 encoded value with = padding or tag-2 could be a comma separated list.

I'm uncertain what would be possible solutions as changing the parsing logic might change existing behavior. Possible options I can imagine are:

andyzhangx commented 6 months ago

this should be fixed by https://github.com/kubernetes-sigs/azuredisk-csi-driver/pull/2246 in v1.30.1, v1.29.5, what is your csi driver version? @dominiquehunziker

andyzhangx commented 6 months ago

nvm, that looks like k8s yaml config parsing issue

andyzhangx commented 6 months ago

what about this solution? if tags parameter only contains a base64 encoded string, it would do base64 decoding first, e.g. if it's tags: dmFsdWUtMiwgdmFsdWUtMwo=, the driver would decode as tags: "value-2, value-3"

dominiquehunziker commented 6 months ago

I'm not sure I follow. In your example the tags parameter now does not represent a key/value mapping (or it now only contains 2 keys without a value)?

The problem is not the encoding of the tags parameter (plain text vs. base64), but how the value is parsed. To my understanding if you first base64 decode the tags parameter you would then still apply the same old parsing logic, which would lead to the same result, no? Or did I misunderstand something?

The base64 I just picked up as an example where one might have a = character in the values. The tag value could also be something like 1 == 1. My primary interest is in allowing commas in the tag value.

andyzhangx commented 6 months ago

I think we could add a new parameter, e.g. base64Tags: dmFsdWUtMiwgdmFsdWUtMwo=, that should work?

dominiquehunziker commented 6 months ago

From my perspective the function which would have to be modified (if the existing field is used) is ConvertTagsToMap. What I'm looking for would translate into the testcase in TestConvertTagsToMap

        {
            desc: "new testcase",
            tags: "???",  // what needs to be the input?
            expectedOutput: map[string]string{
                "tag-1": "aGVsbG8=",
                "tag-2": "value-2, value-3",
            },
            expectedError: false,
        },

Naturally, a new parameter can be introduced in which case a new function with equivalent semantic (convert string to map) would be required. This could be the proposed parameter base64Tags, but I don't see what the benefit of base64 encoding is in the context of parsing the string to a map. Base64 encoding helps if there is binary data, but tag keys/values I would not expect to consist of binary data. Base64 encoding in the end would convert string to []byte which still would have to be somehow converted to map[string]string.

My proposal was to either enhance ConvertTagsToMap to allow escaping , and =, or allow to quote key/values. However, I would consider this a breaking change as this will change the behavior of the field.

Alternatively, a new parameter with a different parsing logic or a parameter which can be used to configure the parsing logic of the existing tags parameter, i.e., my idea with the tagsFormat could look like

func NewConvertTagsToMap(tags, tagFormat string) (map[string]string, error) {
    switch tagFormat {
        case "Yaml":
            return ConvertYamlTagsToMap(tags)
        default:
            return ConvertTagsToMap(tags)  // default to old parsing logic
    }
}

func ConvertYamlTagsToMap(tags string) (map[string]string, error) {
    m := make(map[string]string)
    err := yaml.Unmarshal([]byte(tags), m)
    return m, err
}
andyzhangx commented 5 months ago

@dominiquehunziker what about adding a new parameter tagValueDelimiter, by default, it's using comma to delimite, and user could self define the delimiter, I think that's more compatible

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-retain
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_ZRS
  tags: 'tag-1="aGVsbG8=";tag-2="value-2, value-3"'
  tagValueDelimiter: ";"
dominiquehunziker commented 5 months ago

That sounds like a good idea and should address my need for comma separated tag values.

Regarding =, do think splitting on the first = and leave other = in a key value pair would work? Meaning line 90-95 would become

    for _, tag := range s {
        kv := strings.SplitN(tag, TagKeyValueDelimiter, 2)
        if len(kv) != 2 {
            return nil, fmt.Errorf("Tags '%s' are invalid, the format should like: 'key1=value1,key2=value2'", tags)
        }
                ...

Or would you leave the logic as is? For my use-case I don't currently require = in tag values, therefore, I'm fine if this is not added.

andyzhangx commented 4 months ago

I think adding a new parameter tagValueDelimiter should be more graceful though the above trick may work. we will work on this feature in July, sorry for the delay since the whole azure team are working on security wave these two months.

andyzhangx commented 4 months ago

/kind feature

andyzhangx commented 3 months ago

fixed in https://github.com/kubernetes-sigs/azuredisk-csi-driver/releases/tag/v1.30.2

dominiquehunziker commented 1 week ago

@andyzhangx Thank your for the information. I had a chance to test the new feature. Unfortunately, it does not work and the error is:

invalid parameter tagValueDelimiter in storage class

IMO the issue is that all input parameters are first converted to lowercase in ParseDiskParameters before they are matched against the parameter names. In the case of the new parameter, however, the name is camelCase (here).

Shall I create a new issue for this?

andyzhangx commented 1 day ago

@dominiquehunziker wait, if you set tagValueDelimiter: ";", the lowercase or uppercase does not matter, right? following parameter settings should be working, right?

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-retain
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_ZRS
  tags: 'tag-1="aGVsbG8=";tag-2="value-2, value-3"'
  tagValueDelimiter: ";"
andyzhangx commented 1 day ago

nvm, I found the issue, will fix it in next patch version