aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0

Implement a library function to decode strings with octal escape codes #2705

Closed ivankatliarchuk closed 4 months ago

ivankatliarchuk commented 5 months ago

Describe the feature

The current AWS SDK for Go lacks a utility method to conveniently decode octal escape sequences present in domain names retrieved from AWS services like Route 53. This can be cumbersome for developers who need to handle these characters manually, potentially leading to errors and inconsistencies.

Documentation for Route 53 domain names

We propose the addition of a utility method within the AWS SDK Go library that simplifies the process of decoding octal escape sequences within domain names. This method could be named something like DecodeOctalEscapedString or octalnormaliser and would accept a string containing the encoded domain name and return the decoded string with the original characters restored.

Benefits:

Open Questions:

Use Case

Current external DNS issue https://github.com/kubernetes-sigs/external-dns/pull/4582/files

At the moment, without this utility function, we need to write our own un-escape functionality, which may not always be up to date or the most efficient.

Proposed Solution

import (
    "regexp"
    "strconv"
    "strings"
)

// convertOctalToAscii replaces each three-digit octal escape (for example \052)
// with the byte it encodes; everything else is copied through unchanged.
func convertOctalToAscii(input string) string {
    if !containsOctalSequence(input) {
        return input
    }
    var result strings.Builder
    for i := 0; i < len(input); i++ {
        if input[i] == '\\' && i+3 < len(input) {
            octalStr := input[i+1 : i+4]
            // ParseUint rejects signed input such as "+12" and covers the full byte range.
            if octal, err := strconv.ParseUint(octalStr, 8, 8); err == nil {
                result.WriteByte(byte(octal))
                i += 3 // Skip the next 3 characters (the octal code)
            } else {
                result.WriteByte(input[i])
            }
        } else {
            result.WriteByte(input[i])
        }
    }
    return result.String()
}

var octalCharTable = map[string]byte{
    "041": byte(33), // ! exclamation mark
    "042": byte(34), // " double quote
    "043": byte(35), // # number sign
    "044": byte(36), // $ dollar
    "045": byte(37), // % percent
    "046": byte(38), // & ampersand
    "047": byte(39), // ' single quote
    "050": byte(40), // ( left parenthesis
    "051": byte(41), // ) right parenthesis
    "052": byte(42), // * asterisk
    "053": byte(43), // + plus
    "054": byte(44), // , comma
    "057": byte(47), // / forward slash
    "072": byte(58), // : colon
    "073": byte(59), // ; semicolon
    "074": byte(60), // < less than sign
    "075": byte(61), // = equal
    "076": byte(62), // > greater than sign
    "077": byte(63), // ? question mark
    "100": byte(64), // @ at symbol
    "133": byte(91), // [ left square bracket
    "134": byte(92), // \ backslash
    "135": byte(93), // ] right square bracket
    "136": byte(94), // ^ caret
    "140": byte(96), // ` backtick
    "173": byte(123), // { left curly brace
    "174": byte(124), // | vertical bar
    "175": byte(125), // } right curly brace
    "176": byte(126), // ~ tilde
}

func convertOctalToAsciiWithHashTable(input string) string {
    if !containsOctalSequence(input) {
        return input
    }
    var result strings.Builder
    for i := 0; i < len(input); i++ {
        if input[i] == '\\' && i+3 < len(input) {
            octalStr := input[i+1 : i+4]
            if char, ok := octalCharTable[octalStr]; ok {
                result.WriteByte(char)
                i += 3 // Skip the next 3 characters (the octal code)
            } else {
                result.WriteByte(input[i]) // Unknown code: keep the backslash; the loop copies the following characters as-is
            }
        } else {
            result.WriteByte(input[i])
        }
    }
    return result.String()
}

// octalEscapeRegex matches a backslash followed by a three-digit octal code (\000 to \377).
// It is compiled once at package level rather than on every call.
var octalEscapeRegex = regexp.MustCompile(`\\[0-3][0-7]{2}`)

// containsOctalSequence reports whether the domain name contains at least one octal escape sequence.
func containsOctalSequence(domain string) bool {
    return octalEscapeRegex.MatchString(domain)
}

with tests

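import (
    "testing"

    // assert is assumed to be testify's assert package.
    "github.com/stretchr/testify/assert"
)

func TestConvertOctalToAscii(t *testing.T) {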
    tests := []struct {
        name     string
        input    string
        expected string
    }{
        {
            name:     "Characters escaped !\"#$%&'()*+,-/:;",
            input:    "txt-\\041\\042\\043\\044\\045\\046\\047\\050\\051\\052\\053\\054-\\057\\072\\073-test.example.com",
            expected: "txt-!\"#$%&'()*+,-/:;-test.example.com",
        },
        {
            name:     "Characters escaped <=>?@[\\]^_`{|}~",
            input:    "txt-\\074\\075\\076\\077\\100\\133\\134\\135\\136_\\140\\173\\174\\175\\176-test2.example.com",
            expected: "txt-<=>?@[\\]^_`{|}~-test2.example.com",
        },
        {
            name:     "No escaped characters in domain",
            input:    "txt-awesome-test3.example.com",
            expected: "txt-awesome-test3.example.com",
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            actual := convertOctalToAscii(tt.input)
            assert.Equal(t, tt.expected, actual)

            actualv := convertOctalToAsciiWithHashTable(tt.input)
            assert.Equal(t, tt.expected, actualv)
        })
    }
}
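
Both proposed decoders rely on a regex pre-check, and the second one on a hand-maintained table. For comparison, here is a minimal single-pass sketch that needs neither. The names `decodeOctalEscapes` and `isOctalDigit` are hypothetical and not part of the SDK. The sketch assumes an escape is always a backslash followed by exactly three octal digits in the range `\000` to `\377`, and it copies anything else through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// isOctalDigit reports whether b is an ASCII digit between '0' and '7'.
func isOctalDigit(b byte) bool { return b >= '0' && b <= '7' }

// decodeOctalEscapes (hypothetical name) replaces every \ooo escape with the
// byte it encodes. Incomplete or out-of-range sequences are left untouched.
func decodeOctalEscapes(s string) string {
	if !strings.Contains(s, `\`) {
		return s // fast path: nothing to decode
	}
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); i++ {
		// A valid escape needs three more bytes, and the first digit must be
		// 0-3 so that the value fits in a single byte.
		if s[i] == '\\' && i+3 < len(s) &&
			s[i+1] >= '0' && s[i+1] <= '3' &&
			isOctalDigit(s[i+2]) && isOctalDigit(s[i+3]) {
			b.WriteByte((s[i+1]-'0')<<6 | (s[i+2]-'0')<<3 | (s[i+3] - '0'))
			i += 3 // skip the three digits just consumed
			continue
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	fmt.Println(decodeOctalEscapes(`txt-\041\052-test.example.com`)) // txt-!*-test.example.com
	fmt.Println(decodeOctalEscapes(`trailing\05`))                   // trailing\05 (incomplete, unchanged)
}
```

Because the digits are checked inline, the sketch decodes any byte value rather than only the characters listed in `octalCharTable`; whether that is desirable depends on which characters the caller expects Route 53 to escape.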

Other Information

No response

Acknowledgements

AWS Go SDK V2 Module Versions Used

github.com/aws/aws-sdk-go-v2 v1.30.1

External DNS is still on v1. v1 does not accept feature requests, so this sounds like the right place to ask for it.

Go version used

1.22

ivankatliarchuk commented 5 months ago

At a high level: in AWS Route 53 I can create TXT records such as

ivan-!"#$%&'()*+,-/:;-test.example.com
ivan-<=>?@[\]^_`{|}~test2.example.com

And the result from the library is

Name: ivan-\\041\\042\\043\\044\\045\\046\\047\\050\\051\\052\\053\\054-\\057\\072\\073-test.example.com
Name: ivan-\\074\\075\\076\\077\\100\\133\\134\\135\\136_\\140\\173\\174\\175\\176test2.example.com

It would be great if the AWS SDK's ListResourceRecordSets had an option to return a normalised version too, though that is probably not something that could be accepted. Please consider at least a utility function.

lucix-aws commented 4 months ago

Doesn't strconv.Unquote do this?

package main

import (
    "fmt"
    "strconv"
)

func main() {
    v := "txt-\\041\\042\\043\\044\\045\\046\\047\\050\\051\\052\\053\\054-\\057\\072\\073-test.example.com"
    fmt.Println(strconv.Unquote("\"" + v + "\""))
}

Output:

txt-!"#$%&'()*+,-/:;-test.example.com <nil>

We're generally very against one-off string utilities like this; there's no real place for them.
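
For callers who adopt the `strconv.Unquote` approach, a small wrapper may be worth considering, because `Unquote` interprets every Go escape sequence (not only octal ones) and returns an error if the input contains a raw double quote or an unknown escape. The sketch below is one hedged way to handle that: `decodeName` is a hypothetical helper, not an SDK function, and falling back to the original string on error is an assumption about what a caller would want.

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeName (hypothetical helper) wraps the name in double quotes and lets
// strconv.Unquote resolve the \ooo escapes. If the name is not a valid Go
// string literal body, the original value is returned unchanged.
func decodeName(name string) string {
	decoded, err := strconv.Unquote(`"` + name + `"`)
	if err != nil {
		return name // assumption: prefer the raw name over failing
	}
	return decoded
}

func main() {
	fmt.Println(decodeName(`\052.example.com`))  // *.example.com
	fmt.Println(decodeName(`plain.example.com`)) // plain.example.com
	fmt.Println(decodeName(`bad\q.example.com`)) // bad\q.example.com (unknown escape, unchanged)
}
```

The fallback keeps the call site free of error handling, at the cost of silently passing through names that could not be decoded; returning the error instead is an equally reasonable design.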

ivankatliarchuk commented 4 months ago

Thank you. Currently testing this

github-actions[bot] commented 4 months ago

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.