aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0
2.66k stars 640 forks source link

Incorrect `A-z` range in `smithy.api#pattern` validations #2803

Open gdavison opened 1 month ago

gdavison commented 1 month ago

Acknowledgements

Describe the bug

A number of smithy.api#pattern validations in the codegen models have an incorrect A-z character range

Some examples:

https://github.com/aws/aws-sdk-go-v2/blob/e211ac0d542b6529cb7f81c05d230e5a2a6f91d6/codegen/sdk-codegen/aws-models/fsx.json#L2196-L2206

https://github.com/aws/aws-sdk-go-v2/blob/e211ac0d542b6529cb7f81c05d230e5a2a6f91d6/codegen/sdk-codegen/aws-models/fsx.json#L12375-L12384

https://github.com/aws/aws-sdk-go-v2/blob/e211ac0d542b6529cb7f81c05d230e5a2a6f91d6/codegen/sdk-codegen/aws-models/groundstation.json#L4144-L4153

Regression Issue

Expected Behavior

The range should be A-Z

Current Behavior

The range is A-z, which includes [, \, ], ^, _, and ` in addition to the upper- and lower-case letters

Reproduction Steps

Do a case-insensitive search for A-z in the repo

Possible Solution

No response

Additional Information/Context

No response

AWS Go SDK V2 Module Versions Used

Release 2024-09-24

Compiler and Version used

N/A

Operating System and version

N/A

RanVaknin commented 1 month ago

Hi @gdavison ,

Thanks for reaching out. Can you explain a little more about why you think the regex should be changed? I'm not too familiar with fsx, but at least for jsonString, special characters like [,],{ etc are expected. In other words, I need to understand why you think this regex is wrong in order to efficiently raise this to the service team on your behalf.

If I understand correctly, you are saying the regex is too "loose", meaning you are not blocked in any way, but rather trying to improve the API?

Thanks again, Ran~

gdavison commented 1 month ago

In the examples from FSx, the regexes are clearly typos, since the A-z is immediately followed or preceded by a-z. This is also the case in most of the other regexes findable by doing a case-sensitive search.

For the Ground Station JsonString, the regex already includes the characters [, ], and _. The regex also excludes a large set of valid JSON characters which can be included in string values (the JSON spec says that a string can contain Unicode characters between '0020' (space) and '10FFFF'). I'm not familiar with Ground Station, so perhaps this doesn't affect it in practice. Regardless, if the intent is to use A-z to include both upper- and lower-case letters, it should be A-Za-z instead.

This isn't a blocker for us at the moment, though I have come across a number of cases where we have created validations based on API documentation which has included this incorrect format. Since we currently create the validations manually, we have been able to catch these errors.

We are doing some planning around automatically generating code using the Smithy specifications. We will have to add code to correct the regexes, rather than relying on the specifications to be accurate.

bhavya2109sharma commented 1 month ago

Hello @gdavison

Thanks for the detailed explanation. I have reached out to the FSx (#P158594413) and Ground Station (#P158601699) Service Teams with the respective internal tickets.

I will update it here once I will get follow up on the request.

Thanks, Bhavya

gdavison commented 1 month ago

Hi @bhavya2109sharma, FSx and Ground Station were only a few examples. Here are the others:

https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/imagebuilder.json#L11847

Twice on the one line: https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/iottwinmaker.json#L3329

https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/nimble.json#L1715

https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/resiliencehub.json#L2435

Three times on the one line: https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/securitylake.json#L62

https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/timestream-influxdb.json#L1228

https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/timestream-influxdb.json#L1419

Three times on the one line: https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/workspaces.json#L35

Three times on the one line: https://github.com/aws/aws-sdk-go-v2/blob/be2f3e03a2d633228d04f560a87e5459655083e5/codegen/sdk-codegen/aws-models/workspaces.json#L8271

bhavya2109sharma commented 1 month ago

Hello @gdavison,

Thanks for pointing all these issues.

I have reached out to the following teams with respective tickets :

I will update it here once I will get follow up on the request.

Thanks, Bhavya