aws / aws-sdk-go

AWS SDK for the Go programming language.
http://aws.amazon.com/sdk-for-go/
Apache License 2.0

CloudFront Invalidation Input does not escape paths properly. #223

Closed chpapa closed 9 years ago

chpapa commented 9 years ago

Say I pass the path "/hello world.html": the request does not escape the space in the URL properly, which results in this error.

"Error: InvalidArgument Your request contains one or more invalid invalidation paths."

lsegal commented 9 years ago

@chpapa thanks for opening this issue. This is expected behavior in the SDK and you will see this same error in the Ruby SDK, JS SDK, CLI, and others. You must ensure that your paths are properly URL encoded in the format that you want. Note that even the console UI does not allow spaces (a space delimits a new invalidation), and if you wanted to do this there you would also have to provide an encoded URL.

chpapa commented 9 years ago

Thanks. It does seem a bit odd to me that the SDK won't handle the escaping, but I will close this issue. Could you point me to a reference on how to implement the encoding properly? There seems to be nothing in the standard library that matches the encoding CF invalidation needs.

lsegal commented 9 years ago

@chpapa the SDK maps operations directly to the API in a 1:1 fashion so as to be as efficient and consistent with the service operation as possible. As such, we don't focus on making semantic manipulations of your data. If you believe that CloudFront should handle spaces in a specific way, that's something that the API itself should be handling, in other words, it should escape the space on its end. If you think that this feature is useful, I would suggest opening a thread on the Amazon CloudFront forums to request this functionality.

That said, the other reason we would not want to escape on your behalf is that the paths passed into CloudFront are considered literal strings of paths, not URI components, and they are not interpreted in any way. In other words, hitting CloudFront with the path http://../foo would generate a different cache entry from http://../foo? (note the question-mark) and would have to be invalidated as separate entries, even though semantically they are equivalent URLs. If our SDK were to interpret your path, there is a chance we may interpret incorrectly. In your case, a space can be encoded as "%20" or as "+", and both of these paths (a+b vs a%20b) are considered different URLs in CloudFront and must be invalidated separately.
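To make the "a+b vs a%20b" point concrete, here is a minimal Go sketch that produces both spellings of a path containing spaces, so that each can be submitted as its own invalidation path. The helper name `spaceVariants` is hypothetical and is not part of the SDK; it only touches spaces and leaves every other character as given.

```go
package main

import (
	"fmt"
	"strings"
)

// spaceVariants is a hypothetical helper (not an SDK function). It returns
// the two common encodings of a path that contains spaces: one using "%20"
// and one using "+". A path without spaces is returned unchanged.
func spaceVariants(path string) []string {
	if !strings.Contains(path, " ") {
		return []string{path}
	}
	return []string{
		strings.ReplaceAll(path, " ", "%20"),
		strings.ReplaceAll(path, " ", "+"),
	}
}

func main() {
	// Each printed line would be a separate invalidation path.
	for _, p := range spaceVariants("/a b") {
		fmt.Println(p)
	}
}
```

Whether you need one variant or both depends on which form your application actually serves.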

There seems to be nothing in the standard library that matches the encoding CF invalidation needs.

You can use something like url.ParseRequestURI or url.QueryEscape (playground link), but note that each interpretation may generate a different URL, and the one that you select to invalidate depends on how your application exposes these URLs. It's possible you may want to invalidate both formats.
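As a rough illustration of how two standard library interpretations differ, the sketch below encodes the same path with `url.QueryEscape` (named above) and with `URL.EscapedPath`, which is swapped in here as a path-oriented alternative and is an assumption of this example, not something recommended in the thread. The wrapper names `queryStyle` and `pathStyle` are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryStyle encodes p as a query component: "/" becomes "%2F" and a
// space becomes "+".
func queryStyle(p string) string {
	return url.QueryEscape(p)
}

// pathStyle encodes p as a URL path: "/" is kept and a space becomes "%20".
func pathStyle(p string) string {
	return (&url.URL{Path: p}).EscapedPath()
}

func main() {
	const p = "/hello world.html"
	fmt.Println(queryStyle(p)) // %2Fhello+world.html
	fmt.Println(pathStyle(p))  // /hello%20world.html
}
```

Neither function was written with CloudFront in mind, so it is worth checking how each treats characters you rely on (for example the `*` wildcard) before using the result as an invalidation path.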

FYI I found this documentation from CloudFront, which echoes the idea that encoding the URL is a consumer responsibility (doc link):

If the path includes non-ASCII characters or unsafe characters as defined in RFC 1738 (http://www.ietf.org/rfc/rfc1738.txt), URL-encode those characters. Do not URL-encode any other characters in the path, or CloudFront will not invalidate the old version of the updated object.
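One way to read that rule is sketched below in Go: percent-encode only control characters, non-ASCII bytes, and the characters RFC 1738 lists as unsafe, and leave everything else alone. The function name `encodeUnsafe` is hypothetical, and the sketch assumes the input is a raw, not yet encoded path. Because "%" is itself on the unsafe list, passing an already encoded path would double-encode it. This is an interpretation of the quoted documentation, not a confirmed description of what CloudFront accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// unsafeChars holds the printable ASCII characters that RFC 1738 calls unsafe.
const unsafeChars = " <>\"#%{}|\\^~[]`"

// encodeUnsafe is a hypothetical helper. It percent-encodes control
// characters, non-ASCII bytes, and RFC 1738 unsafe characters, and copies
// every other byte through unchanged (including "/" and "*").
// Assumption: path is raw, i.e. not already percent-encoded.
func encodeUnsafe(path string) string {
	var b strings.Builder
	for i := 0; i < len(path); i++ {
		c := path[i]
		if c < 0x20 || c >= 0x7F || strings.IndexByte(unsafeChars, c) >= 0 {
			fmt.Fprintf(&b, "%%%02X", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(encodeUnsafe("/hello world.html")) // /hello%20world.html
	fmt.Println(encodeUnsafe("/images/*"))         // /images/*
}
```

Working byte by byte means a multi-byte UTF-8 character is emitted as one escape per byte, which is the usual form for URL encoding.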

chpapa commented 9 years ago

Thanks a lot for the clarification @lsegal

haidargit commented 2 weeks ago

I had the same experience as above.

This reply is not related to any open issue, just some info. ✌🏻

If we want to invalidate specific file(s) whose path contains a space " ", we must escape (encode) that space character.

Example path to invalidate: /hello/world/earth/awesome day needs awesome coffee.zip

When invalidating with the AWS CLI, the invalidation path should look like this: /hello/world/earth/awesome%20day%20needs%20awesome%20coffee.zip

(AWS can modify or adjust its services in the future)