Simon-Initiative / course-digest

Tool to produce a summary or digest of OLI course package contents
MIT License
2 stars 0 forks source link

[BUGFIX] URL mismatch when special chars in filenames [MER-2514] #191

Closed andersweinstein closed 1 year ago

andersweinstein commented 1 year ago

If a source filename has special characters like the combining tilde used in Spanish, migration tool will generate and record a URL-encoded URL for it. For example, señora.jpg => sen%CC%83ora.jpg where CC 83 is the UTF-8 encoding of the combining tilde. However, the Amazon S3 API apparently takes a non-URL-encoded "key" which may be any UTF-8 string. Including special characters may cause trouble for client applications, but is allowed in an S3 key.

The upload tool was generating a key from the encoded URL. The result in the above case is that the URL needed to access the uploaded file would be sen%25CC%2583ora.jpg to match the literal percent characters in the S3 key (and this was the value returned from the API as the data.location). So the generated URLs fail to match the resulting upload location of the file whenever characters requiring URL-encoding are part of the filename.

This changes the code to undo URL encoding when forming the S3 key.