Hi CJ!

I worked on a way to improve the safety of Sproutkit's domain whitelist. In chat, I suggested parsing each URL as a URL object and comparing the domain, but I didn't take into account that we're also allowing domains like Imgur with specific paths/images.
Your suggestion of storing strings in the whitelist and converting them to regular expressions seems like the right way to solve the problem, so that is the method I implemented here
Btw, feel free to close this PR if you want to implement the fix by yourself on stream :)
This PR just serves as a demo: the code is not super clean and the behavior can be refined
Detailed explanations:
Every URL in the whitelist is a string that gets converted to a regular expression. Slashes need no escaping when the pattern is passed to the RegExp constructor as a string, so we leave them as they are. Dots are very common in URLs and I don't think we'll need them as regexp wildcards, so we don't escape them in the whitelist either (the code does it for us). /!\ All other special characters are interpreted by the RegExp constructor
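To illustrate why the dots need escaping while the slashes don't, here is a small sketch. The host names are made up for the example and are not taken from the actual whitelist.

```javascript
// Slashes need no escaping when the pattern is given as a string.
const unescaped = new RegExp('^https://img.example.com/');
console.log(unescaped.test('https://img.example.com/cat.png')); // true

// An unescaped dot matches any character, so a lookalike host also matches.
console.log(unescaped.test('https://img-example.com/cat.png')); // true

// Escaping the dots makes them match a literal "." only.
const escaped = new RegExp('^https://img.example.com/'.replace(/\./g, '\\.'));
console.log(escaped.test('https://img-example.com/cat.png')); // false
console.log(escaped.test('https://img.example.com/cat.png')); // true
```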
In the code, the first step is to escape the dots
Then, we want to check if we need to append a trailing character to make the regexp safe
If the URL includes a path after the domain:
  If the path ends with /, append nothing -> we're allowing any image in that directory or its subdirectories
  If the path doesn't end with /, then it's probably a specific image/set of images that we want to allow, so we append $
If the URL just includes a domain name:
  Append a / so that URLs that merely start with the same string (a longer host name, for example) don't match
Note: All URLs must be full URLs (with protocol and ://)
Finally, we add a caret ^ at the beginning of the string and pass it to the RegExp constructor
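The steps above could be sketched roughly like this. The function name `whitelistEntryToRegExp` and the example URLs are hypothetical and only meant to illustrate the rules, this is not necessarily the exact code in the PR.

```javascript
// Hypothetical helper: converts one whitelist string into a RegExp.
function whitelistEntryToRegExp(entry) {
  const protocolEnd = entry.indexOf('://');
  if (protocolEnd === -1) {
    // Whitelist entries must be full URLs.
    throw new Error('Whitelist entry must start with <protocol>://: ' + entry);
  }

  // Step 1: escape the dots so they only match a literal ".".
  let pattern = entry.replace(/\./g, '\\.');

  // Step 2: decide which trailing character (if any) is needed.
  const afterProtocol = entry.slice(protocolEnd + 3);
  const hasPath = afterProtocol.includes('/');
  if (!hasPath) {
    // Domain only: add "/" so a longer host name cannot match.
    pattern += '/';
  } else if (!entry.endsWith('/')) {
    // Specific image: anchor the end of the string.
    pattern += '$';
  }
  // A path ending with "/" is left open on purpose (directory and subdirectories).

  // Step 3: anchor the start and build the RegExp.
  return new RegExp('^' + pattern);
}

const domainOnly = whitelistEntryToRegExp('https://example.com');
console.log(domainOnly.test('https://example.com/a.png'));          // true
console.log(domainOnly.test('https://example.com.evil.test/a.png')); // false
```

Keeping the entries as plain strings means that anything other than dots (for example `?` or `+`) is still treated as regexp syntax, as noted in the warning above.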
Recap of the rules for whitelist URLs:
URL must start with <protocol>://
URL path must end with / if we want to allow any image in a specific directory or its subdirectories
Note: Trailing / is not required if the URL is just <protocol>://<domain name> since / will be automatically added
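As an illustration of these rules, a whitelist could look something like the sketch below. The entries and the `hasProtocol` check are assumptions made for the example, not part of the existing code.

```javascript
// Hypothetical whitelist entries, one per rule.
const whitelist = [
  'https://example.com',             // domain only: "/" is added automatically
  'https://cdn.example.com/images/', // directory: any image below it is allowed
  'https://example.org/logo.png',    // specific image: only this exact URL
];

// Hypothetical sanity check for the first rule (<protocol>://).
function hasProtocol(entry) {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(entry);
}

console.log(whitelist.every(hasProtocol));        // true
console.log(hasProtocol('example.com/images/'));  // false
```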