LGTM.com - URL sanitization inconsistency

github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security

MIT License


The two functions are treated differently because ://clbin.com/ is recognised as a probable URL, owing to the presence of .com (and various other parts). We do not recognise .st in the same way, so ://0x0.st is not matched as a URL. It would be recognised if it were instead written as something like http://0x0.st.
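To make the difference concrete, here is a minimal Python sketch of this kind of heuristic. It is not the query's actual implementation: the function name `looks_like_url`, the TLD list, and both regular expressions are assumptions chosen only to reproduce the behaviour described above.

```python
import re

# Hypothetical list of "well-known" top-level domains. The real query's
# list is not reproduced here; note that "st" is deliberately absent.
COMMON_TLDS = ("com", "org", "net", "edu")

# Assumption: an explicit http(s) scheme is enough to mark a string as URL-like.
SCHEME_RE = re.compile(r"^https?://", re.IGNORECASE)

# Assumption: otherwise, a host-like label followed by a common TLD is required.
TLD_RE = re.compile(
    r"[\w-]+\.(?:" + "|".join(COMMON_TLDS) + r")(?:[/:?#]|$)",
    re.IGNORECASE,
)


def looks_like_url(text: str) -> bool:
    """Return True if the string literal would be treated as a probable URL."""
    return bool(SCHEME_RE.search(text) or TLD_RE.search(text))


print(looks_like_url("://clbin.com/"))   # matched via the .com suffix
print(looks_like_url("://0x0.st"))       # no scheme, unrecognised TLD
print(looks_like_url("http://0x0.st"))   # matched via the http:// scheme
```

Under these assumptions, only the first and third strings are flagged, which mirrors why one function triggers the alert and the other does not.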

You can see the code responsible for these heuristics here: https://lgtm.com/query/rule:1507386916281/lang:python/

I think this counts as a false positive. Although the code is matching on URL-like strings, those strings are not being used to, for example, redirect to the given URL, so they are not really unsafe. We'll look into how to fix the query to eliminate these false positives.


LGTM.com - URL sanitization inconsistency #2360