neomaclin / sbt-simple-url-update

MIT License

More robust URL replacement #4

Open jroper opened 10 years ago

jroper commented 10 years ago

If I have an asset, say images/logo.png, and then I have a URL on a different server that ends in the same path, http://example.com/images/logo.png, then the current implementation will fingerprint the example.com logo - not good.

A simple solution to this would be to only replace URLs in quotes (both single and double). In CSS, this has the limitation that unquoted URLs won't get replaced, but I think it's best practice to only use quoted URLs, so that's probably fine. In JS, I can't think of a place where you would have a URL that wasn't in a string by itself.

But then we have a problem: relative URLs are not correctly fingerprinted in CSS files, and it's very common to use relative URLs in CSS files, particularly when you're deploying to a CDN which might serve assets from a different path. In that case you have to use relative URLs, e.g. ../images/foo.png.

So my suggestion is that you properly parse, at the very least, the CSS to find URLs, then process those URLs to work out which local asset each one refers to. I've got some example regexps that I wrote some time ago here:

  val QuotedString = """(?:"[^"]*")|(?:'[^']*')"""

  val UrlRegex = ("""url\(\s*(""" + QuotedString + """|[^"'()\s\\]*)\s*\)""").r
  val ImportRegex = ("""@import\s+(""" + QuotedString + """)\s*;""").r

This will ensure that only URLs inside of url(...) or after an @import statement are matched. Another nice thing about this is that you only run two regular expressions over each file, rather than running one regexp for every asset. It does make replacement a bit harder: you need to go through and manage a StringBuilder, using the start/end methods on the match group to copy across the parts of the file in between. But it means you can easily handle relative URLs, including URLs that traverse up like ../images/foo.png, and also handle absolute path URLs correctly.
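As a rough sketch of the StringBuilder approach described above (not the plugin's actual code), the following runs the two regexps over a CSS file and rewrites only URLs that resolve to a known local asset. The names `CssUrlRewriter`, `fingerprintUrl`, `cssPath` and the `fingerprints` map (asset path relative to the asset root, mapped to its fingerprinted path) are assumptions made for illustration. It also assumes fingerprinting only changes the file name, and it leaves URLs with a query string or fragment untouched.

```scala
import java.net.URI
import scala.util.Try
import scala.util.matching.Regex

object CssUrlRewriter {
  // Regexps taken from the comment above.
  val QuotedString = """(?:"[^"]*")|(?:'[^']*')"""
  val UrlRegex: Regex = ("""url\(\s*(""" + QuotedString + """|[^"'()\s\\]*)\s*\)""").r
  val ImportRegex: Regex = ("""@import\s+(""" + QuotedString + """)\s*;""").r

  // Splits a matched token into its quote character ("" if unquoted) and the bare URL.
  private def unquote(token: String): (String, String) =
    if (token.length >= 2 && (token.startsWith("\"") || token.startsWith("'")))
      (token.take(1), token.substring(1, token.length - 1))
    else ("", token)

  // Hypothetical helper: resolves `url` against the CSS file's own path and,
  // if it names a known local asset, returns the URL with the file name
  // swapped for the fingerprinted one. URLs with a scheme or host
  // (http://..., //cdn/..., data:...) are skipped.
  def fingerprintUrl(url: String, cssPath: String,
                     fingerprints: Map[String, String]): Option[String] =
    if (url.isEmpty) None
    else for {
      resolved <- Try(URI.create("/" + cssPath).resolve(url)).toOption
      if resolved.getScheme == null && resolved.getRawAuthority == null
      if resolved.getRawQuery == null && resolved.getRawFragment == null
      path <- Option(resolved.getPath)
      hashed <- fingerprints.get(path.stripPrefix("/"))
    } yield url.substring(0, url.lastIndexOf('/') + 1) +
      hashed.substring(hashed.lastIndexOf('/') + 1)

  // One pass over the file: copy the text between matches unchanged and
  // replace only capture group 1 (the URL token) when it maps to an asset.
  private def rewriteWith(regex: Regex, css: String, cssPath: String,
                          fingerprints: Map[String, String]): String = {
    val out = new StringBuilder
    var last = 0
    for (m <- regex.findAllMatchIn(css)) {
      val (quote, url) = unquote(m.group(1))
      fingerprintUrl(url, cssPath, fingerprints).foreach { newUrl =>
        out.append(css.substring(last, m.start(1)))
        out.append(quote).append(newUrl).append(quote)
        last = m.end(1)
      }
    }
    out.append(css.substring(last))
    out.toString
  }

  def rewrite(css: String, cssPath: String,
              fingerprints: Map[String, String]): String =
    Seq(UrlRegex, ImportRegex).foldLeft(css) { (acc, regex) =>
      rewriteWith(regex, acc, cssPath, fingerprints)
    }
}
```

For a file at css/main.css, a call such as `CssUrlRewriter.rewrite(css, "css/main.css", Map("images/foo.png" -> "images/abc123-foo.png"))` would rewrite `url(../images/foo.png)` and `url('/images/foo.png')` but leave `url("http://example.com/images/foo.png")` alone, since that URL has a host.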

I don't think there's an equivalent solution for JavaScript, but at least in JavaScript you generally don't need to deal with relative URLs in the same way that you need to in CSS. It should be sufficient there to just replace quoted URLs.
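For the JavaScript case, a minimal sketch of "only replace quoted URLs" could look like the following. It replaces a string literal only when its whole content is a known asset path, so a literal such as 'http://example.com/images/logo.png' is left alone. `JsUrlRewriter` and the `fingerprints` map (asset path mapped to fingerprinted path) are hypothetical names, and this naive version ignores escaped quotes and comments.

```scala
import scala.util.matching.Regex

object JsUrlRewriter {
  // Same quoted-string pattern as in the CSS regexps above.
  val QuotedString: Regex = """(?:"[^"]*")|(?:'[^']*')""".r

  def rewrite(js: String, fingerprints: Map[String, String]): String =
    QuotedString.replaceAllIn(js, m => {
      val literal = m.matched
      val quote = literal.take(1)
      val body = literal.substring(1, literal.length - 1)
      // Keep a leading "/" so absolute path URLs stay absolute.
      val prefix = if (body.startsWith("/")) "/" else ""
      val replaced = fingerprints.get(body.stripPrefix("/")) match {
        case Some(hashed) => quote + prefix + hashed + quote
        case None         => literal // not an asset: leave the literal as-is
      }
      // Escape "$" and "\" so the text is inserted literally.
      Regex.quoteReplacement(replaced)
    })
}
```

Because the lookup is on the entire literal, this avoids fingerprinting a URL on another server that merely ends in the same path.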

neomaclin commented 10 years ago

Thanks for the suggestions. I have been away from this plugin for a few weeks, and it was only meant to be a quick hack for the problems our team had on hand. I will certainly come back and revisit this.