evoldoers / biomake

GNU-Make-like utility for managing builds and complex workflows
BSD 3-Clause "New" or "Revised" License

Use If-Modified-Since & Content-MD5 HTTP headers to examine remote dependencies #62

Closed (ihh closed this issue 6 years ago)

ihh commented 6 years ago

In thinking about using biomake to build a continuous-integration data aggregation pipeline (@cmungall), I'm imagining a model for database adapters that basically gives you a macro of the form:

BIOMAKE_CURL(downloaded_file_path,url_of_file)

At a crude level this can just be imagined as expanding into the following Makefile recipe (indeed, one could retain legacy compatibility if using this with GNU Make by defining BIOMAKE_CURL to expand into something like this):

downloaded_file_path:
    curl --output $@ url_of_file
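
To make the crude expansion concrete, here is a minimal Python sketch that renders a macro call as the plain Makefile rule shown above. The helper name `expand_biomake_curl` and the example path and URL are hypothetical, used only for illustration; this is not how biomake or GNU Make implements macros.

```python
def expand_biomake_curl(downloaded_file_path: str, url_of_file: str) -> str:
    """Render BIOMAKE_CURL(path, url) as a plain Makefile rule (hypothetical helper)."""
    # Makefile recipe lines must begin with a tab character, hence "\t".
    return f"{downloaded_file_path}:\n\tcurl --output $@ {url_of_file}\n"


if __name__ == "__main__":
    # Example path and URL are placeholders, not real resources.
    print(expand_biomake_curl("data/genes.tsv", "https://example.org/genes.tsv"))
```

Because the rule has no prerequisites, plain Make would only ever run it when the target file is missing, which is the limitation the rest of this issue is about.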

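However, behind the scenes, biomake will attempt to propagate the dependency check across the connection, either by sending an If-Modified-Since header (a timestamp-based check) or by comparing the Content-MD5 header against the hash of the local file (an MD5-based check).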

The Content-MD5 idea is a bit dicey because it may not be well-supported (e.g. Apache can do it but only by computing the MD5 hash every time; it doesn't cache it). We could pretty easily whip up a node-express plugin that would cache the hash, I expect.
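
As a rough illustration of what the client side of these two checks might look like, the Python sketch below builds an If-Modified-Since header from a local file's modification time, computes a Content-MD5 value for comparison, and decides whether the target needs refetching. All function names here are hypothetical, and this is a sketch under those assumptions rather than anything biomake actually does.

```python
import base64
import hashlib
from email.utils import formatdate


def if_modified_since(local_mtime: float) -> dict:
    """Build the conditional-request header from a local mtime (seconds since epoch)."""
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    return {"If-Modified-Since": formatdate(local_mtime, usegmt=True)}


def content_md5(data: bytes) -> str:
    """Content-MD5 value for a body: base64 of the raw 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def needs_refetch(status_code: int, local_md5=None, remote_md5=None) -> bool:
    """Decide whether the local target is out of date (hypothetical helper)."""
    if status_code == 304:
        # The server reports no change since the timestamp we sent.
        return False
    if local_md5 is not None and remote_md5 is not None:
        # Hash-based check: refetch only if the content actually differs.
        return local_md5 != remote_md5
    # No usable evidence either way, so be conservative and refetch.
    return True


if __name__ == "__main__":
    print(if_modified_since(0))
    print(needs_refetch(200, content_md5(b"old"), content_md5(b"new")))
```

With curl itself, `--time-cond <file>` sends the same If-Modified-Since header based on that file's modification time, so the timestamp variant may not need any custom client code.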

ihh commented 6 years ago

Maybe you'd also offer a three-argument form

BIOMAKE_CURL_OPTS(curl_opts,downloaded_file_path,url_of_file)

equivalent to

downloaded_file_path:
    curl curl_opts --output $@ url_of_file

so you could do things like follow redirects, use authentication, etc.
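
A small Python sketch of how the extra options might be spliced into the command line. The helper name `curl_command` and the example credentials and URL are hypothetical; `--location` (follow redirects) and `--user` (authentication) are standard curl options.

```python
import shlex


def curl_command(curl_opts: str, downloaded_file_path: str, url_of_file: str) -> list:
    """Build the argv for: curl curl_opts --output downloaded_file_path url_of_file."""
    # shlex.split honours shell quoting, so '--user "me:secret pw"' stays two arguments.
    return ["curl", *shlex.split(curl_opts), "--output", downloaded_file_path, url_of_file]


if __name__ == "__main__":
    # Placeholder credentials and URL, for illustration only.
    argv = curl_command("--location --user alice:secret",
                        "genes.tsv", "https://example.org/genes.tsv")
    print(shlex.join(argv))
```

Passing an empty options string reduces this to the two-argument form, so the two macros could share one implementation.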