Closed noamross closed 6 months ago
I note raw GitHub downloads have ETags (e.g., https://raw.githubusercontent.com/codemeta/codemeta/66284de845a413414be98e63d8eeee3569619de2/codemeta.json has etag: W/"77fe8b497c0a90667e5cf550884dc23a1bef2be9b9ae68d62442dad0cf6099a0"), though they don't appear to be the commit hash or the object blob hash but something else.
The Last-Modified response header would also be useful and could be derived from Dolt data. It's at least as commonly used for update checks and is often a fallback for ETags.
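To illustrate how a client might use both validators together, here is a minimal sketch that builds the revalidation headers for a follow-up request from whatever a previous response supplied. The function name `conditional_headers` is hypothetical and not part of any DoltHub or GitHub tooling.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build revalidation headers from validators saved from an earlier response.

    Hypothetical helper for illustration only.
    """
    headers: Dict[str, str] = {}
    if etag:
        # Preferred validator: the server can answer 304 Not Modified
        # when the entity tag still matches.
        headers["If-None-Match"] = etag
    if last_modified:
        # Fallback validator: used by servers that do not evaluate
        # If-None-Match, or when no ETag was ever provided.
        headers["If-Modified-Since"] = last_modified
    return headers
```

Sending both headers is safe: under the HTTP specification a server that receives If-None-Match ignores If-Modified-Since, so the date only matters when the ETag is unavailable.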
Hi @noamross, I added the ETag for downloads.
For links that include a commit hash, the ETag is an encoded hash of that commit.
For links that include a branch name, the ETag is an encoded hash of the head commit of the branch.
In both cases, Cache-Control is set to immutable.
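As a rough sketch of the behavior described in this comment, the following shows how a server could derive the two headers from a commit hash and decide whether a conditional request can be answered with 304. The actual encoding DoltHub uses is not stated in this thread, so the SHA-256 step, the function names, and the header dictionary shape are all assumptions made for illustration.

```python
import hashlib
from typing import Dict


def download_cache_headers(commit_hash: str) -> Dict[str, str]:
    """Derive cache headers from the commit a download resolves to.

    ASSUMPTION: the "encoded hash" is modeled here as a SHA-256 hex digest;
    the real DoltHub encoding is not documented in this thread.
    """
    digest = hashlib.sha256(commit_hash.encode("utf-8")).hexdigest()
    return {
        "ETag": f'"{digest}"',          # entity tags are quoted strings
        "Cache-Control": "immutable",   # as described in the comment above
    }


def is_not_modified(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison of an If-None-Match value against the current ETag.

    Splitting on commas is a simplification that is fine for hex digests
    but would mis-handle entity tags that themselves contain commas.
    """
    def opaque(tag: str) -> str:
        tag = tag.strip()
        # Weak comparison ignores the W/ prefix.
        return tag[2:] if tag.startswith("W/") else tag

    if if_none_match.strip() == "*":
        return True
    return opaque(current_etag) in [opaque(t) for t in if_none_match.split(",")]
```

For a branch link, the same function would be called with the branch's current head commit, so the ETag changes whenever the branch advances.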
This is great, thanks @liuliu-dev!
Is your feature request related to a problem? Please describe.
Our build system, like many others, uses the common HTTP response header ETag to check whether data has been updated, and HEAD requests to decide whether to download data. DoltHub download responses (e.g., https://www.dolthub.com/csv/ecohealthalliance/wahisdb/main/people_role_relation) don't include this or any other cache information in their headers.

Describe the solution you'd like
It would be excellent if CSV/ZIP/Excel and other direct downloads from repositories had HTTP response headers with version information in them. It makes natural sense for the ETag value to be the commit hash of what is pulled. (While a table might be the same between commits, I'm unsure if a table-level hash is a concept in Dolt like an object/blob hash is in Git.) This is most important for links that reference the HEAD or a branch of a database, but it seems it can apply to everything.

It also may be a good idea, for links that directly reference a commit hash (e.g. https://www.dolthub.com/csv/ecohealthalliance/wahisdb/usnlgpjqtcgl1fpd0g13mls07p155hga), to set Cache-Control headers like immutable.

Describe alternatives you've considered
In our own work we have workarounds like making an API request to get the commit hash of the HEAD or a branch and triggering a download on update.
Another possibility would be for links to HEAD or a branch or tag to act as redirects to the fixed commit-based URL, but an ETag seems simpler and is a standard that applies across sites.
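For context, the kind of check a build system performs could look roughly like the sketch below: a HEAD request reads the ETag, and the download is triggered only when it differs from the cached value. The helper names `fetch_etag` and `needs_download` are hypothetical, and the sketch assumes the server returns an ETag header on HEAD responses.

```python
import urllib.request
from typing import Optional


def fetch_etag(url: str, timeout: float = 30.0) -> Optional[str]:
    """Send a HEAD request and return the ETag header, or None if absent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("ETag")


def needs_download(cached_etag: Optional[str], remote_etag: Optional[str]) -> bool:
    """Decide whether to re-download based on the cached and remote ETags.

    With no validator on either side there is nothing to compare,
    so the conservative choice is to download.
    """
    if cached_etag is None or remote_etag is None:
        return True
    return cached_etag != remote_etag
```

A build step would call `needs_download(saved_etag, fetch_etag(url))` and store the new ETag alongside the downloaded file after a successful fetch.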