Open eggyal opened 1 year ago
Hey @eggyal, I would definitely accept a PR for this. I like the idea of using HTTP headers, so I think that should be the first priority. It would also be nice to allow the user to directly specify the format, so if that's straightforward enough to do in the same PR, please go ahead. I'm not opposed to detection "magic" as a fallback as well... that could always be an optional feature of this crate.
I see that `cached-path` currently determines how to extract an archive according to its filename extension: https://github.com/epwalsh/rust-cached-path/blob/db8cafb061ec1ff561747026f5db4317bfbaff7d/src/archives.rs#L17-L23
The problem that I have is that some archives do not use the expected extension format (in my case, gzipped tarballs are using `.tgz` rather than `.tar.gz`). While this could be addressed by expanding/customising the extension list used by `cached-path`, perhaps it's also an opportunity to consider some alternative approaches:

- HTTP headers (`Content-Type` and `Content-Encoding`); or
- magic-number detection in the manner of the `file(1)` utility (there's also the magic and bindet crates: the former is a wrapper around the libmagic C library and the latter is not widely used, but both are possibly useful here).

Personally I feel that HTTP headers would be best (if available: obviously not the case for local resources), perhaps falling back to magic and/or file extensions if no other option is available.
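To make the proposed fallback order concrete, here is a minimal sketch using only the standard library. Every name in it (`ArchiveFormat`, `from_headers`, `from_magic`, `from_extension`, `detect`) is hypothetical and not part of the current `cached-path` API. It also assumes that a gzip payload is always a gzipped tarball, which a real implementation would want to verify.

```rust
/// Hypothetical set of supported formats (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArchiveFormat {
    TarGz,
    Zip,
}

/// First choice: HTTP headers, when the resource was fetched over HTTP.
pub fn from_headers(
    content_type: Option<&str>,
    content_encoding: Option<&str>,
) -> Option<ArchiveFormat> {
    // Drop parameters such as "; charset=binary" and normalise case.
    let mime = content_type?.split(';').next()?.trim().to_ascii_lowercase();
    let gzip_encoded =
        content_encoding.map_or(false, |e| e.trim().eq_ignore_ascii_case("gzip"));
    match mime.as_str() {
        // Assumption: a gzip body is a gzipped tarball.
        "application/gzip" | "application/x-gzip" => Some(ArchiveFormat::TarGz),
        "application/x-tar" if gzip_encoded => Some(ArchiveFormat::TarGz),
        "application/zip" => Some(ArchiveFormat::Zip),
        _ => None,
    }
}

/// Second choice: magic numbers at the start of the file.
pub fn from_magic(leading_bytes: &[u8]) -> Option<ArchiveFormat> {
    if leading_bytes.starts_with(&[0x1f, 0x8b]) {
        // gzip magic number
        Some(ArchiveFormat::TarGz)
    } else if leading_bytes.starts_with(b"PK\x03\x04") {
        // zip local file header signature
        Some(ArchiveFormat::Zip)
    } else {
        None
    }
}

/// Last resort: the filename extension, now including `.tgz`.
pub fn from_extension(resource: &str) -> Option<ArchiveFormat> {
    let lower = resource.to_ascii_lowercase();
    if lower.ends_with(".tar.gz") || lower.ends_with(".tgz") {
        Some(ArchiveFormat::TarGz)
    } else if lower.ends_with(".zip") {
        Some(ArchiveFormat::Zip)
    } else {
        None
    }
}

/// Try headers, then magic numbers, then the extension.
pub fn detect(
    content_type: Option<&str>,
    content_encoding: Option<&str>,
    leading_bytes: &[u8],
    resource: &str,
) -> Option<ArchiveFormat> {
    from_headers(content_type, content_encoding)
        .or_else(|| from_magic(leading_bytes))
        .or_else(|| from_extension(resource))
}
```

For a local resource, the caller would pass `None` for both headers, so detection falls through to the magic numbers and then to the extension.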
Happy to submit a PR with whatever approach you feel is most suitable for this library, even if only adding `.tgz` to the existing extension list?