Munter / hyperlink

A node library and command line tool to test the integrity of your internal an external hyperlinks
231 stars 24 forks source link

Document dependency on 'file'/'libmagic1'? #197

Closed danielfdickinson closed 2 years ago

danielfdickinson commented 2 years ago

I'm making this note here because the only node tool I am using where this has been an issue is 'hyperlink', but the actual bug might belong to another repo; I'm hoping for some help figuring out where it belongs if not here.

Using a minimal Debian Bullseye container with Go from upstream, and Node installed via NVM (lts/gallium => 16.13.1) and using hyperlink on a site I get error messages along the lines of (ENOENT spawn failed) for .pdf, .tar.gz, .tgz, .patch, .diff, etc (basically not HTML / CSS /JS gets ENOENT) when checking links (internal or external).

The solution is to install the 'file' package (which pulls in 'libmagic1' which may be the only actual dependency) using apt. Then internal links all succeed and external checks are currently clean (though obviously that changes as external sites change).

What I'm not sure about is whether this is a 'normal' (and documented) Node dependency that the container (created using a Dockerfile from https://github.com/microsoft/vscode-dev-containers which uses their Bullseye Go image and adds Node via NVM) omits (in which case the bug goes against that repo), or whether is is something NVM should document/handle (in which case the bug goes there), or if this is peculiar to 'hyperlink', in which case it needs to be documented here.

Do you happen to know which is the case? In any event, since it shows up with 'hyperlink' and not other node tools I used, it wouldn't hurt (IMO) to document in the README here.

papandreou commented 2 years ago

Hmm, good shout. That probably comes in due to https://github.com/assetgraph/assetgraph/blob/master/lib/util/determineFileType.js, which is used to detect asset types when no Content-Type is available and the file extension doesn't provide any hints either. I wonder if we should just pull that out of assetgraph.

danielfdickinson commented 2 years ago

I think yes, removing from assetgraph makes sense, for three reasons. One is that the file extensions like .bz2, .tar.gz, .tgz, .patch, .diff, and .pdf should all have used the file extension logic and not 'file', so it looks like either the list of file extensions is missing a lot, or the logic that does file extensions with a fallback to 'file' is not quite right. The second reason is that I don't think the fallback should be used all that often. And third is that I think using 'file' via a subshell is kind of squirrelly for a NodeJS library and probably breaks on Windows (or is omitted which would indicate it's not really needed). Do you want me to open an issue on assetgraph with a pointer to this issue?

papandreou commented 2 years ago

I'll get it removed for the next major version: https://github.com/assetgraph/assetgraph/pull/1220

I love the word "squirrelly", haha :shipit:

Your profile picture looks like the trumpeter Randy Brecker, by the way!

danielfdickinson commented 2 years ago

Sounds good. Thank you. I'll have remember that emoticon too, and the comment about Randy Brecker is quite a compliment in my view 😉 Thanks again. It's always a pleasure working with you, especially so far it's when I've pointing out something that needs fixing. (I appreciate the tools you build by the way. Hyperlink is by far and away the best tool for the checking links that I've found).

danielfdickinson commented 2 years ago

Closing as it's been updated in assetgraph.