dcastro closed this issue 2 years ago
For references that point to images, we will need special care. The current approach to verifying links (load the page with the `HEAD` method, then with `GET`) may turn out to be very heavyweight when it comes to resources.
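To make the "HEAD, then GET" approach concrete, here is a minimal Python sketch. It is not xrefcheck's actual code: `verify_link`, `request_status`, and the toy `NoHeadHandler` server are hypothetical names, the sketch handles plain HTTP only, and it assumes `GET` is tried only when `HEAD` comes back with an error status.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit


class NoHeadHandler(BaseHTTPRequestHandler):
    """Toy server for the demo: rejects HEAD, answers GET with a small body."""

    def do_HEAD(self):
        self.send_response(405)  # Method Not Allowed
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = b"fake image bytes"
        self.send_response(200)
        self.send_header("Content-Type", "image/png")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def request_status(method, url):
    """Send one request and return only the status code; the body is never read."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
    try:
        conn.request(method, parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def verify_link(url):
    """Return (method that settled the check, status code)."""
    status = request_status("HEAD", url)
    if status < 400:
        return "HEAD", status
    # Assumption: fall back to GET only when HEAD reports an error.
    return "GET", request_status("GET", url)


server = ThreadingHTTPServer(("127.0.0.1", 0), NoHeadHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = verify_link(f"http://127.0.0.1:{server.server_port}/image.png")
server.shutdown()
server.server_close()
print(result)  # ('GET', 200)
```

Against a server that refuses `HEAD`, each link costs two round trips, which is one of the costs of this scheme.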
Oh good point. We can hope that the server will send a chunked response, in which case we only need to wait for the first chunk to inspect the status code (and not look at the response's body at all).
We should probably also have a config/CLI option to disable checking images, in case the user's project happens to have a lot of links to heavy images.
Thinking about this a bit more - maybe they don't need special treatment (nor a flag to disable this check)?
I mean, from our point of view, an image link like `![alt text](https://localhost/image.png)` is the exact same thing as a regular link like `[alt text](https://localhost/image.png)`, and we don't treat those as special. In fact, a user can currently use a regular link to point to a 1GB binary file.
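A rough Python illustration of that point: once the URL is extracted, an image link and a regular link yield the same thing to verify. The regular expression is a deliberate simplification rather than a full Markdown parser, and `extract_links` is a hypothetical helper, not something from xrefcheck.

```python
import re

# Simplified pattern: an optional "!" prefix, then [text](url).
# Real Markdown has many more cases (titles, nested brackets, reference links).
LINK_RE = re.compile(r"(!?)\[([^\]]*)\]\(([^)\s]+)\)")


def extract_links(markdown):
    """Return (is_image, url) pairs for every inline link in the text."""
    return [(bool(m.group(1)), m.group(3)) for m in LINK_RE.finditer(markdown)]


doc = (
    "![alt text](https://localhost/image.png) and "
    "[alt text](https://localhost/image.png)"
)
links = extract_links(doc)

# The verifier only needs the set of URLs; the image flag plays no role.
urls_to_check = sorted({url for _, url in links})

print(links)
print(urls_to_check)  # ['https://localhost/image.png']
```

Both forms collapse into a single URL to check, so the only extra work for image support in this sketch is recognising the `!` prefix while parsing.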
Since we're on this topic, I also tried this:
[link](https://ftp.rnl.tecnico.ulisboa.pt/pub/ubuntu/releases/22.04.1/ubuntu-22.04.1-desktop-amd64.iso)
And xrefcheck returns almost immediately.
$ time xrefcheck
Configuration file not found, using default config for GitHub repositories
All repository links are valid.
xrefcheck 0.84s user 0.42s system 216% cpu 0.583 total
Oh, really. I initially had the impression that performing a request usually results in getting the entire response, but in fact most libraries provide the response as a stream rather than all at once. And `req` even allows discarding the response without fetching it, which is in fact what we currently do.
Nice observation.
Maybe we don't even need to try the request with the `HEAD` method first, then; that could be an oversight on my side too. The only purpose of this extra step was to avoid reading potentially large response bodies, but it also comes with obvious drawbacks.
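The streaming behaviour discussed above can be reproduced with a small Python sketch. It illustrates the general idea only, not how `req` works internally: a toy server (hypothetical `BigFileHandler`) advertises a 256 MiB body, and the client (hypothetical `status_without_body`) issues a plain `GET`, reads the status line and headers, and closes the connection without reading the body.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

CHUNK = b"\0" * 65536
TOTAL_CHUNKS = 4096  # 4096 * 64 KiB = 256 MiB advertised body
sent = {"bytes": 0}  # how much the server managed to write


class BigFileHandler(BaseHTTPRequestHandler):
    """Toy server that streams a large body in chunks."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(CHUNK) * TOTAL_CHUNKS))
        self.end_headers()
        try:
            for _ in range(TOTAL_CHUNKS):
                self.wfile.write(CHUNK)
                sent["bytes"] += len(CHUNK)
        except ConnectionError:
            pass  # the client hung up before the body was finished

    def log_message(self, *args):
        pass  # keep the demo output quiet


def status_without_body(host, port, path):
    """GET the resource but return as soon as the status line and headers arrive."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()  # parses status and headers only
        return response.status
    finally:
        conn.close()  # drops the connection; the body is never read


server = ThreadingHTTPServer(("127.0.0.1", 0), BigFileHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = status_without_body("127.0.0.1", server.server_port, "/big.iso")
server.shutdown()
server.server_close()

print(status)
print(sent["bytes"] < len(CHUNK) * TOTAL_CHUNKS)
```

Under these assumptions a single `GET` is enough to learn the status code, and the server stops sending once the client disconnects, which is consistent with the check on the Ubuntu ISO link finishing in under a second.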
Clarification and motivation
We should add support for image links.
Acceptance criteria
`xrefcheck` verifies whether image links are valid.