raviqqe / muffet

Fast website link checker in Go

unexpected failure on valid github markdown URL with anchor? #375

Closed by elevran 2 months ago

elevran commented 3 months ago

Muffet is unable to verify a valid link with an anchor on a Markdown file served directly from github.com.

I am using muffet (via ruzickap/action-my-broken-link-checker@v2) to check website links in a GitHub Action. The failure also occurs when running muffet directly from the command line (see below).

Is this link format valid for muffet? The GitHub servers return an HTML response (checked with curl), not a text/markdown response, and clicking the link below shows that the anchor is valid, so I would expect muffet to report the link as valid.

Sample muffet run:

$ muffet --buffer-size=65536 --max-connections=16 --rate-limit=16 --timeout=20 https://clusterlink.net
  https://clusterlink.net/docs/getting-started/users/
        id #clusterlink-crd not found   https://github.com/clusterlink-net/clusterlink/blob/main/design-proposals/project-deployment.md#clusterlink-crd

raviqqe commented 2 months ago

Do you think this is a duplicate of #356?

elevran commented 2 months ago

Most likely. The returned HTML contains the following:

<script type="application/json" data-target="react-app.embeddedData">{"payload":{"allShortcutsEnabled":false,"fileTree":{"design-proposals":{"items":[...,,{"name":"crd-based-management.md","path":"design-proposals/crd-based-management.md","contentType":"file"},...}}}</script>
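
To illustrate why a checker that only inspects the static HTML could report `id #clusterlink-crd not found` here, the following is a minimal Python sketch. It is not muffet's actual implementation. The helper `has_fragment_target`, both sample pages, and the `headings` key inside the embedded JSON are hypothetical and exist only for illustration. The sketch assumes the checker looks for an `id` attribute (or `<a name=...>`) in the markup and never executes scripts.

```python
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Collects every id attribute (and <a name=...>) found in static HTML."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id" or (tag == "a" and name == "name"):
                self.targets.add(value)


def has_fragment_target(html_text, fragment):
    """Return True if the static markup contains a target for the fragment.

    Hypothetical helper: it only sees attributes present in the markup,
    so anything a script would render later is invisible to it.
    """
    collector = _IdCollector()
    collector.feed(html_text)
    collector.close()
    return fragment.lstrip("#") in collector.targets


# Hypothetical page where the heading is part of the served HTML.
STATIC_PAGE = (
    '<html><body><h2 id="clusterlink-crd">ClusterLink CRD</h2></body></html>'
)

# Hypothetical page where the same name appears only inside embedded JSON
# (the "headings" key is invented for this example).
SCRIPT_ONLY_PAGE = (
    "<html><body>"
    '<script type="application/json" data-target="react-app.embeddedData">'
    '{"payload":{"headings":["clusterlink-crd"]}}'
    "</script></body></html>"
)

print(has_fragment_target(STATIC_PAGE, "#clusterlink-crd"))       # True
print(has_fragment_target(SCRIPT_ONLY_PAGE, "#clusterlink-crd"))  # False
```

In the second page the text `clusterlink-crd` is present in the response body, but only as script data, so a lookup over element attributes does not find it. A browser would still reach the anchor once the page's scripts have rendered the content.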