raviqqe / muffet

Fast website link checker in Go
MIT License

Anchors / fragments not rendered by JavaScript not found #254

Closed slonka closed 1 year ago

slonka commented 1 year ago

Hi :)

I know there have been a couple of issues where anchors manipulated by JavaScript are not found (I understand why that happens and that nothing can be done about it). However, I recently stumbled upon a website with an anchor that is not manipulated by JavaScript, and it is still reported as "not found".

Steps to reproduce:

$ ~/bin/muffet --version
2.6.1

$ ~/bin/muffet --one-page-only https://deploy-preview-1034--kuma.netlify.app/docs/1.8.x/explore/gateway-api/
https://deploy-preview-1034--kuma.netlify.app/docs/1.8.x/explore/gateway-api/
        error when reading response headers: small read buffer. Increase ReadBufferSize. Buffer size=4096, contents: "HTTP/1.1 200 OK\r\nDate: Mon, 17 Oct 2022 12:37:02 GMT\r\nPerf: 7626143928\r\nExpiry: Tue, 31 Mar 1981 05:00:00 GMT\r\nPragma: no-cache\r\nServer: tsa_o\r\nSet-Cookie: guest_id=v1%3A166601022209834356; Max-Age=34"..."rt?a=O5RXE%3D%3D%3D&ro=false\r\nStrict-Transport-Security: max-age=631138519\r\nCross-Origin-Opener-Policy: same-origin-allow-popups\r\nCross-Origin-Embedder-Policy: unsafe-none\r\nX-Response-Time: 135\r\nx-con"        https://twitter.com/KumaMesh
        id #install-experimental-channel not found      https://gateway-api.sigs.k8s.io/guides/getting-started/#install-experimental-channel

Not sure if the Twitter one is relevant; it does not show up on our CI.

Link to CI logs.

And proof that the anchor is there in Chrome with JavaScript disabled.

image
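
For readers who want to check this without a browser, the following is a minimal sketch in Python, not muffet's actual implementation, of testing whether a URL's fragment names an anchor in the static HTML. The `AnchorCollector` and `fragment_exists` names are hypothetical, and the `page` string is a made-up stand-in for the server's response, not the real page content. It assumes an anchor is either an element with a matching `id` or an `<a>` with a matching `name`.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urldefrag


class AnchorCollector(HTMLParser):
    """Collects every id attribute and every <a name=...> in static HTML."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags via the default handle_startendtag.
        for name, value in attrs:
            if value is None:
                continue
            if name == "id" or (tag == "a" and name == "name"):
                self.anchors.add(value)


def fragment_exists(html_text, url):
    """Return True if the fragment of `url` names an anchor in `html_text`."""
    fragment = unquote(urldefrag(url).fragment)
    if not fragment:
        return True  # no fragment, so there is nothing to check
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return fragment in collector.anchors


# Hypothetical stand-in for the static HTML returned by the server.
page = '<h2 id="install-experimental-channel">Install Experimental Channel</h2>'
url = "https://gateway-api.sigs.k8s.io/guides/getting-started/#install-experimental-channel"
print(fragment_exists(page, url))
```

Because the parser never executes scripts, feeding it the raw response body gives roughly the same view of the page as a browser with JavaScript disabled.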

Please let me know if this is something that could be fixed.

raviqqe commented 1 year ago

Hi, sorry for the late reply. Are you still experiencing the bug?

I actually don't get that error; the only error I see is from the Twitter URL, somehow.

> muffet --one-page-only https://deploy-preview-1034--kuma.netlify.app/docs/1.8.x/explore/gateway-api/
https://deploy-preview-1034--kuma.netlify.app/docs/1.8.x/explore/gateway-api/
        error when reading response headers: small read buffer. Increase ReadBufferSize. Buffer size=4096, contents: "HTTP/1.1 200 OK\r\nDate: Sun, 08 Jan 2023 02:58:04 GMT\r\nPerf: 7626143928\r\nExpiry: Tue, 31 Mar 1981 05:00:00 GMT\r\nPragma: no-cache\r\nServer: tsa_m\r\nSet-Cookie: guest_id_marketing=v1%3A167314668483110399; "..."om/recaptcha/ https://www.gstatic.com/recaptcha/ https://client-api.arkoselabs.com/ https://www.google-analytics.com https://twitter.com https://app.link https://accounts.google.com/gsi/client https:/"        https://twitter.com/KumaMesh

raviqqe commented 1 year ago

Closing for inactivity. Feel free to re-open it if you still have issues.