urlstechie / urlchecker-action

:octocat: :link: GitHub action to extract and check URLs in code and documentation.
https://urlchecker-python.readthedocs.io
MIT License

Is it possible to run locally for debugging purposes? #110

Closed billbrod closed 4 months ago

billbrod commented 4 months ago

This is a very useful action, thanks for your work on it! I recently noticed that it's missing some dead links in my documentation (e.g., the word Metamer right under this header, source). The underlying problem is that some nblink files from nbsphinx-link aren't getting converted to the proper HTML links by sphinx. I can debug that part myself, but I'm not sure why the action isn't finding the dead links (it has previously found issues in my documentation, so I don't think it's a problem with my config).

Is it possible to run the action locally, so I don't have to go through GitHub Actions to figure out what's going on? I'm not familiar enough with how GitHub actions are written to know whether this is straightforward or not.

billbrod commented 4 months ago

(my issue was a simple path issue, since I had changed my directory structure, but I'm still unsure why urlchecker didn't flag this for me)

billbrod commented 4 months ago

To clarify, one of the links it should've flagged is https://plenoptic.readthedocs.io/en/latest/tutorials/06_Metamer.nblink, which leads to a readthedocs 404 page.

vsoch commented 4 months ago

> Is it possible to run the action locally, so I don't have to go through GitHub Actions to figure out what's going on? I'm not familiar enough with how GitHub actions are written to know whether this is straightforward or not.

Yes, absolutely. The underlying tool is here: https://github.com/urlstechie/urlchecker-python. If you can find the issue there, we can easily fix it here.

billbrod commented 4 months ago

Excellent, thanks! Do you want me to keep posting here if I find the issue, or open a new issue in the other repo?

vsoch commented 4 months ago

This issue is fine; we might as well keep the discussion in one spot. If you like, I can transfer the issue there, just let me know.

billbrod commented 4 months ago

Doesn't matter to me, happy to keep it here!

I'm guessing this has to do with the url regex that gets used? The relevant link that should be failing is <a class="reference external" href="tutorials/06_Metamer.nblink">, whereas if it were correct, sphinx would convert it to <a class="reference internal" href="tutorials/06_Metamer.html">. In either case, the link is pointing to a file in the sphinx build directory and the question is whether it exists or not.

This seems like it might not be in urlchecker's scope then -- it's not a url and internet issue but a "my sphinx build is behaving unexpectedly" issue. Does that seem right?

For my issue, it seems like I can hopefully configure sphinx to raise an error in this case.
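
For reference, the "does the target exist in the build directory" check described above could be sketched roughly as follows. This is only an illustration using the Python standard library, not something sphinx or urlchecker provides; the names `HrefCollector` and `find_missing_local_links` are made up for this example, and it assumes the build output is a directory of `.html` files.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_missing_local_links(build_dir):
    """Return (page, href) pairs whose relative target is not in build_dir."""
    build_dir = Path(build_dir)
    missing = []
    for page in sorted(build_dir.rglob("*.html")):
        collector = HrefCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in collector.hrefs:
            parsed = urlparse(href)
            # Skip absolute URLs (http, https, mailto, ...) and pure "#anchor" links.
            if parsed.scheme or parsed.netloc or not parsed.path:
                continue
            relative = unquote(parsed.path)
            if relative.startswith("/"):
                # Assumption: a leading slash means "relative to the build root".
                candidate = build_dir / relative.lstrip("/")
            else:
                candidate = page.parent / relative
            if not candidate.exists():
                missing.append((str(page.relative_to(build_dir)), href))
    return missing
```

Calling `find_missing_local_links` on the HTML output directory would list a link such as `tutorials/06_Metamer.nblink` if no file with that name was built, and a CI step could fail whenever the returned list is non-empty.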

vsoch commented 4 months ago

> This seems like it might not be in urlchecker's scope then -- it's not a url and internet issue but a "my sphinx build is behaving unexpectedly" issue. Does that seem right?

That is correct: we explicitly look for a fully rendered URL (http/https) that we can then check. We don't parse relative paths to local static files.
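
To illustrate why the relative link slips through, here is a minimal sketch. The pattern below is a simplified stand-in written for this example, not the regex urlchecker actually uses; it only shows that a matcher anchored on `http://` or `https://` has nothing to find in a relative `href`.

```python
import re

# Simplified stand-in pattern (an assumption, NOT urlchecker's real regex):
# match anything that starts with http:// or https:// up to whitespace,
# a quote, or an angle bracket.
ABSOLUTE_URL = re.compile(r"https?://[^\s\"'<>]+")


def find_absolute_urls(text):
    """Return every http/https URL found in the text."""
    return ABSOLUTE_URL.findall(text)


# The broken link from this thread is a relative path, so nothing matches.
broken = '<a class="reference external" href="tutorials/06_Metamer.nblink">'
# The same target written as a full URL is picked up and could be requested.
rendered = '<a href="https://plenoptic.readthedocs.io/en/latest/tutorials/06_Metamer.nblink">'

print(find_absolute_urls(broken))    # []
print(find_absolute_urls(rendered))  # one match: the full readthedocs URL
```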

billbrod commented 4 months ago

Makes sense! Thanks for your help and I'll see if I can find another way to get sphinx to raise an error here.

vsoch commented 4 months ago

A post-deploy suggestion (one that might actually make sense if you don't deploy frequently, since a link can still go 404 between deploys) is to run the check after downloading the HTML of the live site. If you can find a tool that downloads your rendered pages (or some subset of them), you could run a static checker on them in CI.
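
A rough sketch of that post-deploy idea, using only the Python standard library. The helper names (`http_status`, `find_dead_links`) are hypothetical, the `href` extraction is a deliberately naive regex, and the sketch assumes the server answers `HEAD` requests with a meaningful status code. Each link is resolved against the URL of the live page, so relative links like the one in this thread become full URLs that can be requested.

```python
import re
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Naive href extraction for illustration; captures the part before any "#fragment".
HREF = re.compile(r'href="([^"#]+)')


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request could not be made."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except urllib.error.URLError:
        return None


def find_dead_links(page_url, page_html, get_status=http_status):
    """Resolve every href against page_url and return those that look dead."""
    dead = []
    for href in HREF.findall(page_html):
        absolute = urljoin(page_url, href)
        if not absolute.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, and similar schemes
        status = get_status(absolute)
        if status is None or status >= 400:
            dead.append((absolute, status))
    return dead
```

The `get_status` parameter is injectable so the logic can be exercised without network access; in CI the default `http_status` would be used against the downloaded pages.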