nodiscc / hecat

Generic automation tool around data stored as plaintext YAML files
GNU General Public License v3.0

ASH - URL Check Broken #126

Closed: Ki-er closed this issue 11 months ago

Ki-er commented 11 months ago

https://github.com/awesome-selfhosted/awesome-selfhosted-data/actions/runs/7093386497/job/19306662014#step:4:2210

ASH link check seems to have died. The maintained project is alive though.

nodiscc commented 11 months ago

Thanks for reporting this.

This is because an invalid URL was supplied in one of the homepage_url attributes:

requests.exceptions.InvalidURL: Invalid URL 'https:/stalw.art': No host supplied

The URL is missing a / after the scheme: it should read https://stalw.art. The easiest fix is, of course, to correct the URL.
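To illustrate why the single slash leaves the URL without a host, here is a minimal sketch using only the Python standard library. The helper name has_host is hypothetical and is not part of hecat or requests; it only shows that the parser sees an empty host for the malformed URL.

```python
from urllib.parse import urlsplit


def has_host(url):
    """Return True if url is an http(s) URL that includes a host part.

    Hypothetical helper for illustration only, not hecat's actual code.
    """
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# With a single slash, everything after "https:" is parsed as a path,
# so the host (netloc) is empty.
print(urlsplit("https:/stalw.art"))
print(has_host("https:/stalw.art"))
print(has_host("https://stalw.art"))
```

A check of this kind could flag the entry before any request is sent, but whether it belongs in hecat or in a separate lint step is a design choice this sketch does not settle.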

This error was not caught early, during the checks on the pull request (https://github.com/awesome-selfhosted/awesome-selfhosted-data/pull/318), because URL checks are not run on pull requests in awesome-selfhosted-data (https://github.com/awesome-selfhosted/awesome-selfhosted-data/blob/master/.github/workflows/pull-request.yml). Adding automatic URL checks on pull requests would currently re-check every URL in every data file, because hecat lacks the ability to work only on the files modified between two branches of a git repository. That would take very long and generate lots of useless requests, and URL check errors in other files would still cause the workflow to fail.

I originally reported this in https://github.com/nodiscc/hecat/issues/77, then closed it because it was not apparent what problem it would solve. But now I think we have a prime example.

The url_check module should be able to check only the URLs in files that were changed between two branches.

GitHub Actions provides the environment variables GITHUB_BASE_REF and GITHUB_HEAD_REF, which could be passed to the url_check module (e.g. URL_CHECK_BASE_REF=$GITHUB_BASE_REF URL_CHECK_HEAD_REF=$GITHUB_HEAD_REF hecat --config .hecat/url-check.yml)
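A rough sketch of how the list of changed files could be obtained from two refs is shown below. This is not hecat's implementation: the function names, the URL_CHECK_HEAD_REF variable name and the .yml/.yaml filter are assumptions made for illustration. It relies on git diff --name-only with the three-dot range syntax, which lists files changed on the head ref since it diverged from the base ref.

```python
import os
import subprocess


def diff_command(base_ref, head_ref):
    """Build the git command listing files added/copied/modified/renamed
    on head_ref since it diverged from base_ref (deleted files excluded)."""
    return ["git", "diff", "--name-only", "--diff-filter=ACMR",
            f"{base_ref}...{head_ref}"]


def parse_changed_files(output, suffixes=(".yml", ".yaml")):
    """Keep only non-empty lines that look like YAML data files."""
    lines = [line.strip() for line in output.splitlines()]
    return [line for line in lines if line.endswith(suffixes)]


def changed_data_files(base_ref, head_ref, repo_dir="."):
    """Run git in repo_dir and return the changed YAML files (hypothetical helper)."""
    result = subprocess.run(diff_command(base_ref, head_ref), cwd=repo_dir,
                            capture_output=True, text=True, check=True)
    return parse_changed_files(result.stdout)


if __name__ == "__main__":
    # Assumed variable names, following the proposal in this comment.
    base = os.environ.get("URL_CHECK_BASE_REF")
    head = os.environ.get("URL_CHECK_HEAD_REF")
    if base and head:
        print(changed_data_files(base, head))
```

In a GitHub Actions run, GITHUB_BASE_REF and GITHUB_HEAD_REF are bare branch names, so the refs would probably need to be fetched first and possibly prefixed (for example with origin/) before git can resolve them.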

nodiscc commented 11 months ago

https://github.com/nodiscc/hecat/issues/77 reopened to track this possible enhancement.

URL checks on awesome-selfhosted-data were fixed in https://github.com/awesome-selfhosted/awesome-selfhosted-data/pull/415

Thanks again

nodiscc commented 11 months ago

See also https://github.com/nodiscc/hecat/issues/127 (the program should just report this InvalidURL as an error, and carry on with other checks)
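The "report and carry on" behaviour could look roughly like the sketch below. It is a sketch under assumptions, not hecat's code: check_urls and fake_fetch are hypothetical names, and fetch stands in for whatever callable performs the HTTP request. In the real tool the exception of interest would be requests.exceptions.InvalidURL, as seen in the traceback above; catching Exception broadly here is a simplification.

```python
def check_urls(urls, fetch):
    """Call fetch(url) for every URL and collect failures instead of aborting.

    Returns a dict mapping each failing URL to a short error description.
    """
    errors = {}
    for url in urls:
        try:
            fetch(url)
        except Exception as exc:  # simplification: a real check would be narrower
            errors[url] = f"{type(exc).__name__}: {exc}"
    return errors


def fake_fetch(url):
    """Stand-in for a real HTTP request; rejects URLs without '://'."""
    if "://" not in url:
        raise ValueError("No host supplied")


if __name__ == "__main__":
    # The malformed URL is recorded as an error; the second URL is still checked.
    print(check_urls(["https:/stalw.art", "https://example.org"], fake_fetch))
```

Collecting the errors and failing the run once at the end would let a single bad entry be reported without hiding the results for the remaining URLs.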