Closed Ki-er closed 11 months ago
Thanks for reporting this.
This is because an invalid URL was supplied in one of the homepage_url attributes:
requests.exceptions.InvalidURL: Invalid URL 'https:/stalw.art': No host supplied
The URL is missing a / after https: (it should be https://stalw.art). The easiest fix is, of course, to correct it.
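For illustration, here is a minimal sketch of why this URL has no host, using only the Python standard library. The has_host helper is hypothetical and is not part of hecat or requests. With a single slash, everything after the scheme is parsed as a path, so the host part is empty.

```python
from urllib.parse import urlparse


def has_host(url: str) -> bool:
    """Return True if the URL has both a scheme and a host part."""
    parts = urlparse(url)
    return bool(parts.scheme) and bool(parts.netloc)


# With a single slash, "stalw.art" ends up in the path, not in the host.
print(urlparse("https:/stalw.art").path)  # /stalw.art
print(has_host("https:/stalw.art"))       # False
print(has_host("https://stalw.art"))      # True
```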
This error was not detected early (during checks on the pull request https://github.com/awesome-selfhosted/awesome-selfhosted-data/pull/318) because URL checks are not run on Pull Requests in awesome-selfhosted-data (https://github.com/awesome-selfhosted/awesome-selfhosted-data/blob/master/.github/workflows/pull-request.yml). Adding automatic URL checks on Pull Requests would currently re-check all URLs in all data files, because hecat lacks the capability to work only on the files modified between two branches in a git repository. The checks would take very long, send lots of useless requests, and still cause the workflow to fail on URL check errors in unrelated files.
I originally reported this in https://github.com/nodiscc/hecat/issues/77, then closed it because it was not apparent what problem it would solve. But now I think we have a prime example.
The url_check module should be able to only check URLs in files that were changed between two branches.
GitHub Actions provides the environment variables GITHUB_BASE_REF and GITHUB_HEAD_REF, which could be passed to the url_check module (e.g. URL_CHECK_BASE_REF=$GITHUB_BASE_REF HEAD_REF=$GITHUB_HEAD_REF hecat --config .hecat/url-check.yml).
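A rough sketch of how the list of changed files could be obtained from two refs, assuming git is available and both refs exist in the local clone. The function names and the .yml/.yaml filter are hypothetical; nothing like this exists in hecat yet.

```python
import subprocess


def diff_command(base_ref: str, head_ref: str) -> list[str]:
    """Build the git command listing files changed on head_ref since it forked from base_ref."""
    # ACMR keeps added, copied, modified and renamed files, and skips deleted ones.
    return ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base_ref}...{head_ref}"]


def parse_changed_files(output: str, suffixes: tuple[str, ...] = (".yml", ".yaml")) -> list[str]:
    """Keep only the data files (assumed here to be YAML) from the git output."""
    return [line for line in output.splitlines() if line.endswith(suffixes)]


def changed_files(base_ref: str, head_ref: str) -> list[str]:
    """Run git and return the changed data files between the two refs."""
    result = subprocess.run(
        diff_command(base_ref, head_ref),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails, e.g. on an unknown ref
    )
    return parse_changed_files(result.stdout)
```

In a workflow, the two arguments would come from os.environ["GITHUB_BASE_REF"] and os.environ["GITHUB_HEAD_REF"]. Depending on how the repository is checked out, the base branch may need to be fetched first and referenced as origin/<branch>.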
https://github.com/nodiscc/hecat/issues/77 reopened to track this possible enhancement.
URL checks on awesome-selfhosted-data fixed in https://github.com/awesome-selfhosted/awesome-selfhosted-data/pull/415
Thanks again
See also https://github.com/nodiscc/hecat/issues/127 (the program should just report this InvalidURL as an error, and carry on with other checks)
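The "report and carry on" behavior could look roughly like the sketch below. This is not hecat's actual code: validate and check_all are hypothetical names, and ValueError is used as a stand-in for the exception raised by the real HTTP check so that the example runs with the standard library only.

```python
from urllib.parse import urlparse


def validate(url: str) -> None:
    """Stand-in for the real URL check: raise ValueError if the URL has no host."""
    if not urlparse(url).netloc:
        raise ValueError(f"Invalid URL {url!r}: No host supplied")


def check_all(urls, check=validate):
    """Check every URL, collecting errors instead of stopping at the first one."""
    errors = []
    for url in urls:
        try:
            check(url)
        except ValueError as exc:
            # Record the failure and continue with the remaining URLs.
            errors.append((url, str(exc)))
    return errors


errors = check_all(["https:/stalw.art", "https://stalw.art"])
for url, message in errors:
    print(f"ERROR: {message}")
```

The caller can then decide to fail the run when the returned list is not empty, after all URLs have been checked.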
ASH link check seems to have died. The maintained project is alive though.