Open raballew opened 1 year ago
Currently tmt {test|plan|story} lint
doesn't check the value of a key. @psss, if we fix this issue, we should valid the value of link
or keys supported by link
if it starts at http
or https
, right?
Yes, we could check all links which start with http
or https
and possibly also check url
key if fmf
identifier is used in the link. I agree that this should be an optional feature because some links can point to internal resources which might not be always available.
pre-commit
hook definitely sounds to me like a neat idea!
We should create one, so users can easily sanity check their repos with tmt
metadata.
Currently Core.lint_keys()
just checks the keys, it doesn't check the value of link
.
689 def lint_keys(self, additional_keys: List[str]) -> List[str]:
690 """ Return list of invalid keys used, empty when all good """
691 known_keys = additional_keys + self._keys()
692 return [key for key in self.node.get().keys() if key not in known_keys]
If we'd like to check the key link
, we can get its value first, e.g.
(Pdb) b tmt/base.py:691
Breakpoint 1 at /home/huanli/.virtualenvs/tmt/lib/python3.10/site-packages/tmt/base.py:691
(Pdb) c
/plans/foo
> /home/huanli/.virtualenvs/tmt/lib/python3.10/site-packages/tmt/base.py(691)lint_keys()
-> known_keys = additional_keys + self._keys()
(Pdb) l
686
687 return data
688
689 def lint_keys(self, additional_keys: List[str]) -> List[str]:
690 """ Return list of invalid keys used, empty when all good """
691 B-> known_keys = additional_keys + self._keys()
692 return [key for key in self.node.get().keys() if key not in known_keys]
693
694 def _lint_summary(self) -> bool:
695 """ Lint summary attribute """
696 # Summary is advised with a resonable length
(Pdb) pp self.node.get().keys()
dict_keys(['discover', 'execute', 'link', 'summary'])
(Pdb) pp self.node.get('link')
'https://github.com/teemtee/tmt/issues/461'
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Hi @raballew, what do you want to check the value (e.g. https://github.com/teemtee/tmt/issues/461
) exactly?
a) Try to access the URL?
b) Validate the URL like How to check whether a string is a valid HTTP URL?
@idorax I have not thought about b) but this might be actually the first step before addressing a). Keep in mind that links are not limited to URLs but more generically to URIs. So a link could reference a file on S3 or an FTP server. My initial thought was to actually trying to access the URI to verify that the content did not become stale.
@raballew, would you please provide some URL examples including both good URLs and stale URLs? Then I'll have a try to access the URLs via code, thanks!
@idorax I am not sure what you mean by this. A "bad" URL is anything that is reachable via HTTP/HTTPS and does not return an status code (4xx, 5xx) whereas a "good" URL return successful status codes (2xx).
Looking at the
tmt
spec, I can see that linking of metadata is a core concept . It is also possible to link to external URIs such as tickets in private issue management systems, files on a storage backend or issues on GitHub. These URIs can become stale over time and it seems logical to me, that there should be a way to check for this kind of dead links. An approach could be to extend thetmt test lint
,tmt plan lint
andtmt story lint
command to actually check the targets as well. Perhaps it makes sense to do the check even before committing changes using a pre-commit hook which could take care of that.It is worth keeping in mind, that some of the URIs can be hidden behind some kind authentication mechanisms, hence this should be an optional feature.