textlint-rule / textlint-rule-no-dead-link

textlint rule to check if all links are alive.

`checkRelative` is failing to validate relative links properly #110

Open amimas opened 5 years ago

amimas commented 5 years ago

I use a static site generator to build a static HTML site. My sources are all Markdown files.

All of the examples below are valid, but the textlint rule reports them as errors.

- Ensure that all requirements described in the [Requirements](requirements) page are met
- Ensure that all requirements described in the [Requirements](requirements.html) page are met
- Ensure that all requirements described in the [Requirements](overview#requirements) are met
- Ensure that all requirements described in the [Requirements](overview.html#requirements) are met

In each case listed above, there is a source file named requirements.md or overview.md. The static site generator creates requirements.html and overview.html once the site is built. However, it seems this textlint rule cannot perform this validation on the input sources, which are Markdown files.

My only workaround is to set checkRelative to false. I don't really like this option because it means I can only validate external links, yet internal/relative links may also be invalid due to user error.

azu commented 5 years ago

This result is expected behavior, because this rule cannot know the custom transform logic.

If you want to resolve the path, we need to add a new feature.

For example, add a resolver-extensions option and resolve extensionless paths with it.

{
  "no-dead-link": {
    "resolver-extensions": [".md", ".html"]
  }
}

With the option above, this rule would try to resolve requirements as requirements.md or requirements.html.

[Requirements](requirements)
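The proposed lookup could be sketched roughly as follows. This is only an illustration of the idea: `resolveWithExtensions` is a hypothetical helper, not part of textlint-rule-no-dead-link, and the `resolver-extensions` option itself is only a proposal in this thread.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper (an assumption, not the rule's actual code).
// Returns the first existing file for `link`, trying the path as written
// and then the path with each candidate extension appended, in order.
function resolveWithExtensions(baseDir, link, extensions) {
  const target = path.resolve(baseDir, link);
  const candidates = [target, ...extensions.map((ext) => target + ext)];
  const found = candidates.find(
    (p) => fs.existsSync(p) && fs.statSync(p).isFile()
  );
  return found || null;
}

// Usage: a temporary directory that contains only requirements.md
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "no-dead-link-"));
fs.writeFileSync(path.join(dir, "requirements.md"), "# Requirements\n");

const resolved = resolveWithExtensions(dir, "requirements", [".md", ".html"]);
const missing = resolveWithExtensions(dir, "overview", [".md", ".html"]);
```

Note that this sketch only appends extensions, so a link written as `requirements.html` would still not resolve to `requirements.md`; that case would need an additional mapping.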

This approach is similar to ESLint's eslint-plugin-import.

What do you think about it?

amimas commented 5 years ago

I think that's a reasonable suggestion. At least this will allow users to validate "internal" links. Taking the same approach as ESLint is probably another plus.

I'm not sure whether this will be able to properly validate links within a page (e.g. overview#requirements). We could probably do that by looking for a heading in the Markdown that matches the word requirements (all lower case). If the anchor contains dashes, such as system-requirements, the Markdown file will contain a line such as ## System Requirements. But this can get complicated because there are different ways of writing headings in Markdown (e.g. ## System Requirements ##).

What are your thoughts on this? Some editors (e.g. Visual Studio Code) can jump directly to those internal links, including the headings, when you click on them.
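A minimal sketch of that heading lookup, assuming GitHub-like slugs (lower case, punctuation dropped, spaces turned into dashes). The helper names are hypothetical, only ATX headings are handled, and fenced code blocks are not skipped, so this is an approximation rather than a complete implementation.

```javascript
// Hypothetical helpers for illustration only.

// Collect ATX heading text, tolerating an optional closing run of "#"
// such as "## System Requirements ##".
function headingTexts(markdown) {
  const texts = [];
  for (const line of markdown.split(/\r?\n/)) {
    const m = /^ {0,3}#{1,6}[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$/.exec(line);
    if (m) texts.push(m[1]);
  }
  return texts;
}

// Approximate a GitHub-style slug: lower-case the text, drop characters
// other than letters, digits, whitespace, "_" and "-", then turn each
// whitespace character into a dash.
function slugify(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s_-]/gu, "")
    .replace(/\s/g, "-");
}

// True when some heading in `markdown` produces the given fragment.
function hasAnchor(markdown, fragment) {
  return headingTexts(markdown).map(slugify).includes(fragment.toLowerCase());
}

const overview = "# Overview\n\n## System Requirements ##\n\n## Requirements\n";
const found = hasAnchor(overview, "system-requirements");
const notFound = hasAnchor(overview, "installation");
```

Duplicate headings, Setext headings, and generator-specific slug rules would all need extra handling, which is part of why this gets complicated.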

azu commented 5 years ago

> validate links within a page

I think this rule should not cover anchor links within a page, because these links are not dead links. In other words, such a link is not a 404.

Currently, this rule just removes the #fragment part and checks the rest.

https://github.com/textlint-rule/textlint-rule-no-dead-link/blob/3f9508c2de463c99ac257b9e68b7d20e75e3d8dc/src/no-dead-link.js#L123
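The behavior described above amounts to something like the following sketch. It is an assumption about the effect, not a copy of the linked source line, and `stripFragment` is a hypothetical name.

```javascript
// Hypothetical helper: drop everything from the first "#" onward,
// so only the path part of the link is checked for existence.
function stripFragment(uri) {
  const index = uri.indexOf("#");
  return index === -1 ? uri : uri.slice(0, index);
}

const withAnchor = stripFragment("overview.html#requirements");
const withoutAnchor = stripFragment("requirements");
```

Under this behavior, `overview.html#requirements` passes whenever `overview.html` exists, whether or not the page contains a matching heading.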

I recommend creating another textlint rule for that kind of validation.

amimas commented 5 years ago

I suppose another rule could be created for anchor link validation. I was thinking that this same rule could be enhanced with a configuration option for validating anchor links.

parthpp commented 5 years ago

@azu Would you be open to accepting a PR if I implement the anchor link validation within this rule?

azu commented 5 years ago

> I implement the anchor link validation within this rule?

Interesting, but I think it is hard to implement the right behavior. For example, Markdown does not support id or name attributes in its syntax. Of course, I know that some Markdown flavors support the # header {#id} syntax.

📝Note: GFM creates an id attribute from the header text, but this is not defined in the specification https://github.github.com/gfm/
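For the flavors that do support explicit ids, reading the `{#id}` suffix from a heading line might look like this sketch. The function name and the set of characters accepted in an id are assumptions; each flavor defines its own rules.

```javascript
// Hypothetical helper: return the explicit id from a heading such as
// "# header {#id}", or null when the line has no such suffix.
// The accepted id characters are an assumption for illustration.
function explicitHeadingId(line) {
  const m = /^ {0,3}#{1,6}[ \t]+.*?\{#([A-Za-z][\w:.-]*)\}[ \t]*#*[ \t]*$/.exec(line);
  return m ? m[1] : null;
}

const explicitId = explicitHeadingId("# header {#id}");
const noId = explicitHeadingId("# header");
```

A validator would have to decide per project whether explicit ids, generated ids, or both count as valid anchor targets.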

parthpp commented 5 years ago

> 📝Note: GFM creates an id attribute from the header text, but this is not defined in the specification https://github.github.com/gfm/

Yes, header text linking in GFM is implicit and not defined in the specification. But I would argue that large Markdown documents require anchor linking. In practice it is very difficult to interlink documents without anchors, especially in a large documentation base. Hence, even though this is not part of the specification, I think it might still be one of the most used features of GFM.