remarkjs / remark-validate-links

plugin to check that markdown links and images reference existing files and headings
https://remark.js.org
MIT License

Feature request: Ignore warnings for specific files #18

Open dialex opened 6 years ago

dialex commented 6 years ago

Like we can do for linting warnings/errors, can we ignore the warnings for a specific file?

wooorm commented 6 years ago

Why would you not want documents to be checked?

dialex commented 6 years ago

It was a compatibility issue with docsify. Never mind, I found a workaround.

Noah-Silvera commented 4 years ago

Our team has a use case for this feature - we dynamically generate a small portion of our markdown content, and want to link to it using markdown links from elsewhere in our documentation. However, all links to the dynamically generated markdown fail when remark-validate-links is run, even though they work at runtime. We would like to be able to ignore this subset of links.

wooorm commented 4 years ago

@Noah-Silvera Could you clarify how you generate that markdown content? How are you using this plugin, and remark in general?

Noah-Silvera commented 4 years ago

Hi @wooorm. Sure thing!

We are using this plugin with Docusaurus to validate the links in our documentation content, which can be found here. Most of the content in the dev center is static, for example, this page. However, we also generate some API Reference Documentation dynamically as markdown content. An example of this can be found here.

We want to be able to link out to this dynamic API markdown content. However, since the content is not available statically, the remark-validate-links task will always fail, even if the links are valid at runtime.

We'd like to be able to ignore these links we know will be valid at runtime.

wooorm commented 4 years ago

This plugin is specifically for Markdown references, e.g., from docs in a GitHub project, to other docs in that same GitHub project. It doesn’t check whether your links work on your website, it checks whether they work on GitHub/BitBucket/GitLab.

Would https://github.com/davidtheclark/remark-lint-no-dead-urls perhaps be of interest?

Noah-Silvera commented 4 years ago

@wooorm but the links we are trying to use are markdown references from some docs to other docs within the same GitHub project. They are not external links, so remark-lint-no-dead-urls does not look like the right tooling for us.

Here's an example.

On this page, various "commands and operations" are discussed, like auth.sign-in or edit.delete-features. The API reference we want to link those commands and operations to exists as generated markdown documentation, on the same site. Here's the markdown header for auth.sign-in.

We want to be able to use local relative markdown links such as

Such as the [`auth.sign-in`](api-commands-operations-events#command-auth.sign-in) command...

throughout our documentation.

However, since the api-commands-operations-events docs are generated markdown, they are not available at lint time. This means that even though we are writing a valid markdown reference link to another piece of documentation on the same site, the remark-validate-links plugin marks it as broken because the target documentation does not exist until runtime. We want to be able to ignore links of this nature so we can link to our dynamic markdown documentation without it breaking the linting.

wooorm commented 4 years ago

If you would write:

Such as the [`auth.sign-in`](api-commands-operations-events#command-auth.sign-in) command...

…in docs/web/configuration-commands-operations.mdx, and this file was rendered on GitHub, then when someone clicks on that link, it will result in a 404. That is what this plugin does:

We want to be able to ignore links of this nature so we can link to our dynamic markdown documentation without it breaking the linting.

How would you propose this to be solved? As we are at lint-time, this would mean a long list of values that will be generated later:

[
  'api-commands-operations-events.mdx/#command-auth.sign-in',
  'api-commands-operations-events.mdx/#command-auth.sign-out',
  // …
]

But now you have a new problem: that list needs to be kept up to date, because if for some reason it’s now sign-up on the website, you won’t get any errors at lint time.

Hence, I don’t think this can be properly solved at lint-time, because there isn’t any information about run-time. Maybe it makes more sense to lint the generated HTML (with rehype?) instead?
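
As a rough illustration of that last suggestion, the check could run against the built HTML rather than the Markdown source. The sketch below is dependency-free and collects `id` attributes with a regular expression, which is a simplification: a real check would parse the HTML properly (for example with rehype). The function names and the HTML snippet are hypothetical.

```javascript
// Crude illustration only: a regular expression stands in for a real HTML parser.
function collectIds(html) {
  const ids = new Set()
  for (const match of html.matchAll(/\sid="([^"]+)"/g)) {
    ids.add(match[1])
  }
  return ids
}

// Returns the fragments that have no matching `id` in the generated HTML.
function findMissingFragments(html, fragments) {
  const ids = collectIds(html)
  return fragments.filter((fragment) => !ids.has(fragment))
}

// Hypothetical generated output for the API reference page.
const generatedHtml = `
  <h3 id="command-auth.sign-in">auth.sign-in</h3>
  <h3 id="command-auth.sign-out">auth.sign-out</h3>
`

// `command-auth.sign-up` is deliberately absent from the HTML above.
const missing = findMissingFragments(generatedHtml, [
  'command-auth.sign-in',
  'command-auth.sign-up'
])

console.log(missing)
```

Because this runs after the build, a heading renamed on the website would be reported here, which a hand-maintained list at lint time could not do.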

Noah-Silvera commented 4 years ago

Hey, thanks for your response, I appreciate your time.

In regards to the [auth.sign-in](api-commands-operations-events#command-auth.sign-in) link...

  • it errors because there is no file called api-commands-operations-events (GH shows a 404)

There actually is a file called api-commands-operations-events so the link does not 404

  • it errors because there is no command-auth.sign-in heading in the MDX file (GH doesn’t scroll to a heading but remains at the top).

It does error because the heading only exists at runtime - remark-validate-links attempts to find the heading, and it fails to find it.

I put up a demonstration PR in our project that shows this issue. Near the top of this page, I added two links to the api-commands-operations-events docs: auth.sign-in and auth.sign-out. These links are both valid at runtime, and if you click them, they will take you to the right markdown header on the api-commands-operations-events page.

However, if you take a look at the build for the PR, you can see that remark-validate-links is failing to find the headings, and erroring out.

I absolutely agree that this problem cannot be properly solved at lint time. We are dynamically generating markdown headings, and links to these headings should be checked on the live site. We do have plans to use an external link checker to verify links of this nature - however, we haven't added it yet, and the best candidate we are looking to use currently can't check headings.

However, even if we were to also lint the generated HTML with this external link checker and verify the links properly, we still would have the issue of remark-validate-links failing at lint time. We still would need to be able to ignore those links at lint time and then check them later at run time.

Looking at other projects which lint on various criteria, a common pattern for ignoring is to provide an optional list of regular expressions.

For example, the CSpell package provides an ignoreRegExpList in its config, which allows you to specify multiple regular expressions which will be ignored in checks.

If you allow for regexes, the issue of keeping the list up to date gets a bit simpler. For example, in our case, the regex would be api-commands-operations-events.mdx#[a-zA-Z-\.]+ which would match any links to the generated headings, regardless of whether they change.
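
A minimal sketch of how such an ignore list could behave, assuming a hypothetical `ignorePatterns` list and `isIgnoredLink` helper (neither is part of remark-validate-links). The pattern is adapted from the one above, with the dot before `mdx` escaped and the expression anchored.

```javascript
// Hypothetical helper: NOT part of remark-validate-links.
// Returns true when a link URL matches any pattern in the ignore list.
function isIgnoredLink(url, ignorePatterns) {
  return ignorePatterns.some((pattern) => pattern.test(url))
}

// Assumed pattern: matches any heading on the generated API page.
const ignorePatterns = [/^api-commands-operations-events\.mdx#[a-zA-Z.-]+$/]

const links = [
  'api-commands-operations-events.mdx#command-auth.sign-in', // generated heading
  'api-commands-operations-events.mdx#command-auth.sign-out', // generated heading
  'configuration-app-config-reference.mdx#item-uris' // static heading
]

// Only links that are not ignored would be handed to the validator.
const linksToValidate = links.filter((url) => !isIgnoredLink(url, ignorePatterns))

console.log(linksToValidate)
```

Here only the link to the static page remains to be validated, and the pattern keeps working if a generated heading is renamed.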

Thanks for discussing this with me! This package has been very useful for our project so far.

wooorm commented 4 years ago

Can you use links to the website?

-   [`auth.sign-in`](https://developers.geocortex.com/docs/web/api-commands-operations-events/#command-auth.sign-in)
-   [`auth.sign-out`](https://developers.geocortex.com/docs/web/api-commands-operations-events/#command-auth.sign-out)
-   `edit.add-feature`
-   `edit.delete-features`

Noah-Silvera commented 4 years ago

Can you use links to the website?

-   [`auth.sign-in`](https://developers.geocortex.com/docs/web/api-commands-operations-events/#command-auth.sign-in)
-   [`auth.sign-out`](https://developers.geocortex.com/docs/web/api-commands-operations-events/#command-auth.sign-out)
-   `edit.add-feature`
-   `edit.delete-features`

In order for it to work locally, in staging, and prod we would need to dynamically generate the base url. In our documentation system, that's done through the useBaseUrl() function. However, we can't call that function in the markdown link.

[`auth.sign-in`](useBaseUrl("/docs/web/api-commands-operations-events/#command-auth.sign-in"))

We would need to add anchors mixed in with the markdown, which we'd prefer not to do.

<a href={useBaseUrl("/docs/web/api-commands-operations-events/#command-auth.sign-in")}>`auth.sign-in`</a>

(edited by @wooorm to fix syntax)

wooorm commented 4 years ago

Would comments work?

<!--validate-links disable-->

-   [`auth.sign-in`](broken)
-   [`auth.sign-out`](broken)
-   `edit.add-feature`
-   `edit.delete-features`

<!--validate-links enable-->

Noah-Silvera commented 4 years ago

Would comments work?

<!--validate-links disable-->

-   [`auth.sign-in`](broken)
-   [`auth.sign-out`](broken)
-   `edit.add-feature`
-   `edit.delete-features`

<!--validate-links enable-->

For our use case, we could possibly make it work, but it's not as ideal as a regex ignore list. For example, these broken-at-lint-time links could be mixed in with other links that we actually want validated.

For example, here's a snippet from this page. I've added line breaks to make it easier to read, but it was originally all one paragraph.

The second behavior in this application is a [`map.zoom-to-initial-viewpoint`](api-commands-operations-events.mdx#command-map.zoom-to-initial-viewpoint) command on the I Want To Menu. 
This command takes `Maps` type argument. 
The `argument` property in the app config supplies an array of maps 
by referencing the `default` map with an [**Item URI**](configuration-app-config-reference.mdx#item-uris). 
Item URIs are a way of referencing other items within the app config.

The first link to map.zoom-to-initial-viewpoint points to generated markdown, and the second link, "Item URI" points to static content available at lint time. If we used the <!--validate-links disable--> pattern here, we would lose the validation of the "Item URI" link.

I'm sure the comment pattern could be more valuable for others' use cases, as it is localized to a very specific link instead of a generic pattern, but the regex ignore list would be more useful for our use case.
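
To make the trade-off concrete, here is a rough sketch of how a comment-based toggle might behave. It is not the plugin's actual behavior: the `nodes` array is a simplified, flat stand-in for a Markdown syntax tree, and the function name is hypothetical.

```javascript
// Hypothetical sketch: NOT how remark-validate-links works today.
// Walks the nodes in order and keeps only links seen while validation is enabled.
function linksToValidate(nodes) {
  let enabled = true
  const result = []
  for (const node of nodes) {
    if (node.type === 'comment') {
      const text = node.value.trim()
      if (text === 'validate-links disable') enabled = false
      if (text === 'validate-links enable') enabled = true
    } else if (node.type === 'link' && enabled) {
      result.push(node.url)
    }
  }
  return result
}

const nodes = [
  { type: 'comment', value: 'validate-links disable' },
  // Generated heading: the link we actually want to skip.
  { type: 'link', url: 'api-commands-operations-events.mdx#command-map.zoom-to-initial-viewpoint' },
  // Static heading in the same paragraph: skipped as a side effect.
  { type: 'link', url: 'configuration-app-config-reference.mdx#item-uris' },
  { type: 'comment', value: 'validate-links enable' },
  // A later link outside the disabled region is validated again.
  { type: 'link', url: 'configuration-app-config-reference.mdx#item-uris' }
]

const checked = linksToValidate(nodes)

console.log(checked)
```

Both links inside the disabled region are skipped, including the static one, which is the loss of coverage described above.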

travi commented 4 years ago

i've run into a case that i think falls into this category. this link to create a new file is a valid link in the context of github, but does not pass link validation. if there isn't a better way to make this pass validation, i'd love a way to at least ignore it. the comments mentioned above would be acceptable in my case.

CraigMacomber commented 3 years ago

A project I work on has this issue as well: we would like to link our generated API documentation from our hand written documentation. Currently our options are:

Disable comments would work in our case, so would a regex skip validating links that match the specific pattern, or even just a list of urls/paths to consider valid without actually checking.

xiaogaozi commented 4 months ago

We encountered a similar problem. We use the Markdown importing feature of Docusaurus to import other Markdown files into a document, for example importing bar.mdx in foo.mdx, where bar.mdx contains many Markdown headings. A link such as [...](foo.mdx#some-heading-in-bar) then causes an error, because the heading #some-heading-in-bar only exists in the bar.mdx file.

As @Noah-Silvera suggested, it would be very useful for us to ignore checking some links through regular expressions. Of course, it would be better if the heading of the link could be detected in another Markdown file.
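
A rough sketch of what detecting headings across imported files could look like. Everything here is an assumption for illustration: the `files` object stands in for the file system, the import detection only handles the simple `import X from './file.mdx'` form, and `slugify` is a simplified rule rather than the slug algorithm the plugin or Docusaurus uses.

```javascript
// Hypothetical sketch: NOT how remark-validate-links resolves headings.
// Simplified slug rule (an assumption): lowercase, drop punctuation, spaces to hyphens.
function slugify(text) {
  return text.trim().toLowerCase().replace(/[^\w\s-]/g, '').replace(/\s+/g, '-')
}

// Collects heading slugs from `name` and, recursively, from the files it imports.
// `seen` guards against import cycles.
function collectHeadingSlugs(files, name, seen = new Set()) {
  const slugs = new Set()
  if (seen.has(name) || !(name in files)) return slugs
  seen.add(name)
  for (const line of files[name].split('\n')) {
    const heading = /^#{1,6}\s+(.+)$/.exec(line)
    const imported = /^import\s+\w+\s+from\s+['"]\.\/(.+\.mdx?)['"]/.exec(line)
    if (heading) slugs.add(slugify(heading[1]))
    if (imported) {
      for (const slug of collectHeadingSlugs(files, imported[1], seen)) slugs.add(slug)
    }
  }
  return slugs
}

// In-memory stand-in for foo.mdx importing bar.mdx.
const files = {
  'foo.mdx': "import Bar from './bar.mdx'\n\n# Foo\n\n<Bar />\n",
  'bar.mdx': '## Some heading in bar\n'
}

const slugs = collectHeadingSlugs(files, 'foo.mdx')

console.log([...slugs])
```

With the imported headings included, a link such as foo.mdx#some-heading-in-bar would resolve against the combined set instead of failing.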

silvenon commented 4 months ago

It might be possible to build a wrapper plugin around remark-validate-links that accepts a glob option:

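// plugins/remark-validate-links-with-glob.js
import path from 'node:path'
import pm from 'picomatch' // use whichever glob matching library suits you best
import remarkValidateLinks from 'remark-validate-links'

export default function remarkValidateLinksWithGlob(options, fileSet) {
  // Split off the wrapper-only `glob` option; pass the rest through unchanged.
  const { glob, ...validateLinksOptions } = options
  const originalTransform = remarkValidateLinks(validateLinksOptions, fileSet)
  const isMatch = pm(glob)
  return (tree, file) => {
    // Only validate files whose path (relative to the cwd) matches the glob.
    if (file.path && isMatch(path.relative(process.cwd(), file.path))) {
      // Forward the tree and file; calling the transform without them would fail.
      return originalTransform(tree, file)
    }
  }
}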

Then use it like so:

// vite.config.ts
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import mdx from '@mdx-js/rollup'
import remarkValidateLinksWithGlob from './plugins/remark-validate-links-with-glob'

export default defineConfig({
  plugins: [
    react(),
    mdx({
      remarkPlugins: [
        [remarkValidateLinksWithGlob, { glob: ["posts/**/*.mdx"] }],
      ],
    }),
  ],
})

I haven't tried it, but a variation of this could work.

I was thinking of what would be good to propose to the plugin itself, but I can't think of a good API because this is linting territory, so comment-based error suppression IMO shouldn't be performed on a plugin-by-plugin basis.

One day ESLint might be able to lint other languages, maybe even Markdown and MDX, and it already has a way to suppress warnings and errors… but that seems ages away 😅