tcort / markdown-link-check

checks all of the hyperlinks in a markdown text to determine if they are alive or dead
ISC License

Parse HTML in order to find anchors (like `#tomato`) in the same document #195

Open sschuberth opened 2 years ago

sschuberth commented 2 years ago

markdown-link-check doesn't parse HTML, so it can't find #tomato in your test, but it is something I'm planning on adding.

Originally posted by @tcort in https://github.com/tcort/markdown-link-check/issues/193#issuecomment-1076250098

MikeMcC399 commented 2 years ago

This would be nice to have back again. It used to work in 3.8.7 and no longer works in 3.9.3 or 3.10.3.

Shocktrooper commented 1 year ago

Hello! I was curious about the status of this, because anchor links are valid Markdown but they are all currently reported as broken when the links are checked.

MikeMcC399 commented 1 year ago

@Shocktrooper This issue is about anchor links defined in HTML. Is that the same issue that you have?

Steps to reproduce failure

195.md file contains:

- [Open Issues](#issue_list)
<a name='issue_list'>Open Issues</a>

npm install markdown-link-check@3.8.7
npx markdown-link-check ./195.md

FILE: ./195.md
[✓] #issue_list

1 links checked.

Now update to latest version:

npm install markdown-link-check@3.10.3
npx markdown-link-check ./195.md

FILE: ./195.md
  [✖] #issue_list

  1 links checked.

  ERROR: 1 dead links found!
  [✖] #issue_list → Status: 404

Steps to reproduce correct checking of anchor links

If the anchor link is defined through Markdown (no use of HTML in Markdown file), then the check works:

[link to issues](#issue-list)
# issue list

FILE: ./195-b.md
  [✓] #issue-list

  1 links checked.

Shocktrooper commented 1 year ago

Oh my apologies my issue is plain anchor links in markdown. Not html

MikeMcC399 commented 1 year ago

@Shocktrooper

... my issue is plain anchor links in markdown. Not html

Perhaps you could give an example of a failing test? Also which version of markdown-link-check are you using? Does my example give an error or does it work for you?

[link to issues](#issue-list)
# issue list

MikeMcC399 commented 1 year ago

I am still stuck with using "markdown-link-check": "~3.8.7" due to this regression. Is there any chance of getting it fixed in 3.10.* or later?

Shocktrooper commented 1 year ago

@MikeMcC399 It appears that I had a syntax problem in my own file that was not apparent at first; it is not an issue with the tool.

srl295 commented 1 year ago

@Shocktrooper Would you be able to give an example of what the syntax issue was and how it was fixed? Just in case that's what I or others are running into.

Shocktrooper commented 1 year ago

for ## Architecture-Diagram

[Architecture Diagram](#_Architecture_Diagram) was changed to [Architecture Diagram](#architecture-diagram)

and for ## Access and authentication

[Authentication](#_Authentication) was changed to [Authentication](#access-and-authentication)

The first example used an anchor syntax that was valid in an older version of Markdown, and the second example was, I believe, just a heading rename that someone forgot to carry over to the link.
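For anyone hitting the same thing, here is a rough sketch of how a heading maps to the anchor that the corrected links above point at. The helper name `headingToAnchor` is made up for illustration, and the rules are only an ASCII approximation of GitHub-style slugs, not the code markdown-link-check actually uses.

```javascript
// Hypothetical helper: approximates GitHub-style heading anchors.
// Assumption: ASCII headings only; non-ASCII letters would be stripped here.
function headingToAnchor(heading) {
  return heading
    .replace(/^#+[ \t]*/, "")     // drop the leading '#' markers
    .trim()
    .toLowerCase()
    .replace(/[^\w\- ]+/g, "")    // drop punctuation except '-', '_' and spaces
    .replace(/ /g, "-");          // every space becomes a hyphen
}

console.log(headingToAnchor("## Architecture-Diagram"));
console.log(headingToAnchor("## Access and authentication"));
```

Under these assumptions the two headings give `architecture-diagram` and `access-and-authentication`, which matches the fixed links, while `#_Architecture_Diagram` and `#_Authentication` match nothing.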

jhvhs commented 1 year ago

@MikeMcC399 Sorry to revive an old conversation, but just to be clear, versions 3.8 and prior reported "success" on internal anchor links, but the fact is, it was an unproven positive. Try to change the fragment to any gibberish, and it will pass with flying colours. The solution I see at the moment, until the issue is fixed, is to ignore URLs that consist entirely of fragments. This way they will remain explicitly unchecked, showing up in the report as ignored.
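A minimal sketch of that workaround, assuming the `ignorePatterns` option of the markdown-link-check config file is used. The pattern `^#` and the helper `isIgnored` are illustrative assumptions, not something taken from the project.

```javascript
// Sketch of a config object using markdown-link-check's "ignorePatterns" option.
// Assumption: "^#" is an illustrative pattern for links that are only a fragment.
const config = {
  ignorePatterns: [
    { pattern: "^#" },
  ],
};

// Hypothetical helper that mirrors how such a pattern would classify a link.
function isIgnored(link, cfg) {
  return cfg.ignorePatterns.some(({ pattern }) => new RegExp(pattern).test(link));
}

console.log(isIgnored("#issue_list", config));                  // fragment-only link
console.log(isIgnored("https://example.com/page#top", config)); // full URL with a fragment

// Print the JSON form, which could be saved to a config file.
console.log(JSON.stringify(config, null, 2));
```

The printed JSON could be saved to a file and passed to the CLI with `--config`. Only links that start with `#` are skipped; full URLs that happen to carry a fragment are still checked.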

MikeMcC399 commented 1 year ago

@jhvhs No need to apologize for responding to an open issue, however old it is!

Shockingly you are correct!

3.8.7

A bad anchor link is not detected in 3.8.7. All pass.

3.11.0

In version 3.11.0 an anchor link ...

Test set

Tested with:


| Link source                 | Should | 3.8.7    | 3.11.0   |
| --------------------------- | ------ | -------- | -------- |
| [Good regular link](#link1) | Pass   | Pass     | Pass     |
| [Good link in HTML](#link2) | Pass   | Pass     | **Fail** |
| [Bad link](#link3)          | Fail   | **Pass** | Fail     |

**Targets**

# link1

<a name='link2'>Link 2</a>

Edit: Updated for release 3.11.0
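To make the "Should" column concrete, here is a rough sketch of a checker that accepts both Markdown headings and HTML `name`/`id` attributes as anchor targets. The function names `collectAnchors` and `checkFragmentLinks` are hypothetical, the regular expressions are simplifications (no handling of code fences, for example), and none of this is markdown-link-check's own implementation.

```javascript
// Hypothetical sketch: collect every anchor target defined in a Markdown string.
function collectAnchors(markdown) {
  const anchors = new Set();
  // Markdown ATX headings, slugged with a simplified ASCII-only rule.
  for (const m of markdown.matchAll(/^#{1,6}[ \t]+(.+)$/gm)) {
    const slug = m[1].trim().toLowerCase()
      .replace(/[^\w\- ]+/g, "")
      .replace(/ /g, "-");
    anchors.add(slug);
  }
  // HTML elements carrying a name="..." or id="..." attribute.
  for (const m of markdown.matchAll(/<[a-z][^>]*\s(?:name|id)\s*=\s*["']([^"']+)["']/gi)) {
    anchors.add(m[1]);
  }
  return anchors;
}

// Check each fragment-only link against the collected anchors.
function checkFragmentLinks(markdown) {
  const anchors = collectAnchors(markdown);
  const results = [];
  for (const m of markdown.matchAll(/\[[^\]]*\]\(#([^)\s]+)\)/g)) {
    results.push({ link: "#" + m[1], alive: anchors.has(m[1]) });
  }
  return results;
}

// The three links and two targets from the test set above.
const sample = [
  "[Good regular link](#link1)",
  "[Good link in HTML](#link2)",
  "[Bad link](#link3)",
  "",
  "# link1",
  "",
  "<a name='link2'>Link 2</a>",
].join("\n");

for (const { link, alive } of checkFragmentLinks(sample)) {
  console.log(`${alive ? "[✓]" : "[✖]"} ${link}`);
}
```

With the sample above, `#link1` and `#link2` are reported as alive and `#link3` as dead, which is the behaviour the "Should" column asks for.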

adoyle-h commented 1 year ago

You can try https://github.com/lycheeverse/lychee

MikeMcC399 commented 1 year ago

@tcort

Do you have any comments on this thread?

yusufsheiqh commented 1 year ago

Has there been any progress on this issue?

dklimpel commented 5 months ago

I have created a PR to solve the issue.

Reviews or suggestions for improvement are welcome.