WnP / vimwiki_markdown

vimwiki markdown file to html with syntax highlighting.
MIT License

fix conversion of md links containing anchors to html #23

Closed. joaoandreporto closed this 1 year ago

joaoandreporto commented 1 year ago

Issue

When converting to .html (from .md in this case) using WnP/vimwiki_markdown, hyperlink paths that contain anchors resolve to, e.g.:

file:///path/to/vimwiki/vimwiki_html/index#foo.html

instead of the expected:

file:///path/to/vimwiki/vimwiki_html/index.html#foo

This prevents hyperlinks with anchors from being followed when navigating the .html files.
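The misplaced extension is what you would get from appending `.html` to the whole link target, fragment included. The following is a minimal sketch of that contrast, not the project's code; both function names are hypothetical and only illustrate where the extension has to go.

```python
def naive_html_href(href: str) -> str:
    """Append the extension to the whole target, fragment included."""
    return href + ".html"


def anchor_aware_html_href(href: str) -> str:
    """Insert the extension before the fragment, if there is one."""
    page, sep, fragment = href.partition("#")
    # Same-page links such as "#foo" have no page part and need no extension.
    if not page:
        return href
    return page + ".html" + sep + fragment


print(naive_html_href("index#foo"))         # index#foo.html
print(anchor_aware_html_href("index#foo"))  # index.html#foo
```

Splitting on the first `#` keeps the page path and the anchor separate, so a link without an anchor still just gains `.html`.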

Steps to reproduce the issue

On a new wiki...

  1. Create foo.md, with content:

     ```markdown
     # anchor1

     ## anchor2

     ### anchor3
     ```

  2. Create bar.md, with content:

     ```markdown
     [anchor3](path/to/foo#anchor3)
     ```

  3. Run :Vimwiki2HTML to convert the files to .html;

  4. Open bar.html and inspect the [anchor3] hyperlink.

Pull Request

Proposal

This PR offers a solution for vimwiki/vimwiki#1260.

It integrates with the Python markdown library directly, through the Table of Contents extension, which provides access to header ids.
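Header ids matter here because the fragment in a converted link has to match the id that the generated heading receives. The sketch below approximates, with the standard library only, how heading text might be turned into such an id and how repeated headings might be kept distinct. It is an assumption-laden stand-in, not the library's implementation: `slugify_sketch` and `unique_sketch` are hypothetical names, and the exact rules belong to the Table of Contents extension.

```python
import re
import unicodedata


def slugify_sketch(value: str, separator: str = "-") -> str:
    """Approximate a heading-text-to-id conversion (ASCII-only variant)."""
    # Decompose accented characters, then drop anything non-ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, trim surrounding whitespace, lowercase.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and separators into a single separator.
    return re.sub(r"[{}\s]+".format(re.escape(separator)), separator, value)


def unique_sketch(slug: str, used: set) -> str:
    """Suffix _1, _2, ... when a slug is already taken (assumed scheme)."""
    candidate, n = slug, 0
    while candidate in used:
        n += 1
        candidate = f"{slug}_{n}"
    used.add(candidate)
    return candidate
```

Under these assumptions, a link written as `foo#Some Heading!` would need the fragment `some-heading` in the HTML, and a second heading with the same text would get a suffixed id rather than the plain slug.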

Implementation

Caveats

Using anchor names which mimic html href ids of duplicate anchors is discouraged i.e., those:

Thank you very much for developing WnP/vimwiki_markdown.

WnP commented 1 year ago

Hi @joaoandreporto,

I've just finished the review of your MR. I'll soon create a new release with your changes.

Thanks for your contribution and interest.

joaoandreporto commented 1 year ago

Hi @WnP,

Thank you for finding it useful.

Meanwhile, I’ve noticed there was still an issue with the conversion of anchors within the same page: those links didn’t resolve properly.

I’ve corrected the code and committed the changes onto my fork:

```python
from re import search

import markdown
import markdown.extensions.toc

# `auto_index` is not defined in this snippet; it is assumed to come from
# the surrounding vimwiki_markdown code.


class LinkInlineProcessor(markdown.inlinepatterns.LinkInlineProcessor):
    """Fix wiki links"""

    def getLink(self, *args, **kwargs):
        href, title, index, handled = super().getLink(*args, **kwargs)
        # same-page anchors, e.g. "#foo"
        int_anchor_match = search(r'^#(.+)', href)
        # anchors on another page, e.g. "path/to/foo#bar"
        ext_anchor_match = search(r'(.+)#(.+)', href)
        if not href.startswith("http") and not href.endswith(".html"):
            # index md to html link
            if auto_index and href.endswith("/"):
                href += "index.html"
            # internal anchor md to html link
            elif int_anchor_match:
                # slugify the anchor text (without the leading "#")
                anchor = markdown.extensions.toc.slugify(
                    int_anchor_match.group(1), "-")
                href = "#" + anchor
            # external anchor md to html link
            elif ext_anchor_match:
                hlnk = ext_anchor_match.group(1)
                # slugify md anchors to make them match href ids
                anchor = markdown.extensions.toc.slugify(
                    ext_anchor_match.group(2), "-")
                href = hlnk + ".html#" + anchor
            # no anchor md to html link
            elif not href.endswith("/"):
                href += ".html"
        return href, title, index, handled
```

I don't know whether reverting this pull request or opening a new one is the better solution. Do you have any suggestions?

WnP commented 1 year ago

This MR has been reverted in favor of #27

Feel free to comment there if it doesn't fit your needs.