executablebooks / MyST-Parser

An extended commonmark compliant parser, with bridges to docutils/sphinx
https://myst-parser.readthedocs.io
MIT License

MyST does not link to anchors within non-auto-generated markdown or HTML files #564

Open prescod opened 2 years ago

prescod commented 2 years ago

Describe the bug

Context: When I do this:


Works: [b](c.html)
Does not work: [b](c.html#foo)

Expectation: I expected two links.

Bug: But instead only one link is created.

$ sphinx-build docs out

...
/private/tmp/mysttest/docs/index.md:3: WARNING: 'myst' reference target not found: c.html#foo
...

Problem: This is a problem for people who need to link to anchors inside HTML files.

Reproduce the bug

$ cat docs/index.md
# AAAAA

Works: [b](c.html)

Does not work: [b2](c.html#foo)
$ cat docs/conf.py
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
    "sphinx.ext.autosectionlabel",
    "myst_parser"  # ,
    #    "recommonmark"
]
$ cat docs/c.html
<a id="foo">Foo!</a>
$ rm -rf out ; sphinx-build docs out
Running Sphinx v4.5.0
making output directory... done
myst v0.17.1: MdParserConfig(commonmark_only=False, gfm_only=False, enable_extensions=[], linkify_fuzzy_links=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, dmath_double_inline=False, update_mathjax=True, mathjax_classes='tex2jax_process|mathjax_process|math|output_area', disable_syntax=[], all_links_external=False, url_schemes=['http', 'https', 'mailto', 'ftp'], ref_domains=None, highlight_code_blocks=True, number_code_blocks=[], title_to_header=False, heading_anchors=None, heading_slug_func=None, html_meta=[], footnote_transition=True, substitutions=[], sub_delimiters=['{', '}'], words_per_minute=200)
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index                                                                                      
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index                                                                                       
/private/tmp/mysttest/docs/index.md:5: WARNING: 'myst' reference target not found: c.html#foo
generating indices... genindex done
writing additional pages... search done
copying downloadable files... [100%] c.html                                                                          
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 1 warning.

The HTML pages are in out.

List your environment

sphinx-build --version            
sphinx-build 4.5.0

afeld commented 2 years ago

This also seems to come up for links to Markdown anchors, e.g.

[some other page](page.md#something)

jpfeuffer commented 2 years ago

Same issue here. What is happening? This is clearly documented to work.

jpfeuffer commented 2 years ago

Note that we have, of course, also set myst_heading_anchors = 3.

choldgraf commented 2 years ago

I believe that anchor linking only works for header anchors that are auto-generated. If you link to a header that wasn't auto-generated, then MyST won't find it. At least that was the behavior that @nthiery and I ran into when we tried to reproduce this (he mentioned the same problem).

Perhaps it would be possible for this functionality to check whether a # is present in a markdown link, and if so, then somehow:

Not sure how feasible that is though
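
As a rough illustration of the check described above (not how MyST resolves links internally), the fragment can be split from the path with Python's standard `urllib.parse.urldefrag`. The helper name `split_link_target` is hypothetical and only serves this example.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_link_target(target: str) -> Tuple[str, str]:
    """Split a Markdown link destination into (path, fragment).

    Hypothetical helper for illustration; not part of myst-parser.
    """
    path, fragment = urldefrag(target)
    return path, fragment


# A link with an anchor yields a non-empty fragment; a plain link does not.
print(split_link_target("c.html#foo"))
print(split_link_target("c.html"))
```

A resolver could then look up the path as usual and only afterwards decide what to do with a non-empty fragment.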

chrisjsewell commented 2 years ago

I think there is some confusion in some of the later comments, conflating the use of .html and .md extensions: myst_heading_anchors only works for .md extensions, i.e. [text](page.md#something) will link to a heading # something (or # Something, etc.) on page.md, provided that at least myst_heading_anchors=1 is set.
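
To make the heading-to-anchor mapping concrete, here is a simplified GitHub-style slug function. It is an approximation for illustration only: the name `approx_slug` is hypothetical, and MyST's real slug function handles more cases (duplicate headings, for instance). It does show why `# something` and `# Something` end up with the same anchor.

```python
import re


def approx_slug(heading: str) -> str:
    """Approximate a GitHub-style heading slug (illustrative only)."""
    text = heading.strip().lower()
    # Drop everything except word characters, hyphens, and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    # Spaces become hyphens.
    return text.replace(" ", "-")


# Both spellings of the heading map to the same anchor name.
print(approx_slug("Something"))
print(approx_slug("something"))
```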

For:

Works: [b](c.html)
Does not work: [b](c.html#foo)

Linking to built HTML documents is not recommended, since it goes against the output-format-agnostic nature of Sphinx; for example, it would not work if you tried to build a LaTeX PDF.

myst-parser provides a number of ways to make the links output format agnostic: https://myst-parser.readthedocs.io/en/latest/syntax/syntax.html#markdown-links-and-referencing

In this case, one should ideally be linking to the actual Markdown documents, i.e. using the .md format:

[b](c.md)
[b](c.md#foo)

If you specifically only want markdown style links to output "external hrefs", then you can use:

myst_all_links_external = True

This will simply output every [text](link) link as e.g. <a href="link">text</a>, without any "smart" referencing.
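
Put together, the two options mentioned in this comment would look roughly like this in `conf.py`. This is a sketch under assumptions: the depth value `3` is borrowed from earlier in the thread, and `myst_all_links_external` is left off here because turning it on bypasses MyST's reference resolution for every Markdown link.

```python
# conf.py (sketch)
extensions = ["myst_parser"]

# Auto-generate anchors for headings down to level 3, so that
# [text](page.md#something) can resolve to a heading on page.md.
myst_heading_anchors = 3

# Set to True only if every Markdown link should be emitted as a
# plain external href, with no MyST reference resolution at all.
myst_all_links_external = False
```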

gao-hongnan commented 2 years ago

Hi there, for jupyter-book, what should I put in _config.yml to get this automatic heading anchor generation?

Currently I put this in but it does not work:

parse:
  myst_url_schemes: [mailto, http, https]
  myst_heading_anchors: 3

lefuturiste commented 2 years ago

Same for me: I cannot use a Markdown link with an anchor to link to a generated <section id="xxx"> tag on another page.

bheberlein commented 1 year ago

This also does not work:

This is a [reference to an anchor below](#anchor-below).

<a name="anchor-below">[1]</a> This is an anchor.

But it works on GitHub (as well as other markdown renderers that support HTML).

[1] This is an anchor.

chrisjsewell commented 1 year ago

It works with https://myst-parser--717.org.readthedocs.build/en/717/syntax/cross-referencing.html#default-destination-resolution 😉

You'll get a warning, but then will still generate the link

WARNING: 'myst' cross-reference target not found: 'anchor-below' [myst.xref_missing]

warning can be handled: https://myst-parser--717.org.readthedocs.build/en/717/syntax/cross-referencing.html#handling-invalid-references

chrisjsewell commented 1 year ago

With https://myst-parser--717.org.readthedocs.build/en/717/syntax/cross-referencing.html#customising-external-url-resolution, you can also specifically denote a link as "external":

This is a [reference to an anchor below](#anchor-below){.external}.

jonas-w commented 1 year ago

It works with https://myst-parser--717.org.readthedocs.build/en/717/syntax/cross-referencing.html#default-destination-resolution 😉

You'll get a warning, but then will still generate the link

WARNING: 'myst' cross-reference target not found: 'anchor-below' [myst.xref_missing]

warning can be handled: https://myst-parser--717.org.readthedocs.build/en/717/syntax/cross-referencing.html#handling-invalid-references

Why does it create a warning even though it works?

nodiscc commented 11 months ago

I can confirm that the warning is still there on 2.0.0, despite the link being properly generated.

I don't want to add suppress_warnings = ['myst.xref_missing'] to my configuration as it would hide real, actual broken cross-references.

Can this be fixed so that the warning is not shown for valid cross-references? Thanks