Closed Parent5446 closed 4 years ago
I have not read this in detail, but we get the ToC from https://github.com/russross/blackfriday -- so maybe it is better to discuss it there.
This messes with the page semantically since now the navigation has an empty top-level. The way I see it there are two ways to fix this: [...]
I am affected by this odd behaviour too. The way .ToC is rendered by Hugo/Blackfriday makes the entire concept of a ToC pretty much useless. In the worst case, my ToC is completely messed up and no longer reflects the original header structure of my markdown files. Sometimes the ToC's headers are so messed up that the browser doesn't render them correctly.
The issue here is that Blackfriday spits out a bunch of hard-coded HTML tags, and then Hugo wraps them in a way that is not valid HTML.
A solution is quite simple: don't give users a preformatted .ToC, just give them an indexed array and let them generate the desired HTML structure by iterating over its elements.
@Parent5446 and @Dr-Terrible, It would be great if one of you would create a blackfriday issue for this.
The bottom line of this is:
The ToC should not be HTML; it should be a data structure that people can do with as they please.
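To make the "data structure instead of HTML" idea concrete, here is a minimal sketch in JavaScript. The input shape (a flat list of `{ level, id, text }` objects) and the function name `buildTocTree` are assumptions made for illustration; nothing here is part of Hugo or Blackfriday.

```javascript
// Hypothetical helper: turns a flat list of headings (in document order)
// into a nested tree. Not part of Hugo or Blackfriday.
function buildTocTree(headings) {
  const root = { level: 0, children: [] };
  const stack = [root];
  for (const { level, id, text } of headings) {
    // Pop back to the nearest ancestor that is shallower than this heading.
    while (stack[stack.length - 1].level >= level) stack.pop();
    const node = { level, id, text, children: [] };
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return root.children;
}
```

Because entries attach to the nearest shallower ancestor, a page that starts at h2 gets h2 entries at the top of the tree, with no empty h1 level above them. A template could then walk the tree and emit whatever markup it wants.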
Okay, for some reason, even though toc_levels is explicitly set to 1..6 both in the page and in the config file, kramdown does not generate ids.
I'm putting this on hold for the time being, because I'd prefer to test it with a newer jekyll version (and dependencies), which would take some more time.
I found that the level-4 subsections are not that large, so we could get by without links there. What do you think, guys?
Cheers, Erjan
I've written a little tool that removes the unnecessary level of nesting from the table of contents. You can run it, after Hugo has generated the contents in the "public" folder.
I'm running into an issue that may be related. It seems that when I have a lower level tag i.e. an H6 appearing first in my content before an H5, a TOC is rendered at the start of my content, without my explicitly including a .TableOfContents tag.
What I mean is, if my content looks like:
##### This is an H5 #####
###### This is an H6 ######
Then all is right with the world, and no TOC gets generated. But if it looks like this:
###### This is an H6 ######
##### This is an H5 #####
Then I get the garbled TOC HTML appearing arbitrarily at the start of my content. It doesn't even have the "TableOfContents" ID; it's just a naked NAV tag. Seems like a bug...
This issue has been automatically marked as stale because it has not been commented on for at least four months.
The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in four months if no further activity occurs. Thank you for all your contributions.
Note/Update: This issue is marked as stale, and I may have said something earlier about "opening a thread on the discussion forum". Please don't.
If this is a bug and you can still reproduce this error on the latest release or the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue is still unresolved.
Any news on this? It appears to be still unresolved.
I found this open issue when trying to figure out why Hugo's {{ .TableOfContents }} didn't seem to work properly / wasn't styleable. I created a partial for generating TOC trees based on header tags. The general philosophy is that TOCs need to be customisable, so I figured that would work best with a partial. This is more of a way to render headers within a .Content block than a formal data structure.
Snippet from partials/table-of-contents.html:
<!-- match every h1..h6 element; +? requires non-empty content and stops at the first closing tag -->
{{ $headers := findRE "<h[1-6].*?>(.|\n)+?</h[1-6]>" .Content }}
<!-- at least one header to link to -->
{{ $has_headers := ge (len $headers) 1 }}
<!-- a post opts in to the Table of Contents with toc: true in its front matter -->
{{ $show_toc := (eq $.Params.toc true) }}
{{ if and $has_headers $show_toc }}
<div class="table-of-contents toc bd-callout">
<!-- TOC header -->
<h4 class="text-muted">Table of Contents</h4>
{{ range $headers }}
{{ $header := . }}
{{ range first 1 (findRE "<h[1-6]" $header 1) }}
{{ range findRE "[1-6]" . 1 }}
{{ $next_heading := (int .) }}
<!-- generate li array of the proper depth -->
{{ range seq $next_heading }}
<ul class="toc-h{{ . }}">
{{end}}
{{ $base := ($.Page.File.LogicalName) }}
{{ $anchorId := ($header | plainify | htmlEscape | urlize) }}
{{ $href := delimit (slice $base $anchorId) "#" | string }}
<a href="{{ relref $.Page $href }}">
<li>{{ $header | plainify | htmlEscape }}</li>
</a>
<!-- close list -->
{{ range seq $next_heading }}
</ul>
{{end}}
{{end}}
{{end}}
{{ end }}
</div>
{{ end }}
Here is how I have it working to render TOCs for Posts:
<div class="content">
{{ partial "banner" . }}
{{ partial "table-of-contents" . }}
<!-- supports emoji -->
{{ .Content | emojify }}
</div>
@mikeblum thank you for this snippet! I used it to make a Bootstrap-styled table of contents; my snippet is here
Once we have #1778, we can more easily provide the TOC as a data structure, using access to the syntax tree, or perhaps writing a new renderer to build during parsing.
Once we have #1778
I meant to say, once we have #3949 (Upgrade to Blackfriday v2)...
@mikeblum thank you!
@mikeblum thanks a lot!
I still have an issue with the anchor link. ($header | plainify | htmlEscape | urlize) does not work in several cases. For example, "Bonjour, ca va ?" should return bonjour-ca-va but returns bonjour-ca-va- (note the hyphen at the end).
It also does not work with apostrophes; both the href and the title are wrong. For example, "let's go" gives letamprsquos-go.
I am not a Go expert, so I cannot help.
@alexislg2 The last two functions are incorrect. plainify returns a string that is already escaped, so we have to htmlUnescape it. Furthermore, anchors are generated using the anchorize function (a Blackfriday-provided feature), not urlize.
Here are the relevant changes to get the partial @mikeblum posted working correctly:
{{ $anchorId := ($header | plainify | htmlUnescape | anchorize) }}
{{ $href := delimit (slice $base $anchorId) "#" | string }}
<li><a href="{{ relref $.Page $href }}">
{{ $header | plainify | htmlUnescape }}
</a></li>
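For readers wondering why urlize leaves a trailing hyphen while the generated ids do not, the following JavaScript sketch approximates the kind of sanitising described above: letters and digits are lower-cased and kept, every run of other characters becomes a single hyphen, and no hyphen is emitted at the start or end. The function name `approximateAnchor` is hypothetical, and whether Hugo's anchorize matches this in every edge case is an assumption, not something this thread confirms.

```javascript
// Approximation only; this is not Blackfriday's or Hugo's actual code.
function approximateAnchor(text) {
  let out = "";
  let pendingDash = false;
  // for...of iterates by code point, so emoji are handled as single characters.
  for (const ch of text) {
    if (/[\p{L}\p{N}]/u.test(ch)) {
      // Emit at most one hyphen between kept runs, never at the start.
      if (pendingDash && out.length > 0) out += "-";
      pendingDash = false;
      out += ch.toLowerCase();
    } else {
      // Punctuation, whitespace and symbols only schedule a hyphen.
      pendingDash = true;
    }
  }
  return out;
}

console.log(approximateAnchor("Bonjour, ca va ?")); // prints "bonjour-ca-va"
```

A hyphen is only written when another letter or digit follows, which is why the trailing " ?" contributes nothing to the result.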
I implemented the toc as a partial using code from above, but the logic of the code produced markup that was invalid and not semantically sound. So I rewrote it like so:
https://gist.github.com/skyzyx/a796d66f6a124f057f3374eff0b3f99a
This version intentionally only looks for h2…h4. This is because the page title is the h1, and everything else is h2 or below. I also chose to stop at h4 because the value to the reader beyond that is, in my experience, negligible.
Feel free to re-adjust the regexes if you want a broader spectrum of headers.
In case anyone is interested, I just wrote a short JS script to remove the non-existent h1 in the TOC so that it can start from h2 instead: https://github.com/yihui/misc.js/blob/main/js/fix-toc.js One advantage of this solution is that it does not assume whether your TOC starts from h1 or h2.
You can include the script via something like <script src="/js/fix-toc.js"></script> after you put it under the static/js/ directory of your site.
Here is an example. If your eyes are quick enough, you can actually see the first <ul> in the TOC quickly removed :)
@skyzyx Thanks for sharing. I'm implementing this in my Hugo blog on GitLab under /pages/*/index.md, but this generates an unordered list of links pointing to /post/*/index.md. I believe I'll probably end up with errors similar to those in my recent failed job.
@yihui Thanks for your JavaScript, even though it works only if there's more than one section. I've adapted your script to Beautiful Hugo and published it as a GitLab snippet.
// Copyright (c) 2017 Yihui Xie & 2018 Vincent Tam under MIT
(function() {
  var toc = document.getElementById('TableOfContents');
  if (!toc) return;
  do {
    var li, ul = toc.querySelector('ul');
    // stop if there is no list left to unwrap
    if (!ul) break;
    if (ul.childElementCount !== 1) break;
    li = ul.firstElementChild;
    if (li.tagName !== 'LI') break;
    // remove <ul><li></li></ul> where the <ul> contains only one <li>
    ul.outerHTML = li.innerHTML;
  } while (toc.childElementCount >= 1);
})();
This works a treat @skyzyx 😃
here's another twist on collapsing empty heading levels using hugo static generation vs recurring javascript overhead... the gist is to use string replace to target the empty levels... there's no conditional looping in hugo templates yet so i just applied the basic approach 3 times which covers all my markup scenarios
note the pattern on closing tag carriage returns is slightly different than opening tags
{{ $toc := .TableOfContents }}
{{ $toc := (replace $toc "<ul>\n<li>\n<ul>" "<ul>") }}
{{ $toc := (replace $toc "<ul>\n<li>\n<ul>" "<ul>") }}
{{ $toc := (replace $toc "<ul>\n<li>\n<ul>" "<ul>") }}
{{ $toc := (replace $toc "</ul></li>\n</ul>" "</ul>") }}
{{ $toc := (replace $toc "</ul></li>\n</ul>" "</ul>") }}
{{ $toc := (replace $toc "</ul></li>\n</ul>" "</ul>") }}
<!-- count the number of remaining li tags -->
<!-- and only display ToC if more than 1, otherwise why bother -->
{{ if gt (len (split $toc "<li>")) 2 }}
{{ safeHTML $toc }}
{{ end }}
@Beej126 Thanks for your code. :smile: That's much better than the JavaScript approach. I've tested this for my blog and it works perfectly.
@Beej126 @VincentTam Looks like there might be potential for a loop there given that the same commands are run three times each...
@ryanwhocodes That's more elegant, but it won't save you any lines, because the beginning and the end of the loop take two lines.
@ryanwhocodes - it seems like the ideal loop would be a conditional check on whether the last replace had any hits... but we only get a "range" style looping in hugo so far... i.e. finite list iteration... which suggests a "split" approach to generate array... but i couldn't think of a good pattern to split on that would be reliable with nested ul-li nodes... if you can see a good strategy please suggest
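Outside of Hugo's templates, the "repeat until the last replace had no hits" loop is straightforward. The sketch below is JavaScript that could run as a post-processing step over the generated HTML. The function names are hypothetical, and it assumes the ToC uses plain `<ul>` and `<li>` tags without attributes, as in the output discussed in this thread.

```javascript
// Removes one empty outer level: <ul><li><ul>...</ul></li></ul> becomes <ul>...</ul>.
// Returns the input unchanged when the outermost list is not an empty wrapper.
function unwrapOnce(html) {
  const head = /<ul>\s*<li>\s*(?=<ul>)/.exec(html);
  // Only act on the outermost <ul>, and only if its first <li> starts with a nested <ul>.
  if (!head || html.indexOf("<ul>") !== head.index) return html;
  const innerStart = head.index + head[0].length;
  // Walk <ul> and </ul> tags to find where the nested list closes.
  const tag = /<\/?ul>/g;
  tag.lastIndex = innerStart;
  let depth = 0;
  let innerEnd = -1;
  let m;
  while ((m = tag.exec(html)) !== null) {
    depth += m[0] === "<ul>" ? 1 : -1;
    if (depth === 0) {
      innerEnd = tag.lastIndex;
      break;
    }
  }
  if (innerEnd < 0) return html;
  // The wrapper is only removable if nothing else follows inside the outer list.
  const tail = /^\s*<\/li>\s*<\/ul>/.exec(html.slice(innerEnd));
  if (!tail) return html;
  return (
    html.slice(0, head.index) +
    html.slice(innerStart, innerEnd) +
    html.slice(innerEnd + tail[0].length)
  );
}

// Repeats the unwrap until a pass changes nothing.
function collapseEmptyLevels(tocHtml) {
  let current = tocHtml;
  for (;;) {
    const next = unwrapOnce(current);
    if (next === current) return current;
    current = next;
  }
}
```

Matching the nested list's closing tag by depth, rather than replacing every `</ul></li></ul>` sequence, keeps genuinely nested sections intact and leaves the ToC alone when the empty item has non-empty siblings.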
So, is it supported out of the box now, without modifying any partials? Can someone please post which property turns this on for a post's md file?
@helmbold, please no passive-aggressive comments. Nobody owes you (or anybody else) anything.
This is open-source software. If you want this feature so badly, why not offer to sponsor development with cash? Or contribute, yourself?
As a maintainer of a very popular piece of OSS software, I can speak first-hand about the difficulty of trying to handle development and support of OSS, on top of a daytime job + family time + time for myself.
Please, straighten out your perspective.
@skyzyx Yes, you're right! I've deleted my comment since I would not like to read such comments in my own project.
The code that @mikeblum and @skyzyx provided got me pretty far, but I was bitten by a pathological case; one of my pages had numerous headings that contained the same text (Don't ask. It's... it's this whole big thing), and so generated the same id when passed through | plainify | htmlUnescape. The generated ToC only anchored to the first of these redundant headings.
Blackfriday has a workaround for this case already. It appends a counter to the end of the id for each identically-named heading, so rather than having five my-annoying-heading ids, it will generate my-annoying-heading, my-annoying-heading-1, my-annoying-heading-2, etc.
Long story short, I re-tooled the code already shared in this thread to extract the id from the headings rather than re-generate it from the contained text, and to be a bit more verbose about the sub-\
Hope it helps. https://gist.github.com/pyrrho/1d77cdb98ba58c7547f2cdb3fb325c62
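The idea of reading the id straight from the rendered heading, rather than regenerating it from the text, can be sketched as follows. This is JavaScript for illustration, not the Hugo template from the gist; the function name `extractHeadings` is hypothetical, and it assumes every heading of interest carries a double-quoted id attribute.

```javascript
// Collects { level, id, text } for each <h1>..<h6> that has an id attribute.
function extractHeadings(html) {
  const pattern = /<h([1-6])[^>]*\sid="([^"]*)"[^>]*>([\s\S]*?)<\/h\1>/g;
  const result = [];
  for (const m of html.matchAll(pattern)) {
    result.push({
      level: Number(m[1]),
      id: m[2],
      // Drop inline markup such as <strong> or <a> from the visible text.
      text: m[3].replace(/<[^>]+>/g, "").trim(),
    });
  }
  return result;
}
```

Because the id is taken from the markup, duplicate headings keep whatever suffix the renderer assigned, so each ToC entry points at its own heading.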
Edit [20 Nov]: @mikeblum's question about explicit heading IDs made me realize my code was deficient when the headings were anything but text. I've expanded the code linked above with a substantially more complex test set, and the ability to correctly translate markdown syntax (e.g. **strong**, _em_, [links]()), html (e.g. <span style="color: red;">explicit blocks</span>), emoji, and the like to the generated <ul>.
@pyrrho Your solution has come the closest to working for me, but I run into the errors:
error calling partial: template: theme/partials/toc.html:28:53: executing "theme/partials/toc.html" at <after 1>: error calling after: no items left
I'm reading through & trying to understand the code and why "after 1" would be failing - any ideas?
To immediately answer your question, @xenophenes: no, I have no idea what that message is suggesting. I'd love to dig into it and try to make this snippet more robust. I'd ask that we move that discussion to the gist, though, so we don't clutter this issue with debugging back-and-forth, and so I have a record there of what broke and (hopefully) how it was fixed.
After taking into account @branw's changes (thanks by the way!) I've found some issues with how Hugo auto-generates the header ids in Blackfriday:
TOC:
<a href="/post/table-of-contents/#no-entry-sign-headers">
🚫 headers
</a>
target header:
<h1 id="not-supported">🚫 headers</h1>
I tweaked @branw's fix to at least cosmetically support emoji:
{{ $base := ($.Page.File.LogicalName) }}
{{ $anchorId := ($header | plainify | htmlUnescape | anchorize) }}
{{ $href := delimit (slice $base $anchorId) "#" | string }}
<li>
<a href="{{ relref $.Page $href }}">
{{ $header | plainify | htmlUnescape | emojify }}
</a>
</li>
and tried adding this to my config.toml:
[blackfriday]
angledQuotes = true
extensions = ["hardLineBreak"]
fractions = false
plainIDAnchors = true
but still no dice on supporting complex headers with UTF-8 nonsense in them. Is there a hook in the processing pipeline to create manual header ids? Ideally I think having the id be generated with
{{ $anchorId := ($header | plainify | htmlUnescape | anchorize) }}
would work nicely, but I'm sure there are edge cases that that doesn't take into account.
@mikeblum it looks to me like Blackfriday is stripping the UTF-8 (/ emoji) from the generated IDs, the same as it strips special ASCII characters (&, %, $, etc.):
Input Markdown
## 🚫 headers &&
## 🚫 headers &&
## 🚫 headers &&
Output HTML
<h2 id="headers">🚫 headers &&</h2>
<h2 id="headers-1">🚫 headers &&</h2>
<h2 id="headers-2">🚫 headers &&</h2>
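The numbering in the output above (headers, headers-1, headers-2) can be reproduced with a small counter per id. The JavaScript below is a simplified sketch with a hypothetical function name; it does not handle the case where a page also contains a heading whose own id is already "headers-1", which a real renderer has to consider.

```javascript
// Appends -1, -2, ... to repeated ids, leaving the first occurrence untouched.
function uniqueIds(ids) {
  const seen = new Map();
  return ids.map((id) => {
    const count = seen.get(id);
    if (count === undefined) {
      seen.set(id, 0);
      return id;
    }
    seen.set(id, count + 1);
    return `${id}-${count + 1}`;
  });
}

console.log(uniqueIds(["headers", "headers", "headers"]));
// prints [ 'headers', 'headers-1', 'headers-2' ]
```

A ToC generator that recomputes ids from heading text needs the same bookkeeping, otherwise every duplicate entry links to the first heading.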
There is extended-markdown syntax for explicitly setting a heading's id, by the by:
Input Markdown
## 🚫 headers {#customized-no-entry-sign-header}
Output HTML
<h2 id="customized-no-entry-sign-header">🚫 headers</h2>
This thread was really helpful for me. I also created a partial to generate a table of contents for h2 ~ h4:
https://gist.github.com/percygrunwald/043e577beb90db72e09727a3ed3053c3
I commented this one pretty heavily because I had a hard time figuring out what was going on, so it might be useful for someone not that familiar with Hugo's templating syntax. The reason I made my own is that I found the output HTML for some of the examples here was not valid (too many or too few closing tags), which caused problems when the HTML was minified.
Here's a quick preview of the outcome:
And you can see it live here.
Something not mentioned yet is a CSS-only approach. It's not necessarily semantic, but has worked for my needs.
#TableOfContents > ul {
list-style: none;
margin: 0;
padding: 0;
}
I stumbled across this issue today.
This is the solution that I am currently using to get rid of the empty top-level <li> (i.e. when there is no h1 tag in the {{.Content}} portion):
{{ $emtLiPtrn := "(?s)<ul>\\s<li>\\s<ul>(.*)</li>\\s</ul>" }}
{{ $rplcEmtLi := "<ul>$1" }}
{{ .TableOfContents | replaceRE $emtLiPtrn $rplcEmtLi | safeHTML }}
It is by no means perfect, but it gets the job done without any JS or too many lines of code.
The only issue I came across is when lower-level heading tags (e.g. h6) appear before higher-level tags (e.g. h5-h1), or when I skip a heading level in my {{.Content}}.
However, this is not very common.
Since this bug is still not fixed, I've found a simple fix with CSS.
The following code hides the first, empty list item that leads to an orphaned bullet point:
#TableOfContents>ul {
padding: 0;
}
#TableOfContents>ul>li{
list-style: none;
}
A small tweak to @helmbold's solution (for the hello-friend-ng template):
#TableOfContents > ul {
padding: 0;
+ margin-left: 0;
}
#TableOfContents > ul > li {
list-style: none;
}
here's another twist on collapsing empty...
this works (with limited testing) -- no need to worry about <h1>s or CSS or anything like that
{{- $toc := .TableOfContents -}}
{{- $toc := replaceRE `<ul>\n<li>\n<ul>` `<ul>` $toc -}}
{{- safeHTML $toc -}}
I'm reopening this as I'm implementing ToC for Goldmark (the new and improved MD renderer in next Hugo).
I suggest that we fix this in a non-magic way and add some settings for toc, e.g. startLevel (inclusive, default 2) and stopLevel (inclusive, default 3).
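As a rough illustration of what inclusive startLevel and stopLevel settings would mean for a list of headings, here is a JavaScript sketch. The setting names and defaults are taken from the suggestion above; the function name, the input shape and the added `depth` field are assumptions, and this is not how Hugo implements it.

```javascript
// Keeps headings whose level lies in [startLevel, stopLevel] and records
// each one's depth relative to startLevel, so the ToC starts at depth 0.
function selectTocHeadings(headings, startLevel = 2, stopLevel = 3) {
  return headings
    .filter((h) => h.level >= startLevel && h.level <= stopLevel)
    .map((h) => ({ ...h, depth: h.level - startLevel }));
}
```

With the defaults, an h1 page title and anything at h4 or below drop out, and h2 entries become the top level of the ToC, which avoids the empty wrapper this issue is about.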
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
So I'm not sure if this is at all possible, or what the best way would be to do this, but we're having some issues using the .TableOfContents variable in templates.
There are a couple of points:
1. [...] <h1> tag per section root.
2. [...] <body>-level, <h1> tag generated in the layout template using the .Title attribute of the page. (It works better semantically, rather than having the title in two places.)
3. .TableOfContents, sensibly, only renders navigation for the headers in the actual content.
4. .TableOfContents always treats <h1> as top-level, even if there are no <h1>-level headers.
Because of this, if you comply with (1) and implement (2), and thus only have <h2> or lower headers in your content, the generated table of contents contains an empty top-level <nav> as a result of (3) and (4).
Example table of contents:
This messes with the page semantically since now the navigation has an empty top-level. The way I see it there are two ways to fix this:
[...] <h2> as top-level headers if there is no <h1> in the content.
I'm not sure if there is currently an undocumented workaround to implement either of these solutions. But if there isn't, would there be a way to allow for using either of the two solutions to achieve a more semantic table of contents?