Closed drrajeshtalluri closed 4 years ago
If I were to add this kind of feature to the package, then avoiding foreign data structures (JSON) that mirror the document structure would be ideal. Here's a quick sketch of how I'd do this:
function toc(ast::CommonMark.Node)
    io = IOBuffer()
    for (node, enter) in ast
        if enter && node.t isa CommonMark.Heading
            t = node.t
            node.t = CommonMark.Paragraph()
            link = string("[", rstrip(markdown(node)), "](#", node.meta["id"], ")")
            node.t = t
            println(io, " "^(t.level-1), " * ", link)
        end
    end
    return Parser()(seekstart(io))
end
This just builds a nested markdown list with embedded links to the headings. The list mirrors the levels of the headers themselves and keeps any text formatting found in them. It won't be the most efficient way to do it, since we're printing raw markdown to a buffer and then reparsing, but it's definitely the simplest and most understandable approach I can come up with.
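To make the "print a nested markdown list, then reparse it" idea concrete without depending on the package, here is a minimal sketch that only covers the printing step. The `headings` tuples are a hypothetical stand-in for what would be collected from the AST, the name `toc_markdown` is made up for illustration, and the two-space indent per level is an assumption chosen so that each item nests under its parent's `* ` marker.

```julia
# Hypothetical stand-in for headings gathered from the AST: (level, text, id).
headings = [
    (1, "Intro", "intro"),
    (2, "Setup", "setup"),
    (2, "Usage", "usage"),
    (1, "API", "api"),
]

# Print one list item per heading, indented according to its level.
# `indent` is the number of spaces added per level (an assumption, not
# something taken from the package).
function toc_markdown(headings; indent = 2)
    io = IOBuffer()
    for (level, text, id) in headings
        println(io, " "^(indent * (level - 1)), "* [", text, "](#", id, ")")
    end
    return String(take!(io))
end

print(toc_markdown(headings))
# * [Intro](#intro)
#   * [Setup](#setup)
#   * [Usage](#usage)
# * [API](#api)
```

The resulting string is plain markdown, so it can be handed to a parser in the same way the sketch above hands its buffer to `Parser()`.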
With regards to your implementation:
hstr = string(node.t)
hlvl = match(r"CommonMark.Heading\((.)\)", hstr)
Just use node.t.level to get the Int describing the header level rather than using a regex.
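As a small illustration of reading the level directly, the sketch below collects the level of every heading in a document. It only uses the calls that already appear in this thread (`Parser()`, iterating the AST as `(node, enter)` pairs, and `node.t.level`); the helper name `heading_levels` and the sample text are made up for the example.

```julia
using CommonMark

# Collect the level of each heading by reading the field on the node type,
# rather than matching against the printed form of `node.t`.
function heading_levels(ast)
    levels = Int[]
    for (node, enter) in ast
        if enter && node.t isa CommonMark.Heading
            push!(levels, node.t.level)
        end
    end
    return levels
end

ast = Parser()("# Title\n\n## Section\n\nSome text.\n")
levels = heading_levels(ast)
```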
title_str = ""
for (node, enter) in node
    if enter && node.t isa CommonMark.Text
        title_str *= node.literal
    end
end
This'll strip the formatting, which I assume you're happy with. If not, then you'd need to either print to a specific format to store, or serialise the node to JSON.
Thanks so much! This is a great idea. I was trying to reconnect the heading nodes together by using your defined node type but could not figure out how to exactly do that. But the code you provided is even better as it gives the toc in ul/li form which is what is needed for this package, and is what is provided in pandoc and similar parsers. I just needed a hierarchical ast to parse, which we get from your code. Thank you very much for helping me out.
If I may ask about an unrelated feature, I saw that commonmark supports custom tag names. How would I go about changing the tag names and attributes for the nodes in the ast? For example, in the generated toc, what if I wanted to change the tag name of the <ul> </ul> node to a custom <u-list> </u-list>? Is there a way to accomplish this by modifying the node information? I could not figure out if there is a custom node type where we can define the tag name and attributes for the node.
> If I may ask about an unrelated feature, I saw that commonmark supports custom tag names.
Could you point me at which implementations have custom tag support? I've not implemented that with CommonMark.jl since I didn't notice it when originally porting from commonmark.js. The ul tag and all others are currently hard-coded into the output functions and so can't be replaced.
One thing you could do to get part-way there is to attach some class attributes to the root of the list when generating it.
{.toc}
- first item
This attaches a CSS class toc to the outermost list in the table of contents. You can then target that class with your custom CSS and JS. That's the route I've been taking in https://github.com/MichaelHatherly/Publish.jl for customising the generic markdown elements, and it's been working pretty well so far.
(Or have I misread your question completely?)
I think the tags are only supported in raw html.
6.8 Raw HTML Text between < and > that looks like an HTML tag is parsed as a raw HTML tag and will be rendered in HTML without escaping. Tag and attribute names are not limited to current HTML tags, so custom tags (and even, say, DocBook tags) may be used.
Custom classes work well to target JS and CSS. I was just wondering whether, if we had the ability to define custom nodes, we could avoid javascript later on by creating the custom html structure directly in commonmark.
I was just thinking about how to extend the spec for new elements. Instead of predefining CommonMark.Type for each new type we could have a CommonMark.CustomNode type, with additional type information in the node meta field. We could have a general writer targeted for html or latex for these custom types. Just like the ability to add classes I thought it would be cool to add tags. I do not know if this fits in this package as this is following the Commonmark spec. Just thought I would ask and get your thoughts.
There is an undocumented feature of AttributeRule that could be used for something along these lines:
julia> p = enable!(Parser(), AttributeRule())
Parser(Node(CommonMark.Document))
julia> text =
"""
{:ul-list}
- one
- two
- three
"""
"{:ul-list}\n - one\n - two\n - three\n"
julia> ast = p(text)
● one
● two
● three
julia> ast.first_child.nxt.meta
Dict{String,Any} with 1 entry:
"element" => "ul-list"
The shorthand attribute syntax {:name} adds element=name metadata to the node. This could be hooked up to the output writers to customise the resulting node types, I guess. It's relatively lightweight syntax that avoids inventing a brand new syntax for each custom element type.
Not too sure though, hence why it's remained undocumented.
Thanks, this is perfect, I will try to use the AttributeRule. In regards to this issue, as you already created the function toc to generate the table of contents, you can close this issue.
If possible, an example of how to use toc could be helpful for people who want this feature:
body = html(ast)
toc_html = html(toc(ast))
content = "<head></head><body><div>$toc_html</div>$body</body>"
Thanks so much for your help!
Hi, I have been trying to implement a toc (table of contents) feature using the ast generated from Commonmark. However, my implementation is too convoluted and messy. It would really help if I can get any feedback on better ways to create the toc and if I could contribute this in any way to this excellent package.
The logic of my implementation is to create a hierarchical JSON data structure from the ast created by CommonMark. I wanted to have it in JSON, as we can use the JSON data in javascript to create different types of tocs at multiple levels on a webpage. If it is just an HTML ul/li list, then there may be better approaches.
This is the function I wrote to create a hierarchical dict:
Example usage
Any feedback on implementation or other alternatives is appreciated. Thank You!
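For readers who do want the JSON-style hierarchy described in this post, here is a minimal sketch of one way to nest a flat sequence of headings using a stack. It is not the function from the original post, which is not shown here. The name `nest_headings`, the `(level, title)` input, and the `"title"`/`"level"`/`"children"` keys are all assumptions made for illustration.

```julia
# Hypothetical input: (level, title) pairs in document order.
# Returns a nested Dict tree rooted at an artificial level-0 entry.
function nest_headings(headings)
    root = Dict{String,Any}("title" => "root", "level" => 0, "children" => Any[])
    stack = [root]
    for (level, title) in headings
        entry = Dict{String,Any}("title" => title, "level" => level, "children" => Any[])
        # Pop until the top of the stack is a shallower heading: that is the parent.
        # The root has level 0, so it is never popped for levels >= 1.
        while stack[end]["level"] >= level
            pop!(stack)
        end
        push!(stack[end]["children"], entry)
        push!(stack, entry)
    end
    return root
end

tree = nest_headings([(1, "A"), (2, "B"), (2, "C"), (1, "D")])
```

Because the result is built only from `Dict`s, vectors, strings, and integers, it should be straightforward to hand to a JSON library of your choice.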