Open kevin-lee opened 9 years ago
I have the same problem. For example:
# Some Header (TBD)
# Some :
@stavytskyi That's fixed in #162. It's just not released yet. I hope it will happen soon. <= WRONG! I'm sorry.
@Kevin-Lee thank you for the information. I hope the release will be soon.
@stavytskyi Oops! Sorry, I was wrong. I thought this issue was #159. Unfortunately, this one is not fixed yet.
I believe this mostly works with the new EXTANCHORLINKS option, though all non-alphanumeric characters in the header are just ignored for the anchor name (unlike what GitHub does).
@Kevin-Lee, the question I have is how do you know what the generated anchor link is like so you can link to it in markdown, without looking at the generated HTML?
If you can't easily predict the generated anchor then it is quite useless except for making the header go to top of page when you click on the header. Seems like not much use out of that.
However, these anchor links are useful for creating your own links within the page, but that means you need to know what that anchor will be based on the header.
I for one would not care for the fancy algo that I can't predict without looking at the HTML.
@vsch You can use a preview feature to get the link. Sometimes you need to check what it actually is, for instance if a header contains characters that are invalid in a URL. I think remembering the rules for which characters are valid in a header is harder than using the preview to check what the actual link is.
I am working on this right now and, as far as I have experimented with GitHub, the logic is simple:
-#
where # is 1...., for subsequent references. No attempt is made to eliminate conflicts with future headers that may clash. For example, headers appearing in the order given with text: abcd, abcd, abcd, abcd-1
will generate references to: abcd, abcd-1, abcd-2, abcd-1
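The numbering rule in that example can be sketched roughly as follows. This is only an illustration of the behaviour described here, not pegdown's or GitHub's actual code; the class name HeaderIdCounter and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of pegdown): numbers repeated header ids in document order. */
final class HeaderIdCounter {
    // How many times each base id has been seen so far.
    private final Map<String, Integer> seen = new HashMap<>();

    /** Returns baseId on first use, then baseId-1, baseId-2, ... for repeats. */
    String next(String baseId) {
        int repeats = seen.merge(baseId, 1, Integer::sum) - 1;
        // Deliberately no check against ids that a later header produces
        // literally (such as "abcd-1"), matching the behaviour described above.
        return repeats == 0 ? baseId : baseId + "-" + repeats;
    }

    /** Assigns ids to a whole list of base ids, in order. */
    static List<String> assign(List<String> baseIds) {
        HeaderIdCounter counter = new HeaderIdCounter();
        List<String> ids = new ArrayList<>();
        for (String baseId : baseIds) {
            ids.add(counter.next(baseId));
        }
        return ids;
    }
}
```

Feeding it abcd, abcd, abcd, abcd-1 yields abcd, abcd-1, abcd-2, abcd-1, so the last header collides with the second one, which is the clash mentioned above.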
I already made changes to pegdown to make it easier to extend the ToHtmlSerializer without needing to re-implement a big chunk of it. The ref ids get computed at the topmost RootNode visit by calling a member function you can override in your implementation. If not overridden you get the current behaviour, i.e. regression tests pass. This way, out of the box, pegdown can be customized without needing to play with the parser or the HtmlSerializer.
Here are the notes I am adding to CHANGELOG since I won't have time to update the Javadocs. I hope someone else will be up to it, or users can refer to the source, which is always the most reliable reference. ;)
NOTE: these changes are backwards compatible if you don't override the new functions or you can override them to customize the HTML output or leave it as is to get the old behaviour.
- Added preview(Node node, String tag, Attributes attributes, boolean tocGenerationVisit) to ToHtmlSerializer, called before every node that has a tag so that derived classes can modify the attributes output in the HTML. Returned attributes will be output in the order they were added to Attributes. Re-use the passed-in parameter or create a new one.
- Added Printer.preview(node, String tag, attributes, boolean tocGenerationVisit) to Printer so that the serializer's preview() can be accessed by plugins and verbatim serializers. Together with the Attributes methods you can change classes and add attributes based on tags and node parameters without rewriting the whole serializer.

  Note: tocGenerationVisit will be true if the output is for TOC rendering. You will get the same nodes once with tocGenerationVisit true and once false for nodes that are part of TOC headers. In all cases the passed-in attributes contain the default attributes as they are now rendered by ToHtmlSerializer. If you return them unmodified, you will get the output as it is now.
- Added ToHtmlSerializer.printTaskListItemMarker(Printer printer, TaskListNode node, boolean isParaWrapped) that prints the task list item marker. The default prints an input checkbox. isParaWrapped is true when the contents of the li tag are wrapped in <p></p>, just in case it makes a difference to what you want to output.
- Changed DefaultVerbatimSerializer to also print attributes returned by the call to preview() before closing the <code tag. Passed-in attributes will contain the class of the node.getType() value.
- Added String computeHeaderId(HeaderNode node, AnchorLinkNode anchorLinkNode, String headerText) to ToHtmlSerializer, called before generating any HTML in the topmost RootNode processing for all headers, depth-first traversal. Returning an empty string will output the header without an id attribute. Returning any other value will output the header with that id and, if there is an anchor link, it will also change its reference and name attribute to match the returned value. Use this to override how anchor link references are generated. An additional benefit of overriding this function is that you will always know what id to expect for the header and can generate the right reference.

This way you can create your own logic for generating the reference link ids to match your requirements by overriding only a single member of ToHtmlSerializer. Similarly, changing the way task lists are generated is also a single override.
In addition, another member override, preview(...), which passes the node, the node's tag and the current attribute set, allows you to add/remove/append a class or any other attribute. By default the class that implements Attributes can handle multiple calls to add(name, value) with the same name: it appends the value as a space-delimited list. So out of the box you can just call add("class", whateverClass) on the attributes for nodes of your choice to add extra classes.
This function is also called for VerbatimNode from the DefaultVerbatimSerializer so you can add/remove/change the class assigned to <code>.
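To make the append-on-repeat behaviour concrete, here is a small stand-in model. It is not pegdown's Attributes class; AttrsSketch, its add() and its render() are hypothetical names that only mimic the two properties described above (repeated names are joined with a space, and output follows the order in which names were first added). It does no HTML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for illustration only; this is NOT pegdown's Attributes class. */
final class AttrsSketch {
    // LinkedHashMap keeps attributes in the order their names were first added.
    private final Map<String, String> values = new LinkedHashMap<>();

    /** Adds a value; a repeated name appends to the existing value, space delimited. */
    AttrsSketch add(String name, String value) {
        values.merge(name, value, (existing, added) -> existing + " " + added);
        return this;
    }

    /** Renders the attributes as they would appear inside an opening tag (no escaping). */
    String render() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : values.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return sb.toString();
    }
}
```

For example, adding class=java, then id=x, then class=highlight to this model renders the class attribute first with both values, followed by the id.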
Apologies if this is a bit off topic
1. map: ' ' -> '-', '-' -> '-', '_' -> '_', 0..9 -> 0..9, a..z -> a..z, A..Z -> a..z, all other characters are ignored.
I found this issue on Google when trying to make an anchor link. The "all other characters are ignored" part helped me fix my issue.
[@my-package/foo](#my-packagefoo) -> # @my-package/foo
was the correct way to map it.
Thanks @vsch !
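The character map quoted in this comment can be sketched as a small function. This is a minimal illustration of that rule only, assuming the input is the header text without the leading "# " marker; AnchorSlug is a hypothetical name, not a pegdown class, and it does not do the numbering of duplicate headers.

```java
/** Hypothetical helper (not pegdown's code): applies the character map quoted above. */
final class AnchorSlug {
    static String of(String headerText) {
        StringBuilder sb = new StringBuilder(headerText.length());
        for (int i = 0; i < headerText.length(); i++) {
            char c = headerText.charAt(i);
            if (c == ' ' || c == '-') {
                sb.append('-');                       // ' ' -> '-', '-' -> '-'
            } else if (c == '_' || (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z')) {
                sb.append(c);                         // kept unchanged
            } else if (c >= 'A' && c <= 'Z') {
                sb.append((char) (c - 'A' + 'a'));    // A..Z -> a..z
            }
            // All other characters are ignored.
        }
        return sb.toString();
    }
}
```

With this map, the header text @my-package/foo gives my-packagefoo, because the @ and the / are simply dropped.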
There are two issues.
ANCHORLINKS does not work.

First Issue (No link for some special chars)
For the first issue, I've tested with all the special chars from my keyboard.
DEFINITIONS is set as well, # Some : Heading has no link (doesn't work for : colon).

Second Issue
All the links generated for all the headers I put above are exactly the same. They are #some-heading.

What GitHub Does
On GitHub, it looks like this. It all works. GitHub markdown replaces a space char with - so the link for a case like # Some - Heading becomes some---heading (triple -s) and # Some _ Heading becomes some-_-heading. If the char can't be used in the URL, it is removed, then a sequential number is added to the end so each case has a different link name.
e.g.)