nextjournal / markdown

A cross-platform clojure/script parser for Markdown
ISC License

Drop default parsing of hashtags and internal links #14

Closed zampino closed 1 year ago

zampino commented 1 year ago

This changes the default behaviour of nextjournal.markdown/parse with respect to custom text tokenization: hashtags (e.g. #tag) and internal links (e.g. [[link]]) are no longer parsed by default. Users can opt in to them with:


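(require '[nextjournal.markdown :as md]
         '[nextjournal.markdown.parser :as md.parser])

;; Opt back in by adding both tokenizers to the empty doc's :text-tokenizers.
(md/parse (update md.parser/empty-doc :text-tokenizers conj
                  md.parser/internal-link-tokenizer
                  md.parser/hashtag-tokenizer)
          "some [[set]] of [[wiki]] links with #hashtags")
;; =>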
{:type :doc,
 :content [{:type :paragraph,
            :content [{:type :text, :text "some "}
                      {:type :internal-link, :text "set"}
                      {:type :text, :text " of "}
                      {:type :internal-link, :text "wiki"}
                      {:type :text, :text " links with "}
                      {:type :hashtag, :text "hashtags"}]}],
 :toc {:type :toc},
 :footnotes []}

See also this notebook.

There has never been an actual agreement on what exactly hashtags and internal links should link to when transformed into hiccup.

Tokenizers can also carry a predicate that specifies in which situations they should be applied, see: https://github.com/nextjournal/markdown/blob/f1e71e08217aa587493916406b0795d733bb2c73/src/nextjournal/markdown/parser.cljc#L492-L495
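
To illustrate the idea of a predicate-guarded tokenizer, here is a minimal standalone sketch. It does not use the library and is not its implementation: the map keys (`:regex`, `:handler`, `:pred`), the `tokenize` function, and the `:parent-type` context key are all hypothetical names chosen for this example. It relies on JVM regex interop, so it runs on Clojure but not ClojureScript as written.

```clojure
;; Hypothetical tokenizer shape: a regex, a handler that turns a match
;; into a node, and an optional predicate over some parsing context.
(def hashtag-tokenizer
  {:regex   #"#(\w+)"
   :handler (fn [match] {:type :hashtag :text (match 1)})
   ;; Assumed rule for this sketch: do not tokenize hashtags inside links.
   :pred    (fn [ctx] (not= :link (:parent-type ctx)))})

(defn tokenize
  "Splits `text` into nodes. The tokenizer is applied only when its
  predicate accepts `ctx`; otherwise the text is returned untouched."
  [{:keys [regex handler pred] :or {pred (constantly true)}} ctx text]
  (if-not (pred ctx)
    [{:type :text :text text}]
    (let [m (re-matcher regex text)]
      (loop [idx 0, nodes []]
        (if (.find m)
          (let [start (.start m)
                end   (.end m)
                match (re-groups m)]
            (recur end
                   (cond-> nodes
                     ;; keep any plain text preceding the match
                     (< idx start) (conj {:type :text :text (subs text idx start)})
                     true          (conj (handler match)))))
          ;; keep any plain text following the last match
          (cond-> nodes
            (< idx (count text)) (conj {:type :text :text (subs text idx)})))))))

(tokenize hashtag-tokenizer {:parent-type :paragraph} "links with #hashtags")
;; => [{:type :text, :text "links with "} {:type :hashtag, :text "hashtags"}]

(tokenize hashtag-tokenizer {:parent-type :link} "links with #hashtags")
;; => [{:type :text, :text "links with #hashtags"}]
```

Keeping the predicate on the tokenizer itself means the decision of where a tokenizer applies lives next to its regex and handler, rather than being hard-coded in the parser.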