ocaml / omd

extensible Markdown library and tool in "pure OCaml"
ISC License
156 stars 45 forks source link

Add markdown printer #265

Open yasunariw opened 2 years ago

yasunariw commented 2 years ago

See #231.

This PR adds:

  1. A printer that transforms the AST back into Markdown text (Omd.to_markdown)
  2. A CLI flag --mode to use Markdown instead of HTML as the output mode.

The printer that existed in the 1.3.1 version was removed, so this PR is a rewrite for the 2.0 AST. The add_string_escape_chars function is reused from the previous version by @pw374 .

So far, its behavior has been tested against manually constructed examples that cover all parts of the syntax. Please let me know if there is a robust way to add tests to the current setup. Thanks!

patricoferris commented 2 years ago

Cool feature @yasunariw !

I worked on something similar and you can see my findings here: https://github.com/ocaml/omd/issues/223#issuecomment-893821434. I think the conversion from string -> Omd.doc losses information so it is quite tricky to implement a to_markdown function. For example, there are some corner cases such as:

utop # Omd.of_string "_*hello*_" |> Omd.to_markdown;;
- : string = "**hello**\n"

Note this is after pinning e60a510a54da4c3ed63da34fe6a35f6287570989. Here, it is valid markdown to nest emphasis using the two different characters, but after using the Omd.to_markdown function it has been transformed into strong text.

What might be useful is the (very hacked together) testsuite I was running here: https://github.com/patricoferris/omd/tree/omd-print/tests. The idea was to parse the markdown normally and get the HTML, then use the to_markdown function and parse that output to HTML and compare the two HTML outputs. It is probably not full-proof but it was how I started finding these edge cases.

yasunariw commented 2 years ago

Thanks for all the pointers @patricoferris ! I'll see what else I catch using your test method and report back.

shonfeder commented 2 years ago

Thanks for the work on this important feature, @yasunariw and @patricoferris.

Please let me know if there is a robust way to add tests to the current setup. Thanks!

The most robust way I can think of is via some property based testing. If we had generator that constructed valid arbitrary Omd.doc values, then we could check that forall x : Omd.doc, (x |> Omd.to_markdown |> Omd.of_string) = x. This should catch the kind of edge case that @patricoferris has pointed out, iiuc.

My inclination is to suggest that we should expect any mergable solution to this feature to depend on the solution implemented for #223.