Doc comments exported to json should not be converted to html.

flenniken commented 5 years ago

The jsondoc and jsondoc0 options generate html fragments for comments and descriptions. This is a problem because I want to generate documents formatted in other ways.

For example, to generate markdown I would first have to convert the HTML description strings back to plain text before applying my formatting.

Is it possible to output the document comments as plain text? Then I could much more easily write code to format the user-written text into my own formats.

Here is an example nim file with some document comments:

$ cat test.nim
## You use the hello module to display the hello message.
##
## This example is used to text the nim jsondoc and jsondoc0 options
## to generate json information from the source code documentation.

proc hello*(): string =
  ## Return the "Hello" string.
  ##
  ## .. code-block:: nim
  ##
  ##   import hello
  ##   hello.hello()
  ##
  result = "Hello"

Here I generate some json using the jsondoc0 option:

$ nim jsondoc0 test
Hint: used config file '/Users/steve/.choosenim/toolchains/nim-0.19.4/config/nim.cfg' [Conf]
Hint: used config file '/Users/steve/.choosenim/toolchains/nim-0.19.4/config/nimdoc.cfg' [Conf]
Hint: operation successful (1706 lines compiled; 0.001 sec total; 1.004MiB peakmem; Debug Build) [SuccessX]

Here is what is generated:

$ cat test.json
[
  {
    "comment": "<p>You use the hello module to display the hello message.</p>\n<p>This example is used to text the nim jsondoc and jsondoc0 options to generate json information from the source code documentation.</p>\n"
  },
  {
    "name": "hello",
    "type": "skProc",
    "line": 6,
    "col": 0,
    "description": "Return the &quot;Hello&quot; string.<pre class=\"listing\"><span class=\"Keyword\">import</span> <span class=\"Identifier\">hello</span>\n<span class=\"Identifier\">hello</span><span class=\"Operator\">.</span><span class=\"Identifier\">hello</span><span class=\"Punctuation\">(</span><span class=\"Punctuation\">)</span></pre>",
    "code": "proc hello*(): string"
  }
]
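To illustrate what a consumer of this output has to do today, here is a minimal sketch in Python (not part of the original report). It loads a trimmed copy of the `hello` entry shown above and strips the markup with the standard library's `html.parser`. The names `TextExtractor` and `html_to_text` are hypothetical helpers introduced only for this example.

```python
import json
from html.parser import HTMLParser

# Trimmed copy of the "hello" entry from test.json above.
# The raw string keeps the JSON escapes (\" and \n) intact for json.loads.
JSON_TEXT = r'''
[
  {
    "name": "hello",
    "type": "skProc",
    "description": "Return the &quot;Hello&quot; string.<pre class=\"listing\"><span class=\"Keyword\">import</span> <span class=\"Identifier\">hello</span>\n<span class=\"Identifier\">hello</span><span class=\"Operator\">.</span><span class=\"Identifier\">hello</span><span class=\"Punctuation\">(</span><span class=\"Punctuation\">)</span></pre>"
  }
]
'''


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and drops every tag."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &quot; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment: str) -> str:
    """Return the text content of an HTML fragment with all markup removed."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


entries = json.loads(JSON_TEXT)
for entry in entries:
    print(entry["name"], "->", repr(html_to_text(entry["description"])))
```

The recovered text is `Return the "Hello" string.import hello\nhello.hello()`: the blank line before the code block and the `.. code-block:: nim` directive are no longer present, so the consumer cannot tell where the prose ends and the code begins without inspecting the tags.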

Here is what I think should be generated:

$ cat test.want
[
  {
    "comment": "You use the hello module to display the hello message.\nThis example is used to text the nim jsondoc and jsondoc0 options to generate json information from the source code documentation.\n"
  },
  {
    "name": "hello",
    "type": "skProc",
    "line": 6,
    "col": 0,
    "description": "Return the \"Hello\" string.\n\n.. code-block:: nim\n\n  import hello\n  hello.hello()\n",
    "code": "proc hello*(): string"
  }
]

Araq commented 5 years ago

And then you parse the RST yourself? How is that better than parsing the HTML?

flenniken commented 5 years ago

Say I want to support AsciiDoc in Nim doc comments, or GitHub markdown, or something else: all of these start from plain text, and code already exists to parse and format from that. Also, the structure captured in the HTML is not the structure I want.

I think it would be cool to write some code that generates the complete documentation for a project including the index. I have some ideas that might make it better than what's currently available. To do that I need more control over the formatting and I would like to work in plain text, as written in the source, until the end when it gets converted to one or more final output formats.

Araq commented 5 years ago

I think processing the HTML is far easier to handle than .. code-block:: nim with its indentation. HTML to plain text is easy to do on your side. Converting <b>text</b> to either text or **text** is simple.
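The `<b>text</b>` conversion mentioned above could look roughly like the following Python sketch, again built on the standard library's `html.parser`. `InlineConverter` and `convert` are hypothetical names, and only `<b>` and `<strong>` are handled; any other tag is simply dropped.

```python
from html.parser import HTMLParser


class InlineConverter(HTMLParser):
    """Hypothetical helper: maps <b>/<strong> to ** or drops them."""

    BOLD_TAGS = ("b", "strong")

    def __init__(self, keep_emphasis: bool):
        super().__init__(convert_charrefs=True)
        self.keep_emphasis = keep_emphasis
        self.out = []

    def handle_starttag(self, tag, attrs):
        if self.keep_emphasis and tag in self.BOLD_TAGS:
            self.out.append("**")

    def handle_endtag(self, tag):
        if self.keep_emphasis and tag in self.BOLD_TAGS:
            self.out.append("**")

    def handle_data(self, data):
        self.out.append(data)


def convert(fragment: str, keep_emphasis: bool) -> str:
    """Convert an HTML fragment to plain text or markdown-style bold."""
    parser = InlineConverter(keep_emphasis)
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(convert("<b>text</b>", keep_emphasis=False))  # text
print(convert("<b>text</b>", keep_emphasis=True))   # **text**
```

This covers inline emphasis only. Block-level output such as the `<pre class="listing">` element with its highlighting spans would need additional handling.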

flenniken commented 5 years ago

The simple example I gave was meant to illustrate that the comments and descriptions appear in the JSON as HTML, since it's not clear from the manual that it should behave this way.

I was planning to pass the plain text to existing libraries to do the final format conversion. So in this case it doesn't require any parsing on my part.

In the general case it is not possible to convert the HTML back to the original plain text without losing information. Even in the simple case above, the comment cannot be round-tripped without losing a line break.

To round-trip the HTML in the general case I would have to reverse engineer all the possible conversions made by the reStructuredText conversion, which isn't very appealing.
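The lost line break can be checked with a short Python sketch. `SOURCE` is the module doc comment from test.nim with the `## ` prefixes removed, and `GENERATED` is the `comment` value from test.json. The regex-based `strip_tags` is a deliberately naive, hypothetical helper that is adequate only for this small fragment.

```python
import html
import re

# Module doc comment as written in test.nim (comment prefixes removed).
SOURCE = (
    "You use the hello module to display the hello message.\n"
    "\n"
    "This example is used to text the nim jsondoc and jsondoc0 options\n"
    "to generate json information from the source code documentation.\n"
)

# The "comment" value that jsondoc0 produced for the same text.
GENERATED = (
    "<p>You use the hello module to display the hello message.</p>\n"
    "<p>This example is used to text the nim jsondoc and jsondoc0 options "
    "to generate json information from the source code documentation.</p>\n"
)


def strip_tags(fragment: str) -> str:
    """Naive tag stripper: removes <...> runs, then decodes entities."""
    return html.unescape(re.sub(r"<[^>]+>", "", fragment))


recovered = strip_tags(GENERATED)

# The author's line break inside the second paragraph became a space,
# so the original text cannot be reconstructed from the HTML alone.
print(recovered == SOURCE)          # False
print("options\nto" in SOURCE)      # True
print("options\nto" in recovered)   # False
```

The paragraph boundaries survive as `<p>` elements, but the position of the line break inside the second paragraph does not, which is the information loss described above.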

exelotl commented 2 years ago

fwiw I think having an option to output the raw RST would be very useful, as it would make it easier to use Nim with other RST-based documentation tools (e.g. Sphinx)

Araq commented 2 years ago

You already have significant possibilities via modifying nimdoc.cfg btw. I used it successfully to write slide shows and entire (currently unpublished) books.

ringabout commented 1 year ago

Duplicate of https://github.com/nim-lang/Nim/issues/21928

nim-lang / Nim

Doc comments exported to json should not be converted to html. #10696