seikichi / restructured

Pure JavaScript reStructuredText parser

can't parse nested RST inside directive #5

Open mbroadst opened 7 years ago

mbroadst commented 7 years ago

Given the following example RST:

.. method:: db.auth()

   Allows a user to authenticate to the database from within the
   shell.

   The :method:`db.auth()` method can accept either:

   - the username and password.

     .. code-block:: javascript

        db.auth( <username>, <password> )

   - a user document that contains the username and password, and
     optionally, the authentication mechanism and a digest password
     flag.

   .. include:: /includes/apiargs/method-db.auth-param.rst

The current output from restructured is:

mbroadst@gorgor:~/Development/node/restructured (master %)$ ./bin/restructured.js < ./test.rst | jq
{
  "type": "document",
  "children": [
    {
      "type": "directive",
      "directive": "method",
      "children": [
        {
          "type": "text",
          "value": "db.auth()"
        },
        {
          "type": "text",
          "value": "Allows a user to authenticate to the database from within the"
        },
        {
          "type": "text",
          "value": "shell."
        },
        {
          "type": "text",
          "value": "The :method:`db.auth()` method can accept either:"
        },
        {
          "type": "text",
          "value": "- the username and password."
        },
        {
          "type": "text",
          "value": "  .. code-block:: javascript"
        },
        {
          "type": "text",
          "value": "     db.auth( <username>, <password> )"
        },
        {
          "type": "text",
          "value": "- a user document that contains the username and password, and"
        },
        {
          "type": "text",
          "value": "  optionally, the authentication mechanism and a digest password"
        },
        {
          "type": "text",
          "value": "  flag."
        },
        {
          "type": "text",
          "value": ".. include:: /includes/apiargs/method-db.auth-param.rst"
        }
      ]
    }
  ]
}

It incorrectly assumes that everything inside the directive is plain text, so it never parses the RST nested in the directive body.
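To illustrate what parsing the directive body "deeply" would mean, here is a minimal sketch. It is a toy, not restructured's pegjs grammar: `parseBlock` and `indentOf` are hypothetical names, only directives and plain lines are recognised (no lists, no inline markup), and the `argument` field is an assumption of the sketch. The point is only the recursion: the body is dedented and fed back into the same parser.

```javascript
// Toy sketch, NOT restructured's grammar: recognises only directives and
// plain text lines, to show recursion into a directive's body.
const DIRECTIVE = /^\.\. ([A-Za-z][\w-]*):: ?(.*)$/;

// Number of leading spaces on a line.
function indentOf(line) {
  return line.match(/^ */)[0].length;
}

function parseBlock(lines) {
  const nodes = [];
  let i = 0;
  while (i < lines.length) {
    const line = lines[i];
    const match = DIRECTIVE.exec(line);
    if (!match) {
      // Anything that is not a directive is kept as a text node; blank
      // lines are dropped.
      if (line.trim() !== '') nodes.push({ type: 'text', value: line });
      i += 1;
      continue;
    }
    // Collect the body: following lines that are blank or indented.
    const body = [];
    i += 1;
    while (
      i < lines.length &&
      (lines[i].trim() === '' || indentOf(lines[i]) > 0)
    ) {
      body.push(lines[i]);
      i += 1;
    }
    // Strip the indentation shared by all non-blank body lines.
    const indents = body.filter((l) => l.trim() !== '').map(indentOf);
    const common = indents.length ? Math.min(...indents) : 0;
    nodes.push({
      type: 'directive',
      directive: match[1],
      argument: match[2],
      // Recurse: the dedented body goes through the same parser.
      children: parseBlock(body.map((l) => l.slice(common))),
    });
  }
  return nodes;
}
```

With this shape, an `include` directive nested inside `method` comes out as a `directive` node among `method`'s children rather than as a line of text.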

If I remove the following line, then the output seems more logical:

mbroadst@gorgor:~/Development/node/restructured (master *%)$ ./bin/restructured.js < ./test.rst | jq
{
  "type": "document",
  "children": [
    {
      "type": "directive",
      "directive": "method",
      "children": [
        {
          "type": "text",
          "value": "db.auth()"
        }
      ]
    },
    {
      "type": "block_quote",
      "children": [
        {
          "type": "paragraph",
          "children": [
            {
              "type": "text",
              "value": "Allows a user to authenticate to the database from within the\n"
            },
            {
              "type": "text",
              "value": "shell.\n"
            }
          ]
        },
        {
          "type": "paragraph",
          "children": [
            {
              "type": "text",
              "value": "The "
            },
            {
              "type": "interpreted_text",
              "role": "method",
              "children": [
                {
                  "type": "text",
                  "value": "db.auth()"
                }
              ]
            },
            {
              "type": "text",
              "value": " method can accept either:\n"
            }
          ]
        },
        {
          "type": "bullet_list",
          "bullet": "-",
          "children": [
            {
              "type": "list_item",
              "children": [
                {
                  "type": "paragraph",
                  "children": [
                    {
                      "type": "text",
                      "value": "the username and password.\n"
                    }
                  ]
                },
                {
                  "type": "directive",
                  "directive": "code-block",
                  "children": [
                    {
                      "type": "text",
                      "value": "javascript"
                    }
                  ]
                },
                {
                  "type": "block_quote",
                  "children": [
                    {
                      "type": "paragraph",
                      "children": [
                        {
                          "type": "text",
                          "value": "db.auth( <username>, <password> )\n"
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "type": "list_item",
              "children": [
                {
                  "type": "paragraph",
                  "children": [
                    {
                      "type": "text",
                      "value": "a user document that contains the username and password, and\n"
                    },
                    {
                      "type": "text",
                      "value": "optionally, the authentication mechanism and a digest password\n"
                    },
                    {
                      "type": "text",
                      "value": "flag.\n"
                    }
                  ]
                }
              ]
            }
          ]
        },
        {
          "type": "directive",
          "directive": "include",
          "children": [
            {
              "type": "text",
              "value": "/includes/apiargs/method-db.auth-param.rst"
            }
          ]
        }
      ]
    }
  ]
}

However, that's because it's just returning null and bailing out of processing the DirectiveBlock (I think; sorry, I haven't used pegjs before and it's been a bit confusing to read through).

mbroadst commented 7 years ago

Unfortunately, while that trick sort of works, it fails in more complicated cases. For instance, a nested directive such as:

.. method:: db.auth()

   Allows a user to authenticate to the database from within the
   shell.

   - a user document that contains the username and password, and
     optionally, the authentication mechanism and a digest password
     flag.

     .. code-block:: javascript

        db.auth( {
           user: <username>,
           pwd: <password>,
           mechanism: <authentication mechanism>,
           digestPassword: <boolean>
        } )

results in most of the code-block directive ending up in an unknown bucket. I tried a number of ideas, including adding a rule for the DirectiveBlock to support BodyElement+ / FailbackIndent$, but this had no effect. I'm a bit confused by the multitude of indentation rules here.

@seikichi Do you have any thoughts on how best to move forward?
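
For anyone reproducing this, a small helper can make the "unknown bucket" easier to spot in the JSON output. This is a sketch that assumes only the `type` / `children` node shape visible in the dumps above; `countNodeTypes` is a hypothetical name, not part of restructured, and the `unknown` type name in the example is an assumption based on the description above.

```javascript
// Hypothetical helper, not part of restructured: tally how often each node
// type occurs in a parsed tree of { type, children } nodes.
function countNodeTypes(node, counts = {}) {
  counts[node.type] = (counts[node.type] || 0) + 1;
  // Leaf nodes (e.g. text) have no children array.
  for (const child of node.children || []) {
    countNodeTypes(child, counts);
  }
  return counts;
}

// Example tree in the same shape as the output above; "unknown" is an
// assumed type name used here for illustration only.
const sample = {
  type: 'document',
  children: [
    {
      type: 'directive',
      directive: 'method',
      children: [{ type: 'text', value: 'db.auth()' }],
    },
    { type: 'unknown', children: [{ type: 'text', value: 'user: <username>,' }] },
  ],
};

console.log(countNodeTypes(sample));
```

Piping the CLI output through something like this shows at a glance whether a change to the grammar reduced the number of unrecognised nodes.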