Convert sphinx syntax to Markdown

NiklasRosenstein commented 9 years ago

It would be nice if Sphinx (reStructuredText) syntax could be converted so that :meth:`func_name` would be displayed as code. It does not necessarily have to link to the function, but that would be nice to have. Also, the :param name: text... and :raise ExceptionName: text... fields etc. should be handled.

Also, some use this kind of syntax:

Args:
    filename (str): The filename ...
Returns:
    str: The new filename ...
Raises:
    TypeError: Foo ...

jsmith commented 6 years ago

I have this feature implemented in a fork of master at the moment! I'm using this package (and the extra features I've implemented) to compile the documentation my team has within our repository :) It supports :param name:, :returns:, :raises name: and their aliases; however, the branch doesn't support :meth or :class yet.
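For readers who want a feel for what such a conversion involves, here is a minimal sketch. It is not the code from the fork: the alias table, the regular expression and the function name `sphinx_to_markdown` are all assumptions made for illustration, and continuation lines are simply passed through.

```python
import re

# Hypothetical alias table; the fork's real list of aliases is not shown in this thread.
SECTIONS = {
    "param": "Arguments", "parameter": "Arguments", "arg": "Arguments",
    "return": "Returns", "returns": "Returns",
    "raise": "Raises", "raises": "Raises",
}

# Matches ":field:", ":field name:" and the text that follows.
FIELD_RE = re.compile(r"^\s*:(\w+)(?:\s+([^:]+))?:\s*(.*)$")


def sphinx_to_markdown(docstring):
    """Turn Sphinx info fields into Markdown sections with bullet lists."""
    out, current = [], None
    for line in docstring.splitlines():
        match = FIELD_RE.match(line)
        section = SECTIONS.get(match.group(1)) if match else None
        if section is None:
            out.append(line)  # not a recognised field, keep as-is
            if not line.strip():
                current = None  # a blank line ends the current section
            continue
        if section != current:
            if out and out[-1].strip():
                out.append("")  # separate the heading from preceding text
            out.extend([f"**{section}**:", ""])
            current = section
        name, text = match.group(2), match.group(3)
        out.append(f"- `{name}`: {text}" if name else f"- {text}")
    return "\n".join(out)
```

Roles such as :meth: or :class: are not in the table, so lines starting with them are left untouched, which matches the limitation described above.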

I see that you are going through some heavy refactoring of develop at the moment. Will you be implementing the cross-referencing features for the default preprocessor?

NiklasRosenstein commented 6 years ago

@jacsmith21 Awesome! You are welcome to create a PR on the master branch, but I definitely want to make sure this lands in develop as well.

In develop, I still have to think about a good way of being able to use multiple preprocessors. The Sphinx preprocessor could then be separate from the standard preprocessor and both could be used at the same time.

Also in develop, the cross-references will be detected but not actually linked by the preprocessor. Currently the Preprocessor is supposed to generate another special markup for cross-references that can easily be regex-matched by the Indexer. However, I think I will use a nodal document structure in the future to avoid possible clashes caused by multiple preprocessors recognizing that special cross-reference markup as markup for their own purpose.

I won't do any work on cross-references in the current master anymore.

jsmith commented 6 years ago

@NiklasRosenstein Ok perfect I'm excited to merge everything in! I'll put up a PR in a few days once I do some more testing with my codebase.

Once you figure out the structure of the new preprocessor, I can adapt my implementation and push some changes to develop. Using multiple formats at the same time would be interesting. We would just have to detect the format of each section and use the appropriate preprocessor.

And ok that all makes sense. I'll make the appropriate enhancements to the restructuredText preprocessor when everything is in place! Let me know if there is anything I can help out with in the develop branch, I really like this project :)

NiklasRosenstein commented 6 years ago

@jacsmith21 In case you are interested, I just applied your implementation to the develop branch in c373b793c81c62c3dd1c4cb768c6803737796902. This is also just a note to myself of what's still to do.

It is not rock-solid yet. The Pydoc-Markdown preprocessor inserts CrossReference nodes into the "DOM" which would then split up the original Sphinx block and the Sphinx preprocessor would consider the text blocks around the node separately. Eg.:

"""
:param a: Pass an #int or #float.
:param b: Pass a #str.
"""

If the order is PdmPreproc and then SphinxPreproc, the Pydoc-Markdown preprocessor would receive this as a single Text node and split it up into the sequence

Text(":param a: ...")
CrossReference("int")
Text("or")
CrossReference("float")
Text(".\n:param b: ...")
CrossReference("str")
Text(".").

The Sphinx preprocessor running second would then see the four Text nodes separately and would insert the "**Arguments**:\n" text node for every Text node in which it encounters a :param ...: field.

One way to work around this is to have a separate Cross-reference parser that runs after Sphinx, but it would work in more scenarios if the preprocessor would just match over multiple Text nodes.
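To make the problem concrete, here is a small sketch. The `Text` and `CrossReference` classes below are hypothetical stand-ins for the real node types on develop, and both functions are illustrations only. The first one looks at each Text node in isolation and reproduces the duplicated header; the second one carries state across nodes so the header is inserted once.

```python
from dataclasses import dataclass


@dataclass
class Text:
    text: str


@dataclass
class CrossReference:
    target: str


HEADER = "**Arguments**:\n"

# The node sequence from the example above, after the Pydoc-Markdown preprocessor ran.
NODES = [
    Text(":param a: Pass an "), CrossReference("int"), Text(" or "),
    CrossReference("float"), Text(".\n:param b: Pass a "),
    CrossReference("str"), Text("."),
]


def header_per_node(nodes):
    """Naive: every Text node containing ':param' gets its own header."""
    out = []
    for node in nodes:
        if isinstance(node, Text) and ":param" in node.text:
            out.append(Text(HEADER))
        out.append(node)
    return out


def header_across_nodes(nodes):
    """Stateful: remember across nodes that the header was already emitted."""
    out, emitted = [], False
    for node in nodes:
        if isinstance(node, Text) and ":param" in node.text and not emitted:
            out.append(Text(HEADER))
            emitted = True
        out.append(node)
    return out
```

With `NODES` as input, the naive version inserts two headers (one for each Text node that contains a :param field), while the stateful version inserts one.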

jsmith commented 6 years ago

Thanks for the update. I'm glad it's getting merged in :) Will parameter and return types also be cross-referenced? In this example: will Document be a cross-reference?

NiklasRosenstein commented 6 years ago

will Document be a cross-reference?

Yes, the type in the parentheses is intended to be converted into a cross-reference as well. :)

jsmith commented 6 years ago

That will be amazing. So sphinx defines types like this:

:type param: str
:rtype: int

Will the SphinxPreproc be responsible for converting those to a CrossReference?

NiklasRosenstein commented 6 years ago

Yes, that would also be the responsibility of the Sphinx preprocessor. I reckon the :type param: str line should be stripped completely and the information about the type should be added to the Argument description so that it is displayed the same way as the Pydoc-Markdown preprocessor would display it.
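A rough sketch of that idea, assuming the target format is the "name (type): description" form that the Pydoc-Markdown preprocessor understands. The function name `merge_types` and both regular expressions are made up for this example and are not part of any branch.

```python
import re

TYPE_RE = re.compile(r"^\s*:type\s+(\w+):\s*(.+)$")
PARAM_RE = re.compile(r"^(\s*):param\s+(\w+):\s*(.*)$")


def merge_types(lines):
    """Drop ':type name: T' lines and fold T into the matching ':param name:' line."""
    # First pass: collect the declared types.
    types = {}
    for line in lines:
        match = TYPE_RE.match(line)
        if match:
            types[match.group(1)] = match.group(2).strip()

    # Second pass: rewrite the :param lines and strip the :type lines.
    out = []
    for line in lines:
        if TYPE_RE.match(line):
            continue
        match = PARAM_RE.match(line)
        if match:
            indent, name, text = match.groups()
            label = f"{name} ({types[name]})" if name in types else name
            out.append(f"{indent}{label}: {text}")
        else:
            out.append(line)
    return out
```

Collecting the types in a first pass means the :type line may come before or after its :param line.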

jsmith commented 6 years ago

Yeah, that would make sense. They should each have a consistent output!

florimondmanca commented 5 years ago

Hi! So what is the current situation on this issue? :)

I use PydocMd to build Markdown API reference pages from docstrings (and serve them in a Vuepress site). I was using the Numpy docstring style, so I had to adapt them to the PydocMd format, which is fine but a bit "lock-in".

So it seems to me this feature request (Sphinx syntax -> Markdown) is part of something more general, i.e. can (and should?) PydocMd support conversion from other docstring styles?

I'm not entirely sure how the processing pipeline is organized yet, but I believe building "adapters" would, in theory, be possible, as other docstring styles (Numpy, Google, reStructuredText) have a lot in common with each other and with the PydocMd format.

For example, consider the following docstrings:

Numpy:

def foo(a, b):
    """Foo a and b.

    This is a function that computes a foo combination of two integers.

    Parameters
    -----------
    a : int
        The integer A.
    b : int
        The integer B.

    Returns
    -------
    foo : int
        Foo combination of ``a`` and ``b``.

    """

Google:

def foo(a, b):
    """Foo a and b.

    This is a function that computes a foo combination of two integers.

    Args:
        a (int): The integer A.
        b (int): The integer B.

    Returns:
        int: Foo combination of ``a`` and ``b``.
    """

reStructuredText:

def foo(a, b):
    """Foo a and b.

    This is a function that computes a foo combination of two integers.

    :param a: The integer A.
    :type a: int
    :param b: The integer B.
    :type b: int
    :return foo: Foo combination of ``a`` and ``b``.
    :rtype: int
    """

All of these would map to this PydocMd version:

def foo(a, b):
    """Foo a and b.

    This is a function that computes a foo combination of two integers.

    # Arguments
    a (int): The integer A.
    b (int): The integer B.

    # Returns
    foo (int): Foo combination of `a` and `b`.
    """

In theory that seems possible, but it does look like a lot of work just to make PydocMd a bit more "plug-and-play". There are also probably a lot of edge cases to consider, such as incompatibilities between docstring styles or between a style and the PydocMd format.
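To give an idea of the size of such an adapter, here is a minimal sketch for the Google style only. The section mapping and the function name `google_to_pydocmd` are assumptions for illustration; it ignores multi-line entries and every edge case mentioned above.

```python
import inspect

# Hypothetical mapping from Google section labels to PydocMd headings.
GOOGLE_SECTIONS = {
    "Args:": "# Arguments",
    "Returns:": "# Returns",
    "Raises:": "# Raises",
}


def google_to_pydocmd(docstring):
    """Rewrite Google-style sections into PydocMd-style sections."""
    out, in_section = [], False
    # inspect.cleandoc removes the common indentation of the docstring body.
    for line in inspect.cleandoc(docstring).splitlines():
        stripped = line.strip()
        if stripped in GOOGLE_SECTIONS:
            out.append(GOOGLE_SECTIONS[stripped])
            in_section = True
        elif in_section and stripped:
            out.append(stripped)  # dedent the entries under the heading
        else:
            in_section = False  # a blank line ends the section
            out.append(line)
    return "\n".join(out)
```

The Google entries already use the "name (type): description" form, so in this simple case only the headings and the indentation change.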

Anyway, just wanted to add my two cents here. Thanks for the great work on this tool. :)

vemel commented 5 years ago

Guys, please check out my PR https://github.com/NiklasRosenstein/pydoc-markdown/pull/84

I moved code around a bit and added two new Preprocessors

GooglePreprocessor - supports google format for docstrings
SmartPreprocessor - supports both Google and RST docstrings. Actually, it just uses one or the other preprocessor

Looks like SimplePreprocessor is broken as it uses missing link_lookup method, so I did not add it. Numpy should be easy to add based on my GooglePreprocessor implementation.
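The "use one or the other" choice could be made with a heuristic along these lines. This is only a sketch and not the code from the PR: the patterns and the name `detect_style` are assumptions.

```python
import re

# An RST info field at the start of a line, e.g. ":param a:" or ":rtype:".
RST_FIELD = re.compile(r"^\s*:(param|type|returns?|rtype|raises?)\b", re.MULTILINE)
# A Google section label alone on its line, e.g. "Args:".
GOOGLE_SECTION = re.compile(r"^\s*(Args|Arguments|Returns|Raises):\s*$", re.MULTILINE)


def detect_style(docstring):
    """Guess which preprocessor should handle the docstring."""
    if RST_FIELD.search(docstring):
        return "rst"
    if GOOGLE_SECTION.search(docstring):
        return "google"
    return "plain"
```

A docstring that mixes both styles is classified as "rst" here, simply because that pattern is checked first.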

NiklasRosenstein commented 5 years ago

Hey guys, sorry that I've been mute on this repo in the past months. I'm planning to work more on Pydoc-markdown again. @florimondmanca I like your suggestion, and @vemel thanks for your PR!

What SimplePreprocessor are you talking about though? 😕

vemel commented 5 years ago

@NiklasRosenstein Sorry, it is the basic preprocessor :) https://github.com/NiklasRosenstein/pydoc-markdown/blob/master/pydocmd/preprocessor.py#L112 - Preprocessor.link_lookup is not defined and I am not fully sure what it should be.

NiklasRosenstein commented 5 years ago

Gotcha! It was added in this PR by @paradoxxxzero. Honestly I did not review this PR well enough as I had not noticed that it was not initializing link_lookup within the class (it's a dict mapping which reference can be found in which file).

No worries, references will be resolved at render time in Pydoc-markdown 3.x and do not need to be implemented in the preprocessor (it only needs to mark sections of text as a reference).
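As an illustration of resolving at render time, here is a sketch that takes text with #name markers and a lookup dict mapping names to files. The link format (file plus "#name" anchor), the regular expression and the function name `render_references` are assumptions, not the 3.x implementation.

```python
import re

# "#name" or "#pkg.mod.name"; a trailing period is not part of the name.
REF_RE = re.compile(r"#([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def render_references(text, link_lookup):
    """Replace #name markers with Markdown links where the target file is known."""

    def replace(match):
        name = match.group(1)
        target = link_lookup.get(name)
        if target is None:
            return f"`{name}`"  # unknown reference: render as plain code
        return f"[`{name}`]({target}#{name})"  # assumed anchor scheme

    return REF_RE.sub(replace, text)
```

Markdown headings such as "# Arguments" are left alone because the marker must be followed directly by an identifier.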

vemel commented 5 years ago

Sooo, my suggestion:

Add BasePreprocessor
Split GooglePreprocessor and PEP257Preprocessor, both inherited from BasePreprocessor; they should only define the keywords map and param regexps
Adapt SmartPreprocessor
Probably add NumpyPreprocessor and support it in SmartPreprocessor

NiklasRosenstein commented 5 years ago

Hey @vemel,

Splitting the PEP257Preprocessor away from GooglePreprocessor sounds sane, but the differences appear to be minor so I'd be fine with the same processor being able to handle both.

I'll keep focusing on 3.x though so it'd be awesome if we could have additions like the Numpy preprocessor on the develop branch 😀 (I'll port your latest additions to 3.x soon and you could take them as a reference for how to implement the Numpy preprocessor?)

Thanks for your support! Cheers,

vemel commented 5 years ago

@NiklasRosenstein sounds awesome! BTW, feel free to include lazy type annotations support to 3.x https://github.com/NiklasRosenstein/pydoc-markdown/issues/86 should be easy to add and I checked that it is working on a relatively big project with no issues.

tmbdev commented 4 years ago

Any progress on this? Handling ":param:" is pretty important. Right now, I don't see any way of preventing parameter lists from flowing into an unreadable paragraph short of adding bullet points.

NiklasRosenstein commented 4 years ago

Hey @tmbdev you should be able to use the RstPreprocessor for this in pydoc-markdown 2.1.0. Can't give it a try right now, but if you have a specific issue with this can you elaborate?

Just closing this issue btw. because, as far as I can see, the original issue was addressed (The RstPreprocessor in Pydoc-markdown 2.1.0, and there is the respective functionality on the develop branch as well).

Please feel free to comment here if you are having issues using the RstPreprocessor.

NiklasRosenstein commented 4 years ago

Hey @tmbdev , here's an example

$ cat <<EOF >a.py
def func(a, *args, key=None, **kwargs):
  """
  A function.

  :param a: Some argument.
  :param args: Another argument.
  :param key: Key.
  :param kwargs: Arbitrary keyword args.
  :return: Foobar
  """
EOF
$ cat <<EOF >pydocmd.yml
generate:
  - index.md: a+
pages: []
preprocessor: pydocmd.preprocessors.rst.Preprocessor
EOF
$ pydocmd generate
$ cat _build/index.md
<h1 id="a">a</h1>

<h2 id="a.func">func</h2>

```python
func(a, *args, key=None, **kwargs)
```

A function.

Arguments:

a: Some argument.
args: Another argument.
key: Key.
kwargs: Arbitrary keyword args.

Returns:

Foobar
