jupyter / papyri


Use the doctest module in get_example_data #308

Closed asmeurer closed 8 months ago

asmeurer commented 11 months ago

Fixes #282

Still several todos here:

Here's an example:

def docstring(x):
    """
    Examples
    ========

    >>> from test_mod import docstring
    >>> a = docstring(1)
    >>> a
    2

    >>> 1 + a
    3

    >>> import matplotlib.pyplot as plt
    >>> plt.plot([0, 1], [0, 1])
    >>> plt.show()

    >>> 1 + 1
    2

    >>> syntax error

    >>> 1/0 # exception
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: division by zero

    >>> 1/0 # unexpected exception
    """
    return x + 1

__version__ = '0'

[global]
module = 'test_mod'

Generates

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "from test_mod import docstring\n"
      },
      {
        "type": "code",
        "value": "a = docstring(1)\n"
      },
      {
        "type": "code",
        "value": "a\n"
      },
      {
        "type": "code",
        "value": "1 + a\n"
      },
      {
        "type": "code",
        "value": "import matplotlib.pyplot as plt\n"
      },
      {
        "type": "code",
        "value": "plt.plot([0, 1], [0, 1])\n"
      },
      {
        "type": "code",
        "value": "plt.show()\n"
      },
      {
        "type": "Fig",
        "value": {
          "kind": "assets",
          "module": "test_mod",
          "path": "fig-test_mod:docstring-0-c8430bd5.png",
          "type": "RefInfo",
          "version": "0"
        }
      },
      {
        "type": "code",
        "value": "1 + 1\n"
      },
      {
        "type": "code",
        "value": "syntax error\n"
      },
      {
        "type": "code",
        "value": "1/0 # exception\n"
      },
      {
        "type": "code",
        "value": "1/0 # unexpected exception\n"
      }
    ],

asmeurer commented 10 months ago

I think the main thing that needs to be done here now is to take a look at the generated JSON and see if we like how it looks. It's not hard to change what is there.

Carreau commented 10 months ago

To do for me:

asmeurer commented 10 months ago

You should also just review the code, and run this against the existing example configurations to make sure nothing funny is happening.

Carreau commented 10 months ago

I pushed a commit that injects a debug function instead of lambda s: None.

It seems that some of the parsing is incorrect, as I get an error:

$ papyri gen examples/papyri.toml --only papyri.examples:example1
...
Unexpected exception (<class 'SyntaxError'>, SyntaxError('multiple statements found while compiling a single statement', ('<doctest example1[0]>', 1, 32, 'import matplotlib.pyplot as plt\n', 1, 32)), <traceback object at 0x1202d7a40>)

Note that this debug message makes it look like the exec(compile(..., 'single')) in doctest got a single line with \n at the end, but it actually gets multiple lines.

I'm not sure why this is happening or why the code here is wrong. I'll investigate.
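
A minimal sketch of that failure mode, independent of papyri: the built-in compile() in 'single' mode accepts exactly one interactive statement, so a source string that holds two statements raises the same SyntaxError. The helper name and the file name string are made up for illustration.

```python
def compiles_as_single(source):
    """Return (ok, message) for compiling `source` in 'single' mode."""
    try:
        # 'single' is the mode doctest uses for each example.
        compile(source, "<doctest demo[0]>", "single")
    except SyntaxError as exc:
        return False, exc.msg
    return True, None


# One statement: fine.
ok_one, _ = compiles_as_single("import os\n")

# Two statements in one source string: rejected.
ok_two, message = compiles_as_single("import os\nsep = os.sep\n")
print(ok_one, ok_two, message)
```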

Carreau commented 10 months ago

Ha, I think it always considers ... as a continuation. So replacing ... with >>> in a couple of places works.

And that makes me realize we should have a custom parser in IPython/testing/plugin/ipdoctest.py
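
For comparison, the standard library's doctest.DocTestParser treats a line starting with ... as a continuation of the preceding >>> line, whatever it contains. Whether the parser used in this branch follows the same rule is an assumption here; the sketch only shows the stdlib behaviour on a made-up docstring.

```python
import doctest

# The second line uses "..." although it is a separate statement.
text = ">>> import os\n... sep = os.sep\n>>> sep == os.sep\nTrue\n"

examples = doctest.DocTestParser().get_examples(text)
sources = [ex.source for ex in examples]

# The first example now holds two statements in a single source string,
# which cannot be compiled in 'single' mode.
for src in sources:
    print(repr(src))
```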

Carreau commented 10 months ago

One of the remaining issues is that with this, each example line now appears in its own code block and any interleaved text disappears, so we still need to do some custom parsing instead of completely delegating to the doctest runner.
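
One possible way to keep interleaved text, sketched with the standard library only: doctest.DocTestParser.parse returns alternating strings and Example objects, so text and code can be emitted in order and adjacent examples merged into one block. The to_blocks helper and the block dictionaries are hypothetical and only mimic the JSON shown earlier; this is not the parsing papyri uses.

```python
import doctest


def to_blocks(docstring_text):
    """Turn a docstring into ordered text/code blocks (illustrative shape)."""
    blocks = []
    for piece in doctest.DocTestParser().parse(docstring_text):
        if isinstance(piece, doctest.Example):
            if blocks and blocks[-1]["type"] == "code":
                # Merge with the previous example instead of one block per line.
                blocks[-1]["value"] += piece.source
            else:
                blocks.append({"type": "code", "value": piece.source})
        elif piece.strip():
            # Keep prose; whitespace-only separators are dropped.
            blocks.append({"type": "text", "value": piece.strip()})
    return blocks


text = "Intro text.\n\n>>> x = 1\n>>> x + 1\n2\n\nMore text.\n\n>>> x * 3\n3\n"
blocks = to_blocks(text)
for block in blocks:
    print(block)
```

Note that this sketch also merges examples separated only by a blank line, which may or may not be what is wanted.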

asmeurer commented 10 months ago

This doesn't use the doctest parser at all, just the runner. It assumes that the examples have already been parsed ahead of time. So we just need to improve the parser that is used to extract examples. This is also sort of the same issue as the examples only being parsed in the "Examples" section.
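
A rough sketch of the runner-only approach with the standard library: doctest.Example objects are built by hand (standing in for whatever an external parser produced) and passed to doctest.DocTestRunner through a doctest.DocTest. The example sources and the test name are made up.

```python
import doctest

# Examples that some other parser is assumed to have extracted already.
examples = [
    doctest.Example("a = 1 + 1\n", ""),
    doctest.Example("a\n", "2\n"),
]

test = doctest.DocTest(
    examples, globs={}, name="demo", filename=None, lineno=0, docstring=None
)

runner = doctest.DocTestRunner(verbose=False)
# `out` receives the failure reports; they are discarded here.
results = runner.run(test, out=lambda s: None)
print(results.failed, results.attempted)
```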

Carreau commented 10 months ago

It does indirectly use the doctest parser, as example_section_data used to contain both code and examples, but now that this is assigned to in def report_*, it only gets content that gets executed. So it does lose some minimal content from examples sections.
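
The effect can be sketched with a small DocTestRunner subclass that records blocks from its report_* hooks. The class name, the "status" key and the sample docstring are hypothetical; only the hook names and signatures come from the standard library. Because the hooks fire once per executed example, the prose between examples never reaches the recorded list.

```python
import doctest


class CollectingRunner(doctest.DocTestRunner):
    """Hypothetical runner that records one block per executed example."""

    def __init__(self):
        super().__init__(verbose=False)
        self.blocks = []

    def _add_block(self, example, status):
        self.blocks.append(
            {"type": "code", "value": example.source, "status": status}
        )

    def report_success(self, out, test, example, got):
        self._add_block(example, "ok")

    def report_failure(self, out, test, example, got):
        self._add_block(example, "failed")

    def report_unexpected_exception(self, out, test, example, exc_info):
        self._add_block(example, "error")


text = "Some prose.\n\n>>> 1 + 1\n2\n\nMore prose.\n\n>>> 1 / 0\n"
test = doctest.DocTestParser().get_doctest(text, {}, "demo", None, 0)

runner = CollectingRunner()
runner.run(test, out=lambda s: None)

# Only the two executed examples are present; "Some prose." is gone.
for block in runner.blocks:
    print(block)
```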

asmeurer commented 10 months ago

In the future, do not force push to other people's branches.

asmeurer commented 9 months ago

There seems to be a segfault from one of the plots in the np.sinc doctest:

Fatal Python error: Segmentation fault

Current thread 0x00000002016d0240 (most recent call first):
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 43 in _wrapit
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 54 in _wrapfunc
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 2597 in cumsum
  File "<__array_function__ internals>", line 200 in cumsum
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/gridspec.py", line 193 in get_grid_positions
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/_api/deprecation.py", line 384 in wrapper
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/gridspec.py", line 665 in get_position
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 793 in set_subplotspec
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/axes/_base.py", line 661 in __init__
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/figure.py", line 757 in add_subplot
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/figure.py", line 1628 in gca
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/pyplot.py", line 2309 in gca
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/matplotlib/pyplot.py", line 3084 in title
  File "<doctest sinc[1]>", line 1 in <module>
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/doctest.py", line 1351 in __run
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/doctest.py", line 1498 in run
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 1335 in get_example_data
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 1655 in prepare_doc_for_one_object
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 2129 in collect_api_docs
  File "/Users/aaronmeurer/Documents/papyri/papyri/gen.py", line 558 in gen_main
  File "/Users/aaronmeurer/Documents/papyri/papyri/__init__.py", line 474 in gen
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/main.py", line 683 in wrapper
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 760 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1404 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1657 in invoke
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/core.py", line 216 in _main
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/core.py", line 778 in main
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/click/core.py", line 1130 in __call__
  File "/Users/aaronmeurer/anaconda3/envs/papyri/lib/python3.11/site-packages/typer/main.py", line 311 in __call__
  File "/Users/aaronmeurer/Documents/papyri/papyri/__main__.py", line 3 in <module>
  File "<frozen runpy>", line 88 in _run_code
  File "<frozen runpy>", line 198 in _run_module_as_main

Although there's a separate question which is why the doctests are being run at all with --no-exec.

asmeurer commented 9 months ago

I fixed the --no-exec flag. The segfault doesn't happen on main, though. I'm guessing it has something to do with the fig managers.

Carreau commented 9 months ago

I think the culprit is set_numeric_ops (I've pushed a commit that deactivates it), which replaces addition with addition mod 5 globally. It might still be a bug, but as it's deprecated, maybe it's not worth our time tracking it down.

I've pushed a commit that excludes just this function from being executed.
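
As an analogy only (it does not touch NumPy or reproduce the segfault), the sketch below shows how an example executed in-process can change arithmetic for every example that runs afterwards. It uses the decimal context as a stand-in for the global state; the run_examples helper and the test names are made up.

```python
import decimal
import doctest


def run_examples(name, sources):
    """Run statements through the doctest runner and return their globals."""
    examples = [doctest.Example(src, "") for src in sources]
    test = doctest.DocTest(examples, {}, name, None, 0, None)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test, out=lambda s: None, clear_globs=False)
    return test.globs


saved_prec = decimal.getcontext().prec
try:
    # First "docstring": changes process-wide arithmetic precision.
    run_examples("mutator", ["import decimal\n", "decimal.getcontext().prec = 3\n"])

    # Second, unrelated "docstring": silently sees the changed precision.
    globs = run_examples(
        "victim",
        ["from decimal import Decimal\n", "third = Decimal(1) / Decimal(3)\n"],
    )
    leaked = str(globs["third"])
finally:
    # Restore the state so the sketch does not leak in turn.
    decimal.getcontext().prec = saved_prec

print(leaked)
```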

I was also able to reproduce just with

papyri gen examples/numpy.toml --no-narrative --only numpy:set_numeric_ops --only numpy:sinc

asmeurer commented 9 months ago

I've fixed the parser to be more robust for interleaving text. It now properly handles the case where text is right before an example. My main concern now is whether we are actually including everything we want in the output JSON.

In particular, the JSON output doesn't include the prompts (>>> and ...), and it doesn't include the outputs of the doctests. This is the case even when exec=false. We should presumably fix it to include the outputs, but note that this is also the case in main. For example, here's np.select in main:

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "x = np.arange(6)\ncondlist = [x<3, x>3]\nchoicelist = [x, x**2]\nnp.select(condlist, choicelist, 42)"
      },
      {
        "type": "code",
        "value": "condlist = [x<=4, x>3]\nchoicelist = [x, x**2]\nnp.select(condlist, choicelist, 55)"
      }
    ]

and in this branch

  "example_section_data": {
    "children": [
      {
        "type": "code",
        "value": "x = np.arange(6)\n"
      },
      {
        "type": "code",
        "value": "condlist = [x<3, x>3]\n"
      },
      {
        "type": "code",
        "value": "choicelist = [x, x**2]\n"
      },
      {
        "type": "code",
        "value": "np.select(condlist, choicelist, 42)\n"
      },
      {
        "type": "text",
        "value": "\n"
      },
      {
        "type": "code",
        "value": "condlist = [x<=4, x>3]\n"
      },
      {
        "type": "code",
        "value": "choicelist = [x, x**2]\n"
      },
      {
        "type": "code",
        "value": "np.select(condlist, choicelist, 55)\n"
      }
    ],

Compare the actual docstring:

    Examples
    --------
    >>> x = np.arange(6)
    >>> condlist = [x<3, x>3]
    >>> choicelist = [x, x**2]
    >>> np.select(condlist, choicelist, 42)
    array([ 0,  1,  2, 42, 16, 25])

    >>> condlist = [x<=4, x>3]
    >>> choicelist = [x, x**2]
    >>> np.select(condlist, choicelist, 55)
    array([ 0,  1,  2,  3,  4, 25])

Note the array([ 0, 1, 2, 3, 4, 25]) bits aren't included in the JSON anywhere.
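
A minimal sketch of how the expected outputs could be carried along, assuming the standard library parser is used to read them: each doctest.Example exposes the expected output as its want attribute. The "expected_output" key is hypothetical and not part of the JSON format shown above.

```python
import doctest

# First half of the np.select example; it is only parsed, never executed,
# so NumPy is not needed here.
docstring_text = """\
>>> x = np.arange(6)
>>> condlist = [x<3, x>3]
>>> choicelist = [x, x**2]
>>> np.select(condlist, choicelist, 42)
array([ 0,  1,  2, 42, 16, 25])
"""

nodes = []
for ex in doctest.DocTestParser().get_examples(docstring_text):
    node = {"type": "code", "value": ex.source}
    if ex.want:
        # Hypothetical key holding the output written in the docstring.
        node["expected_output"] = ex.want
    nodes.append(node)

for node in nodes:
    print(node)
```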

Carreau commented 8 months ago

OK, tests are passing, let's merge and move on.