Open kne42 opened 6 years ago
numpydoc.docscrape includes the classes FunctionDoc and ClassDoc, which parse NumPy-style docstrings into a mapping object:
In [1]: from numpydoc.docscrape import FunctionDoc, ClassDoc
In [2]: def foo(arg1, arg2):
   ...:     """Short Description
   ...:     More Short Description...
   ...:
   ...:     Expanded Description
   ...:
   ...:     Parameters
   ...:     ----------
   ...:     arg1 : int
   ...:         description of arg1
   ...:     arg2 : int
   ...:         description of arg2d
   ...:
   ...:     Returns
   ...:     -------
   ...:     ret : int
   ...:         description of ret
   ...:
   ...:     Notes
   ...:     -----
   ...:     Additional Notes
   ...:
   ...:     Raises
   ...:     ------
   ...:     Exception
   ...:         When it will be raised.
   ...:
   ...:     References
   ...:     ----------
   ...:     .. [1] ref1
   ...:     .. [2] ref2
   ...:     """
   ...:     pass
   ...:
In [3]: doc = FunctionDoc(foo)
In [4]: [key for key in doc.keys()]
Out[4]:
['Signature',
'Summary',
'Extended Summary',
'Parameters',
'Returns',
'Yields',
'Raises',
'Warns',
'Other Parameters',
'Attributes',
'Methods',
'See Also',
'Notes',
'Warnings',
'References',
'Examples',
'index']
In [5]: doc['Signature']
Out[5]: 'foo(arg1, arg2)'
In [6]: doc['Summary']
Out[6]: ['Short Description', 'More Short Description...']
In [7]: doc['Extended Summary']
Out[7]: ['Expanded Description']
In [8]: doc['Parameters']
Out[8]:
[('arg1', 'int', ['description of arg1']),
('arg2', 'int', ['description of arg2d'])]
In [9]: doc['References']
Out[9]: ['.. [1] ref1', '.. [2] ref2']
In [10]: doc['Raises']
Out[10]: [('Exception', '', ['When it will be raised.'])]
This is super crazy. =D The main thing will be figuring out the types in the Parameters section. When more than one type is accepted, the format becomes a little more free-form, at least in skimage. But, in the first instance, we could annotate known types and simply omit unknown ones (with warnings of some kind, so that a human can curate them).
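A minimal sketch of that "annotate known types, warn on the rest" idea, assuming the (name, type, description) tuples shown in the doc['Parameters'] output above. The KNOWN_TYPES table and the annotate helper are hypothetical names made up for illustration; they are not part of numpydoc.

```python
import warnings

# Hypothetical lookup table from docstring type strings to annotation names.
KNOWN_TYPES = {
    "int": "int",
    "float": "float",
    "bool": "bool",
    "str": "str",
}


def annotate(parameters):
    """Map parameter names to annotations, skipping unknown type strings.

    `parameters` is a list of (name, type string, description lines) tuples.
    """
    annotations = {}
    for name, type_string, _description in parameters:
        key = type_string.strip()
        if key in KNOWN_TYPES:
            annotations[name] = KNOWN_TYPES[key]
        else:
            # Leave the parameter unannotated and flag it for human curation.
            warnings.warn(
                f"unrecognized type {type_string!r} for parameter {name!r}"
            )
    return annotations


params = [
    ("arg1", "int", ["description of arg1"]),
    ("arg2", "int or sequence of ints", ["description of arg2"]),
]

# Collect the warnings so they can be reviewed after the run.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = annotate(params)
```

Here only arg1 ends up annotated; the free-form type string for arg2 produces a warning instead of a guess.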
@jni As promised, here is a (very) rough prototype of autotyping: https://gist.github.com/kne42/93f8a9d09aa8c5ea1160f1b18a51f2d9. Sorry for the delay; I forgot I had some elements of a project due lol.
Since the nbviewer appears to be timing out due to the length of the notebook, here are the results (edited for clarity):
Function: gaussian
Name: image
Original: array-like
Parsed: Array
Name: sigma
Original: scalar or sequence of scalars, optional
Parsed: Optional[Union[Scalar, Sequence]]
Name: output
Original: array, optional
Parsed: Optional[Array]
Name: mode
Original: {'reflect', 'constant', 'nearest', 'mirror', 'wrap'}, optional
Parsed: Optional[Set]
Name: cval
Original: scalar, optional
Parsed: Optional[Scalar]
Name: multichannel
Original: bool, optional (default: None)
Parsed: Optional[Boolean]
Name: preserve_range
Original: bool, optional
Parsed: Optional[Boolean]
Name: truncate
Original: float, optional
Parsed: Optional[Float]
Plan of attack for parsing types out of strings:
optional keyword
Checklist w/ working notebook (infrastructure is complete):
array (ndarray, array-like, np.ndarray, etc.)
array of ...
... array
{'reflect', 'constant', 'mean'}
https://gist.github.com/kne42/93f8a9d09aa8c5ea1160f1b18a51f2d9
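The checklist above can be sketched as a small string parser. This is not the code from the gist, only a rough illustration under stated assumptions: the helper names (parse_single, parse_type), the alias tables, and the rule of splitting alternatives on " or " are all hypothetical, and the output names simply mirror the first results listing.

```python
import re

# Hypothetical alias tables; the gist may use different names and rules.
ARRAY_ALIASES = {"array", "ndarray", "array-like", "np.ndarray"}
SIMPLE_TYPES = {"scalar": "Scalar", "bool": "Boolean", "float": "Float"}


def parse_single(text):
    """Parse one alternative, returning None when it is not recognized."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        return "Set"  # e.g. {'reflect', 'constant', 'mean'}
    if (
        text in ARRAY_ALIASES
        or text.startswith("array of ")
        or text.endswith(" array")
    ):
        return "Array"
    if text == "sequence" or text.startswith("sequence of "):
        return "Sequence"
    return SIMPLE_TYPES.get(text)


def parse_type(type_string):
    """Turn a docstring type string into an annotation-like string."""
    # Drop a "(default: ...)" note wherever it appears.
    text = re.sub(r"\(default:[^)]*\)", "", type_string).strip()
    # A trailing ", optional" marks the parameter as optional.
    optional = re.search(r",\s*optional$", text) is not None
    if optional:
        text = text[: text.rfind(",")].strip()
    # Brace sets are kept whole; everything else is split on " or ".
    parts = [text] if text.startswith("{") else text.split(" or ")
    parsed = [parse_single(part) for part in parts]
    if None in parsed:
        return None  # unknown piece: leave it for human curation
    if len(parsed) == 1:
        result = parsed[0]
    else:
        result = "Union[" + ", ".join(parsed) + "]"
    return f"Optional[{result}]" if optional else result


print(parse_type("scalar or sequence of scalars, optional"))
print(parse_type("bool, optional (default: None)"))
```

Returning None for anything unrecognized keeps the parser conservative, so unknown formats are skipped rather than mis-annotated.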
@jni Okay, most of the main types have been done. How do you think we should encode specific types of data, and are there any more types that I haven't covered?
Here is the output (again cleaned up a bit):
Function: gaussian
Name: image
Original: array-like
Parsed: Array[~Number]
Name: sigma
Original: scalar or sequence of scalars, optional
Parsed: Optional[Union[~Scalar, Sequence[~Scalar]]]
Name: output
Original: array, optional
Parsed: Optional[Array[~Number]]
Name: mode
Original: {'reflect', 'constant', 'nearest', 'mirror', 'wrap'}, optional
Parsed: Optional[Mode]
Name: cval
Original: scalar, optional
Parsed: Optional[~Scalar]
Name: multichannel
Original: bool, optional (default: None)
Parsed: Optional[~Boolean]
Name: preserve_range
Original: bool, optional
Parsed: Optional[~Boolean]
Name: truncate
Original: float, optional
Parsed: Optional[~Real]
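The ~ prefix in this output matches how Python's typing module prints an invariant TypeVar, which suggests (though the thread does not confirm it) that the parsed types are built from TypeVars and generics. A minimal sketch of how such nested annotations could be constructed, with Array as a hypothetical stand-in class rather than anything taken from the gist:

```python
from typing import Generic, Optional, Sequence, TypeVar, Union, get_args

# Invariant type variables; their repr is the name prefixed with "~".
Number = TypeVar("Number")
Scalar = TypeVar("Scalar")


class Array(Generic[Number]):
    """Hypothetical placeholder for an n-dimensional array of numbers."""


# Counterparts of the parsed types for `image` and `sigma` above.
ImageType = Array[Number]
SigmaType = Optional[Union[Scalar, Sequence[Scalar]]]

print(repr(Scalar))
print(get_args(SigmaType))
```

Note that typing flattens Optional[Union[...]] into a single union that includes NoneType, so get_args returns the alternatives at one level.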