Closed serhiy-storchaka closed 9 years ago
Here are two patches that implement two different interfaces for the same feature.
In the first patch you can use the *doc* and *field_docs* arguments to specify the namedtuple class docstring and the field docstrings. For example:
Point = namedtuple('Point', 'x y',
                   doc='Point: 2-dimensional coordinate',
                   field_docs=['abscissa', 'ordinate'])
In the second patch you can use the *doc* argument to specify the namedtuple class docstring, and the *field_names* argument can be a sequence of pairs: field name and field docstring. For example:
Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
                   doc='Point: 2-dimensional coordinate')
Which approach is better?
Feel free to correct the documentation. I know that it needs correction.
I don't think it is worth complicating the API for this. There have been zero requests for this functionality. Even the doc field of property() is rarely used.
What is wrong with the following?
class Point(namedtuple('Point', 'x y')):
    """A 2-dimensional coordinate
    x - the abscissa
    y - the ordinate
    """
This seems clearer to me. namedtuple is in some ways a quick-and-dirty type, essentially a truer implementation of the intended purpose of tuple. The temptation is to keep adding functionality, but we should resist until there is a strong enough imperative. I don't see it here. While I don't have a gauge of how often people use (or would use) docstrings with namedtuple, I expect that it's relatively low given the intended simplicity of namedtuple.
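For reference, a runnable version of the subclassing idiom above might look like the following sketch. The `__slots__ = ()` line is an addition that does not appear in the thread; it is the usual way to keep the subclass from gaining a per-instance `__dict__`.

```python
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    """A 2-dimensional coordinate

    x - the abscissa
    y - the ordinate
    """
    # Assumption (not in the thread): keep instances as light as plain tuples.
    __slots__ = ()


p = Point(3, 4)
print(Point.__doc__.splitlines()[0])  # the custom class docstring
print(Point.x.__doc__)                # fields keep their generated docstrings
```

The class docstring is replaced, but the individual fields still carry only the generated text, which is the limitation raised in the reply that follows.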
Yes, we can use the inheritance trick/idiom to specify a class docstring. But there is no way to specify attribute docstrings.
I encountered this when rewriting some C-implemented code in Python. PyStructSequence allows you to specify docstrings for a class and its attributes, but namedtuple does not.
> I don't think it is worth complicating the API for this. There have been zero requests for this functionality. Even the doc field of property() is rarely used.
+1
I think this should be rejected and closed since the 'enhancement' looks worse to me than what we can do now.
Most data attributes cannot have individual docstrings, so I expect the class docstring to list and possibly explain the data attributes.
In the process of responding to bpo-16670, I finally read the namedtuple doc. I notice that it already generates default one-line .__doc__ attributes for both the class and properties. For Point, the class docstring is 'Point(x, y)', which will often be good enough.
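The generated defaults mentioned here are easy to inspect. A minimal sketch, assuming only the standard library:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

# Class docstring generated by the factory.
print(Point.__doc__)    # Point(x, y)

# Each field gets a one-line generated docstring as well.
print(Point.x.__doc__)  # Alias for field number 0
print(Point.y.__doc__)  # Alias for field number 1
```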
If the person creating the class does not think this sufficient, the replacement is likely to be multiple lines. This is awkward for a constructor argument. There is a reason we put docstrings *after* the header, not *in* the header.
The class docstring is easily replaced by assignment. So I would write Eric's example as
Point = namedtuple('Point', 'x y')
Point.__doc__ = '''\
A 2-dimensional coordinate
x - the abscissa
y - the ordinate'''
This does not create a second new class and is not a 'trick'.
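The difference between the two approaches shows up in the method resolution order. The sketch below assumes Python 3, where a class `__doc__` is writable; `SubPoint` is a hypothetical name used only for the comparison.

```python
from collections import namedtuple

# Assignment: the class returned by the factory is documented in place.
Point = namedtuple('Point', 'x y')
Point.__doc__ = 'A 2-dimensional coordinate'


# Subclassing: a second class is layered on top of the generated one.
class SubPoint(namedtuple('Point', 'x y')):
    """A 2-dimensional coordinate"""
    __slots__ = ()


print([c.__name__ for c in Point.__mro__])     # ['Point', 'tuple', 'object']
print([c.__name__ for c in SubPoint.__mro__])  # ['SubPoint', 'Point', 'tuple', 'object']
```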
+1, Terry
- Most data attributes cannot have individual docstrings, so I expect the class docstring to list and possibly explain the data attributes.
But almost all PyStructSequence fields have individual docstrings.
> This does not create a second new class and is not a 'trick'.
Thanks for the tip.
> I presume that is why property docstrings are not used much.
Indeed, only 84 of 336 Python-implemented properties have docstrings. However, this is an even larger percentage than for methods (about 8K of 43K). And 100 of 115 PyStructSequence fields have docstrings.
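Counts like these can be gathered with a short script. The sketch below is one possible way to do it and is not necessarily how the figures above were obtained; `count_property_docs`, `demo`, and `Shape` are hypothetical names.

```python
import inspect
import types


def count_property_docs(module):
    """Return (documented, total) for properties defined on a module's classes."""
    documented = total = 0
    for _, cls in inspect.getmembers(module, inspect.isclass):
        # vars(cls) looks only at attributes defined directly on the class.
        for attr in vars(cls).values():
            if isinstance(attr, property):
                total += 1
                if attr.__doc__:
                    documented += 1
    return documented, total


# Hypothetical module with one documented and one undocumented property.
demo = types.ModuleType('demo')


class Shape:
    @property
    def area(self):
        """Surface area."""
        return 0

    @property
    def name(self):
        return 'shape'


demo.Shape = Shape
print(count_property_docs(demo))  # (1, 2)
```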
I think Python should have more docstrings, not less.
I don't know if it's worth reopening this, but I had a need for generating docs including attribute docstrings for a namedtuple class using Sphinx, and I noticed a few things...
(1) Regarding there not being demand: There's a StackOverflow question for this with 17 "ups" on the question and 22 on the best answer: http://stackoverflow.com/questions/1606436/adding-docstrings-to-namedtuples-in-python
(2) The default autodocs produced by sphinx look dreadful (e.g. https://www.dropbox.com/s/nakxsslhb588tu1/Screenshot%202013-12-04%2013.29.13.png) -- note the duplication of the class name, the line break before the signature, and the listing of attributes in alphabetical order with useless boilerplate. Here's what I would *like* to produce: (though there's probably too much whitespace :-): https://www.dropbox.com/s/j11uismbeo6rrzx/Screenshot%202013-12-04%2013.31.44.png
(3) In Python 2.7 you can't assign to the __doc__ class attribute.
I would really appreciate some way to set the docstring for the class as a whole as well as for each property, so they come out correct in Sphinx (and help()), preferably without having to manually assign doc strings or write the class by hand without using namedtuple at all. (The latter will become very verbose, each property has to look like this:
@property
def handle(self):
    """The datastore handle (a string)."""
    return self[1]
)
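To make the verbosity concrete, here is a sketch of a two-field tuple subclass written by hand. The class name and the `id` field are assumptions for illustration; only the `handle` property is taken from the comment above.

```python
class DatastoreInfo(tuple):
    """Hypothetical two-field record written without namedtuple."""

    __slots__ = ()

    def __new__(cls, id, handle):
        return tuple.__new__(cls, (id, handle))

    @property
    def id(self):
        """The datastore ID (a string)."""
        return self[0]

    @property
    def handle(self):
        """The datastore handle (a string)."""
        return self[1]


info = DatastoreInfo('ds-1', 'h-42')
print(info.handle)                   # h-42
print(DatastoreInfo.handle.__doc__)  # The datastore handle (a string).
```

Every additional field costs another four-line property, whereas the namedtuple call needs only the field name.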
Serhiy: I am not familiar with C PyStructSequence and how an instance of one appears in Python code. I agree that more methods should have docstrings.
Guido:
I posted on SO the simple Py 3 solution that replaces the previously posted wrapper solutions needed for Py 2.
Much of what you do not like is standard Sphinx/help behavior that would be unchanged by Serhiy's patch. The first line for a class is always "class <classname>(<base_classes>)". The first line is followed by the docstring, so the class name is repeated if and only if it is repeated in the docstring (as for list, see below). The __new__/__init__ signature is given here if and only if it is in the docstring. Otherwise, one has to look down for the method. The method signatures are never on the first line. Examples:
>>> help(list)
Help on class list in module builtins:
class list(object)
| list() -> new empty list
| list(iterable) -> new list initialized from iterable's items
...
>>> class C:
    "doc string"
    def __init__(self, a, b): pass
>>> help(C)
Help on class C in module __main__:
class C(builtins.object)
| doc string
|
| Methods defined here:
|
| __init__(self, a, b)
...
I am still of the opinion that property usage should be a mostly transparent implementation detail. Point classes could have 4 instance attributes: x, y, r, and theta, with a particular implementation using 0 to 4 properties. All attributes should be documented regardless of the number of properties, which currently means listing them in the class docstring. A library could have more than one implementation.
As for named tuples, I believe (without trying) that the name to index mapping could be done with __getattr__ and a separate dict. If so, there would be no property docstrings and hence no field docstrings to worry about ;-).
---
There have been requests for data attribute docstrings (without the bother and inefficiency of replacing a simple attribute with a property). Since such a docstring would have to be attached to the fixed attribute name, rather than the variable attribute value, I believe a string subclass would suffice, to be used as needed. The main problem is a decent syntax to add a docstring to a simple (assignment) statement.
If the general problem were solved, I would choose Serhiy's option B for namedtuple.
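The `__getattr__` idea mentioned earlier in this comment was described as untried. A minimal sketch of what it could look like follows; it is an illustration only, not a proposed change to namedtuple, and `_index` is a hypothetical name.

```python
class Point(tuple):
    """Point(x, y), with fields resolved through __getattr__."""

    __slots__ = ()
    _index = {'x': 0, 'y': 1}  # field name -> tuple position

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[self._index[name]]
        except KeyError:
            raise AttributeError(name) from None


p = Point(3, 4)
print(p.x, p.y)             # 3 4
print(hasattr(Point, 'x'))  # False: no per-field object exists to hold a docstring
```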
On Wed, Dec 4, 2013 at 5:40 PM, Terry J. Reedy <report@bugs.python.org> wrote:
- I posted on SO the simple Py 3 solution that replaces the previously posted wrapper solutions needed for Py 2.
Thanks, that will give people some pointers for Python 3. We need folks to upvote it. :-)
- Much of what you do not like is standard Sphinx/help behavior that would be unchanged by Serhiy's patch. The first line for a class is always "class <classname>(<base_classes>)".
Maybe for help(), but the Sphinx docs look better for most classes. Compare my screen capture with the first class on this page: https://www.dropbox.com/static/developers/dropbox-python-sdk-1.6-docs/index.html The screen capture looks roughly like this (note this is two lines and the word DatastoreInfo is repeated -- that wasn't line folding):
class dropbox.datastore.DatastoreInfo
DatastoreInfo(id, handle, rev, title, mtime)
whereas for non-namedtuple classes it looks like this:
class dropbox.client.DropboxClient(oauth2_access_token, locale=None, rest_client=None)
I understand that part of this is due to the latter class having an __init__ with a reasonable docstring, but the fact remains that namedtuple's default docstring produces poor-looking documentation.
The first line is followed by the docstring, so the class name is repeated if and only if it is repeated in the docstring (as for list, see below). The __new__/__init__ signature is given here if and only if it is in the docstring. Otherwise, one has to look down for the method. The method signatures are never on the first line. Examples:
>>> help(list)
Help on class list in module builtins:
class list(object)
| list() -> new empty list
| list(iterable) -> new list initialized from iterable's items
...
>>> class C:
    "doc string"
    def __init__(self, a, b): pass
>>> help(C)
Help on class C in module __main__:
class C(builtins.object)
| doc string
|
| Methods defined here:
|
| __init__(self, a, b)
...
Yeah, help() is different than Sphinx. (As a general remark I find the help() output way too verbose with its endless listing of all the built-in behaviors.)
- ?? Python 3 has many improvements and we will add more. ---
I am still of the opinion that property usage should be a mostly transparent implementation detail.
What does that mean?
Point classes could have 4 instance attributes: x, y, r, and theta, with a particular implementation using 0 to 4 properties. All attributes should be documented regardless of the number of properties, which currently means listing them in the class docstring. A library could have more than one implementation.
For various reasons (like consistency with other classes) I *really* want the property docstrings on the individual properties, not in the class docstring. Here's a screenshot of what I want:
https://www.dropbox.com/s/70zfapz8pcz9476/Screenshot%202013-12-04%2019.57.36.png
I obtained this by abandoning the namedtuple and hand-coding properties -- the resulting class uses 4 lines (+ 1 blank) of boilerplate per property instead of just one line of docstring per property.
As for named tuples, I believe (without trying) that the name to index mapping could be done with __getattr__ and a separate dict. If so, there would be no property docstrings and hence no field docstrings to worry about ;-).
I'm not sure what you are proposing here -- a patch to namedtuple or a work-around? I think namedtuple is too valuable to abandon. It not only saves a lot of code, it captures the regularity of the code. (If I have a class with 5 similar-looking methods it's easy to overlook a subtle difference in one of them.)
---
There have been requests for data attribute docstrings (without the bother and inefficiency of replacing a simple attribute with a property). Since such a docstring would have to be attached to the fixed attribute name, rather than the variable attribute value, I believe a string subclass would suffice, to be used as needed. The main problem is a decent syntax to add a docstring to a simple (assignment) statement.
Sphinx actually has a syntax for this already. In fact, it has three: it allows a comment before or on the class variable starting with "#:", or a docstring immediately following. Check out this documentation for the autodoc extension: http://sphinx-doc.org/ext/autodoc.html#directive-autoattribute
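The three forms described above look roughly like this in source code. This is a sketch with a hypothetical `Config` class; Python itself ignores the `#:` comments and the bare string, and only Sphinx autodoc picks them up.

```python
class Config:
    """Hypothetical settings holder."""

    #: Doc comment placed on the line before the attribute.
    timeout = 30

    retries = 3  #: Doc comment placed on the same line.

    verbose = False
    """Docstring placed immediately after the attribute."""


# At runtime these are plain class attributes; the bare string above is
# evaluated and discarded, so it does not replace the class docstring.
print(Config.timeout, Config.retries, Config.verbose)
print(Config.__doc__)
```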
If the general problem were solved, I would choose Serhiy's option B for namedtuple.
If you're referring to this:
Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
                   doc='Point: 2-dimensional coordinate')
I'd love it!
> I find the help() output way too verbose with its endless listing of all the built-in behaviors.
Then you might agree to a patch, on a separate issue. Let's set help aside for the moment.
I am familiar with running Sphinx on .rst files, but not on docstrings. It looks like the docstrings use .rst markup. (Is this allowed in the stdlib?) (The output looks good enough for a first draft of a tkinter class/method reference, which I would like to work on.)
I understand that part of this [signature after class name] is due to the latter class having an __init__ with a reasonable docstring
If dropbox.client is written in Python, as I presume, then I strongly suspect that the signature part of class dropbox.client.DropboxClient(oauth2_access_token, locale=None, rest_client=None) comes from an inspect module method that examines the function attributes other than .__doc__. If so, the DropboxClient.__init__ docstring is irrelevant to the above. You could test by commenting it out and rerunning the doc build.
The inspect methods do not work on C-coded functions (unless Argument Clinic has fixed this for 3.4), which is why signatures are put in the docstrings for C-coded objects. For C-coded classes, it is put in the class docstring rather than the class.__init__ docstring.
but the fact remains that namedtuple's default docstring produces poor-looking documentation.
'x.__init__(...) initializes x; see help(type(x)) for signature'
This is standard boilerplate for C-coded .__init__.__doc__. Raymond just copied it.
>>> int.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
>>> list.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
I will try to explain 'property transparency/equivalence' in another post, when I am fresher, and after reading the autodoc reference, so you can understand enough to agree or not. My reference to a possible alternate implementation of named tuple was part of the failed explanation of 'property transparency'. I am not proposing a change now.
On Wed, Dec 4, 2013 at 10:25 PM, Terry J. Reedy <report@bugs.python.org> wrote:
I am familiar with running Sphinx on .rst files, but not on docstrings. It looks like the docstrings use .rst markup. (Is this allowed in the stdlib?)
I'm not sure if it is allowed, but it is certainly used plenty in some modules (perhaps those that started life as 3rd party packages).
(The output looks good enough for a first draft of a tkinter class/method reference, which I would like to work on.)
I won't stop you -- having *any* kind of docs for Tkinter sounds good to me!
> I understand that part of this [signature after class name] is due to the latter class having an __init__ with a reasonable docstring
If dropbox.client is written in Python, as I presume,
It is.
then I strongly suspect that the signature part of class dropbox.client.DropboxClient(oauth2_access_token, locale=None, rest_client=None) comes from an inspect module method that examines the function attributes other than .__doc__.
Indeed.
If so, DropboxClient.__init__ docstring is irrelevant to the above. You could test by commenting it out and rerunning the doc build.
Yes.
The inspect methods do not work on C-coded functions (unless Argument Clinic has fixed this for 3.4), which is why signatures are put in the docstrings for C-coded objects. For C-coded classes, it is put in the class docstring rather than the class.__init__ docstring.
Perhaps it doesn't understand __new__? namedtuple actually generates Python code for a class definition using a template and then uses exec() on the filled-in template; the template defines only __new__ though.
> but the fact remains that namedtuple's default docstring produces poor-looking documentation.
'x.__init__(...) initializes x; see help(type(x)) for signature'
This is standard boilerplate for C-coded .__init__.__doc__. Raymond just copied it.
He didn't (it's not in the template). It is the dummy __init__ that tuple inherits from object (the docstring is in the __init__ wrapper in typeobject.c).
>>> int.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
>>> list.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
I think we can now agree that docstrings other than the class docstring (used as a fallback) are not relevant to signature detection. And Raymond gave namedtuple classes the docstring needed as a fallback.
We are off-issue here, but idlelib.CallTips.getargspec() is also ignorant that it may need to look at .__new__. An object with a C-coded .__init__ and Python-coded .__new__ is new to new-style classes. The new inspect.signature function handles such properly. Starting with a namedtuple Point (without the default docstring):
>>> from inspect import signature
>>> str(signature(Point.__new__))
'(_cls, x, y)'
>>> str(signature(Point))
'(x, y)'
The second is what autodoc should use. I just opened bpo-19903 to update Idle to use signature.
It was never about signature detection for me -- what gave you that idea? I simply want to have the option to put individual docstrings on the properties generated by namedtuple.
I'll add my voice to those asking for a way to put docstrings on namedtuples. As it is, namedtuples get automatic docstrings that seem to me to be almost worse than none. Sphinx produces this:
class Key
Key(scope, user_id, block_scope_id, field_name)
__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
__repr__()
Return a nicely formatted representation string
block_scope_id None
Alias for field number 2
field_name None
Alias for field number 3
scope None
Alias for field number 0
user_id None
Alias for field number 1
Why are __getnewargs__ and __repr__ included at all? They aren't useful for API documentation. The individual property docstrings offer no new information over the summary at the top. I'd like namedtuple not to be so verbose where it has no useful information to offer. The one-line summary is all the information namedtuple has, so that is all it should include in the docstring:
class Key
Key(scope, user_id, block_scope_id, field_name)
A few quick thoughts:
FWIW, here's a proposed new classmethod that makes it possible to easily customize the field docstrings but without cluttering the API of the factory function:
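@classmethod
def _set_docstrings(cls, **docstrings):
    '''Customize the field docstrings
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> Point._set_docstrings(x = 'abscissa', y = 'ordinate')
    '''
    for fieldname, docstring in docstrings.items():
        if fieldname not in cls._fields:
            raise ValueError('Fieldname %r does not exist' % fieldname)
        # getattr(cls, fieldname) returns the field descriptor itself, which
        # is not callable, so build the getter from the field's index instead.
        # _property and _itemgetter are assumed to be the aliases for property
        # and operator.itemgetter available inside the collections module.
        index = cls._fields.index(fieldname)
        new_property = _property(_itemgetter(index), doc=docstring)
        setattr(cls, fieldname, new_property)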
Note, nothing is needed for the main docstring since it is already writeable:
Point.__doc__ = '2-D Coordinate'
Here's a variant that builds on your code, but makes for a nicer API. Single-line docstrings can be passed along with the attribute name, and with namedtuple.with_docstrings(... all info required to build the class ...) from a user perspective the factory looks like a class method:
from functools import partial
from collections import namedtuple
from operator import itemgetter

def _with_docstrings(cls, typename, field_names_with_doc,
                     *, rename=False, doc=None):
    # The original post also forwarded verbose=...; namedtuple() dropped
    # that parameter in Python 3.7, so it is omitted here.
    field_names = []
    field_docs = []
    # A multi-line string holds one "name [docstring]" entry per line.
    if isinstance(field_names_with_doc, str):
        field_names_with_doc = [
            line for line in field_names_with_doc.splitlines() if line.strip()]
    for item in field_names_with_doc:
        if isinstance(item, str):
            item = item.split(None, 1)
        if len(item) == 1:
            [fieldname] = item
            fielddoc = None
        else:
            fieldname, fielddoc = item
        field_names.append(fieldname)
        field_docs.append(fielddoc)
    nt = cls(typename, field_names, rename=rename)
    for index, (fieldname, fielddoc) in enumerate(zip(field_names, field_docs)):
        if fielddoc is not None:
            # getattr(nt, fieldname) returns the field descriptor, which is
            # not callable, so use an index-based getter for the new property.
            new_property = property(itemgetter(index), doc=fielddoc)
            setattr(nt, fieldname, new_property)
    if doc is not None:
        nt.__doc__ = doc
    return nt

namedtuple.with_docstrings = partial(_with_docstrings, namedtuple)

if __name__ == "__main__":
    Point = namedtuple.with_docstrings("Point", "x abscissa\ny ordinate")
    Address = namedtuple.with_docstrings(
        "Address",
        """
        name Surname
        first_name First name
        city
        email Email address
        """)
    Whatever = namedtuple.with_docstrings(
        "Whatever",
        [("foo", "doc for\n foo"),
         ("bar", "doc for bar"),
         "baz"],
        doc="""The Whatever class.
Example for a namedtuple with multiline docstrings for its attributes.""")
The need for this may be eliminated by bpo-24064. Then we change the docstrings just like any other object with no special rules or methods.
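With writable property docstrings, no helper is needed at all. A minimal sketch, assuming Python 3.5 or later:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# Plain attribute assignment, the same as for any other object.
Point.__doc__ = '2-D Coordinate'
Point.x.__doc__ = 'abscissa'
Point.y.__doc__ = 'ordinate'

print(Point.x.__doc__)  # abscissa
print(Point(1, 2).x)    # field access is unaffected
```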
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = 'https://github.com/rhettinger'
closed_at =
created_at =
labels = ['type-feature', 'library']
title = 'Docstrings for namedtuple'
updated_at =
user = 'https://github.com/serhiy-storchaka'
```
bugs.python.org fields:
```python
activity =
actor = 'rhettinger'
assignee = 'rhettinger'
closed = True
closed_date =
closer = 'rhettinger'
components = ['Library (Lib)']
creation =
creator = 'serhiy.storchaka'
dependencies = []
files = ['28294', '28295']
hgrepos = []
issue_num = 16669
keywords = ['patch']
message_count = 21.0
messages = ['177381', '177393', '177418', '177434', '177470', '177560', '177577', '177592', '205249', '205269', '205271', '205277', '205317', '205340', '205341', '205582', '205583', '205978', '242096', '242106', '242121']
nosy_count = 10.0
nosy_names = ['gvanrossum', 'rhettinger', 'terry.reedy', 'peter.otten', 'giampaolo.rodola', 'nedbat', 'eric.snow', 'serhiy.storchaka', 'pconnell', 'Ankur.Ankan']
pr_nums = []
priority = 'low'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue16669'
versions = ['Python 3.5']
```