Better description of docstrings in the developer guide

videlec commented 9 years ago

The part of the developer guide that describes the docstring of classes and functions should be updated. For example a function with no input or output does not need an INPUT or OUTPUT block.

CC: @tscrim @jm58660

Component: documentation

Branch/Commit: u/jmantysalo/19041 @ d01d36d

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/19041

videlec commented 9 years ago

Commit: 3ad4912

videlec commented 9 years ago

New commits:

`e3ef3a8`	`Trac 19041: about docstrings in the developer guide`
`3ad4912`	`Trac 19041: remove trailing whitespaces`

videlec commented 9 years ago

Branch: u/vdelecroix/19041

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:2

Looks good, but there is a typo in 'mentionned'.

What would you think of splitting it into 'INPUT' and 'OUTPUT' entries? Looks weird to have the two simultaneously when all others are listed block per block.

I can do it if you don't want to.

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

Reviewer: Nathann Cohen

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:4

Replying to @nathanncohen:

What would you think of splitting it into 'INPUT' and 'OUTPUT' entries?

+1 to this from me. But should we wait if others complain at sage-devel? Some might argue for example that in Python the function signature shows if there is no input at all, but output can not be seen directly from it.

And when we are at this, maybe take few days to think...? Maybe we should add for example a paragraph on assumptions? What about self - I have tried to remove them from docstring.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5165797 Trac 19041: reviewer comments 1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 3ad4912 to 5165797

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:6

+1 to this from me. But should we wait if others complain at sage-devel? Some might argue for example that in Python the function signature shows if there is no input at all, but output can not be seen directly from it.

How would that be a problem with respect to what you wrote?

I am getting bored, and remembered that we had no 'default' SAT solver in Sage, so I do the dishonest job of writing a LP SAT solver, that just solves the LP, so that we get a default one.

So doing, I implement a function called 'nvars'. I'd say that what it returns is pretty obvious, and already made clear by the first line of the description. In this situation, function signature or not, I do not think that a 'OUTPUT' block is needed.

And when we are at this, maybe take few days to think...? Maybe we should add for example a paragraph on assumptions? What about self - I have tried to remove them from docstring.

We can wait if you want, I do not mind. Consider the commits up to now as 'positively reviewed' on my behalf.

Nathann

videlec commented 9 years ago

comment:7

Replying to @nathanncohen:

+1 to this from me. But should we wait if others complain at sage-devel? Some might argue for example that in Python the function signature shows if there is no input at all, but output can not be seen directly from it.

How would that be a problem with respect to what you wrote?

I am getting bored, and remembered that we had no 'default' SAT solver in Sage, so I do the dishonest job of writing a LP SAT solver, that just solves the LP, so that we get a default one.

So doing, I implement a function called 'nvars'. I'd say that what it returns is pretty obvious, and already made clear by the first line of the description. In this situation, function signature or not, I do not think that a 'OUTPUT' block is needed.

And when we are at this, maybe take few days to think...? Maybe we should add for example a paragraph on assumptions? What about self - I have tried to remove them from docstring.

We can wait if you want, I do not mind. Consider the commits up to now as 'positively reviewed' on my behalf.

Jori, if you want to add a commit about documenting self, you are free to do it. And it is good for me to wait a bit if you like it better.

jhpalmieri commented 9 years ago

comment:8

I agree that the OUTPUT block can sometimes be omitted. I've seen docstrings that start like this:

def jones_polynomial(self):
    """
    Return the Jones polynomial of this link.

    OUTPUT:

    - the Jones polynomial of this link

and I think we can safely omit the OUTPUT block in such cases.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:9

Propably even majority of functions do not need explicit OUTPUT section. When one-sentence description is of form "Return xyzzy of the foo", and docstring continues to explain what is xyzzy, it is enought for me.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

Changed branch from u/vdelecroix/19041 to u/jmantysalo/19041

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

Changed commit from 5165797 to a4e9dd4

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:11

Here are my suggestions, including one addition to the reviewer checklist.

I suggest not to mark this to positive_review until, say, wednesday. Then others have at least few days time to make comments.

New commits:

a4e9dd4 Few additions to developer manual.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

Changed author from Vincent Delecroix to Vincent Delecroix, Jori Mäntysalo

seblabbe commented 9 years ago

comment:12

in src/doc/en/developer/coding_basics.rst, you function -> your function.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from a4e9dd4 to d01d36d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d01d36d Typo correction: add 'r'.

dkrenn commented 9 years ago

comment:14

Replying to @jhpalmieri:

I agree that the OUTPUT block can sometimes be omitted. I've seen docstrings that start like this:
def jones_polynomial(self):
    """
    Return the Jones polynomial of this link.

    OUTPUT:

    - the Jones polynomial of this link
and I think we can safely omit the OUTPUT block in such cases.

But, shouldn't the output block give other information like a hint on the returned type or parent, e.g.

    OUTPUT:

    A polynomial over `\QQ`

or something similar?

dkrenn commented 9 years ago

comment:15

Do not refer to ``self``. Instead talk about "the object".

What do you think about mentioning the usage of "this object" / "this whatever"?

jhpalmieri commented 9 years ago

comment:16

Replying to @dkrenn:

Replying to @jhpalmieri:
I agree that the OUTPUT block can sometimes be omitted. I've seen docstrings that start like this:
def jones_polynomial(self):
    """
    Return the Jones polynomial of this link.

    OUTPUT:

    - the Jones polynomial of this link
and I think we can safely omit the OUTPUT block in such cases.
But, shouldn't the output block give other information like a hint on the returned type or parent, e.g.
    OUTPUT:

    A polynomial over `\QQ`
or something similar?

You're right, if there is some ambiguity about the type, then that should certainly be clarified, and the OUTPUT block is a good place to do that.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:17

Now it is wednesday. Should we go with this?

I think that "this object" is OK.

For example maximal_antichains() in posets returns just a list, whereas antichains() returns more complex data structure. The latter one has explicit output type (but at #18941 I suggest a slight modification to it).

It could be argued that maximal_antichains() should also say that the return type is a list (of lists). But should a user looking at example guess from [[0], [1, 2], [1, 3], [4]] the type? I think yes, but this can be argued.

dkrenn commented 9 years ago

comment:18

INPUT- and OUTPUT-block

This block is mandatory only if the input is not clear from the description.

This sounds very vague and is in very contrast to what was there before, where it explicitly was mentioned: "This is not optional.". To me this does not feel like you should or are encouraged to include such a block, but more like "include it if you want, but your nice description will handle everything as well...".

From a the side of a user: Often someone just wants to look up what exactly are the input-possibilites (type/parent) or which output (again e.g. type/parent) is returned. Then it is hard, if there is a parameter not in the INPUT/OUTPUT block.

I think having INPUT/OUTPUT blocks for all functions (that have input/output; but I wouldn't mind including it everywhere) improved the documentation of SageMath a lot and I don't want this to stop.

Is this a grammatically correct sentence:

The OUTPUT block can also mentionned the errors that the function can possibly raise.

dkrenn commented 9 years ago

comment:19

Replying to @jm58660:

It could be argued that maximal_antichains() should also say that the return type is a list (of lists). But should a user looking at example guess from [[0], [1, 2], [1, 3], [4]] the type? I think yes, but this can be argued.

OUTPUT:

A list (of lists).

should definitly be mentioned there.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:20

A side-question: Do we have common view about checking string options, so that for example algorithm='cat-says-meow' should raise an exception and never drop to some kind of default like algorithm='foo'? That is my suggestion to the reviewer checklist. (Many functions do check other arguments, but not these.)

I can live with many kinds of conventions --- but only one for one software. So it's OK for me to always have INPUT and OUTPUT. Or to have rules "Exclude INPUT section if the function has no arguments at all, or if the only argument is self in a function inside a class." and "Exclude OUTPUT section if the function returns nothing."

What about .cardinality() with one-sentence description "Return the number of elements in the poset."?

dkrenn commented 9 years ago

comment:21

Replying to @jm58660:

A side-question: Do we have common view about checking string options, so that for example algorithm='cat-says-meow' should raise an exception and never drop to some kind of default like algorithm='foo'? That is my suggestion to the reviewer checklist. (Many functions do check other arguments, but not these.)

+1 for raising an exception.

I can live with many kinds of conventions --- but only one for one software. So it's OK for me to always have INPUT and OUTPUT. Or to have rules "Exclude INPUT section if the function has no arguments at all, or if the only argument is self in a function inside a class." and "Exclude OUTPUT section if the function returns nothing."

+1

What about .cardinality() with one-sentence description "Return the number of elements in the poset."?

Oneline is fine and I could imagine to skip the input block and would include the following output block:

OUTPUT:

A :mod:`Sage Integer <sage.rings.integer_ring>` or :mod:`oo <sage.rings.infinity>`.

(Maybe use other hyperlinktargets; maybe skip oo if this cannot happen; maybe only Integer (instead of Sage Integer.) In this way the user gets some more information. (Although we developers know that .cardinality always returns this type of output; this may not be obvious/clear/known to arbitrary users.)

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:22

Best possible documentation is of course to have function names etc. so clear that the user does not need documentation. But let's assume that we still have (see #18959) poset width() returning a Sage Integer and height() returning a python int. Should we then also document this?

If so, then the only functions to return something and not having OUTPUT part would be boolean ones. Those are named is_* or has_*. For example has_top() has oneliner "Return True if the poset has a unique maximal element, and False otherwise."

dkrenn commented 9 years ago

comment:23

Replying to @jm58660:

Best possible documentation is of course to have function names etc. so clear that the user does not need documentation. But let's assume that we still have (see #18959) poset width() returning a Sage Integer and height() returning a python int. Should we then also document this?

I am not fully up-to-date with this discussion, but IMHO, yes, it should be documented.

If so, then the only functions to return something and not having OUTPUT part would be boolean ones. Those are named is_* or has_*. For example has_top() has oneliner "Return True if the poset has a unique maximal element, and False otherwise."

I sometimes do it like this:

Return if this poset has a unique maximal element.

OUTPUT:

A boolean.

One could replace A boolean by ``True`` or ``False``.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:24

First of all, I am very glad to see someone interested in making better documentation, even when it comes to small things!

Should we have even manual for documenting "common special types" of functions? "If the function has no arguments and only returns True or False, then do like this: - -"?

What do you think about hamiltonian_cycle() and is_hamiltonian() on generic graphs vs. dimension() and it's certificate-option on posets? Those seems to be OK, but in general: when to have "factor" and when "is_prime(certificate=True)"?

dkrenn commented 9 years ago

comment:25

Replying to @jm58660:

First of all, I am very glad to see someone interested in making better documentation, even when it comes to small things!

:)

Should we have even manual for documenting "common special types" of functions? "If the function has no arguments and only returns True or False, then do like this: - -"?

You mean like short examples for some special cases...sounds like a good idea.

What do you think about hamiltonian_cycle() and is_hamiltonian() on generic graphs vs. dimension() and it's certificate-option on posets? Those seems to be OK, but in general: when to have "factor" and when "is_prime(certificate=True)"?

This is not that easy; I can see plus-points for both sides: Using such a flag means to have only one function for the desired functionality, which is usually a good thing. On the other hand, getting different a output type depending the input is sometimes not seen as good practice. In particular, the certificate-option does not give back the dimension, but only the certificate, which is kind of weird, since I would expect a method called dimension to return a dimension in any case...

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:26

This is not that easy; I can see plus-points for both sides: Using such a flag means to have only one function for the desired functionality, which is usually a good thing. On the other hand, getting different a output type depending the input is sometimes not seen as good practice.

If you want to start changing these things, please add me in Cc of the tickets. I do not consider that such "variable" outputs are a problem, given that the output format is decided by the input: you get what you asked for, exactly as if you had called a different function.

Nathann

dkrenn commented 9 years ago

comment:27

Replying to @nathanncohen:

This is not that easy; I can see plus-points for both sides: Using such a flag means to have only one function for the desired functionality, which is usually a good thing. On the other hand, getting different a output type depending the input is sometimes not seen as good practice.

If you want to start changing these things, please add me in Cc of the tickets. I do not consider that such "variable" outputs are a problem, given that the output format is decided by the input: you get what you asked for, exactly as if you had called a different function.

Actually, I don't intend to. I know that there are different oppinions on this and I am fine as it is (as I said, this is not that easy...). (And I remember several discussions on these things; I think on sage-devel...). So, no need to worry ;)

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:28

Actually, I don't intend to. I know that there are different oppinions on this and I am fine as it is

Thank you. I am afraid that the fashion is with defining 'standards', and standards awfully sound like 'everybody must do what I think is right'.

Nathann

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:29

To be exact: a certificate for dimension() is really a certificate, not the. (And it shows that the dimension is at most something, not that it is exactly that.)

For me a "standard" in this context means just "default". That is, a developer might choose to not apply it - but he or she should do it on purpose and explain why. And that means more unified user experience.

Now, let's think about "specially simple" functions. The simplest possible (really used) function is something without input at all and returning a boolean. Lets suppose that a foo is xyzzy or not. The name should be is_xyzzy(). If possible, should we explain the meaning in oneliner? (For example has_top = "Has unique maximal element".) Or always have the explanation in another paragraph? And do we use the format

"Return True if the foo is xyzzy, and False otherwise." (and no OUTPUT block at all)
"Test whether the foo is xyzzy." (OUTPUT maybe "A boolean" or "True or False)
"Return if the foo is xyzzy." (OUTPUT like former)
Something else?

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:30

Should we avoid technically exact term "iterable" and just say "list" instead? On posets is_linear_extension() says on INPUT block "l – a list (or iterable) containing - -". I think that it is easier for beginner to understand, and advanced user knows that Sage almost never checks for exact type.

dkrenn commented 9 years ago

comment:31

Replying to @jm58660:

Should we avoid technically exact term "iterable" and just say "list" instead?

IMHO, no. If an iterable is possible, then this should be stated. In particular since Python 3 uses a lot more iterables/iterators and thus iterable is a very basic Python vocabulary ;) Also writing something unprecise or incomplete is also not a good idea.

On posets is_linear_extension() says on INPUT block "l – a list (or iterable) containing - -". I think that it is easier for beginner to understand, and advanced user knows that Sage almost never checks for exact type.

I find this ("a list (or iterable)") good, since it says list (the beginner will know what this is), but also mentions iterable, so this it is clear that iterables are allowed and the advanced user and iterator-loving user will be happy :)

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

comment:32

Replying to @dkrenn:

On posets is_linear_extension() says on INPUT block "l – a list (or iterable) containing - -". I think that it is easier for beginner to understand, and advanced user knows that Sage almost never checks for exact type.

I find this ("a list (or iterable)") good, since it says list (the beginner will know what this is), but also mentions iterable, so this it is clear that iterables are allowed and the advanced user and iterator-loving user will be happy :)

OK. So maybe the function do_something(l) could have one line description in the index "do_something() - Do something to a given list.", then one line description in the beginning of the docstring like "Do something to the list l." and INPUT would be something like "- l - a list (or iterable) containing elements to be...".

(Btw, maybe I should mention that I am not a real mathematician. I am just a computer support, it admin etc. I have been years in the front line between the code and the user. That's why I think "How this can be understood wrong" and "Can I use a hour to save a minute for each of >60 users." and so on.)

fchapoton commented 9 years ago

comment:33

Does not apply.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago

comment:35

A test. Ignore.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago

comment:36

This is a longer test to see if trac just don't accept a string longer than few bytes. Please ignore.

A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text.

A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text.

A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text. A placeholder text.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago

comment:37

I would like to make a comment to comment number 33. It would be

True. But before merge: do we have common view of what to do? I.e. should I move this to sage-7.1 or wontfix?

but for some reason I can not write it as a comment.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago

comment:38

Still more test. An italic text in this.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago

comment:39

I think that there is no common view for referring to self in docstring: i.e. to say "Return foo number of the xyzzy." or "Return foo number of self."

Shall we try to document some common cases like functions returning a Boolean ("Test whether..." vs. "Return True if...") or certificate-option?

Or just close this ticket as wontfix?

f29946bc-ee7b-48cd-9abc-3445948c551d commented 7 years ago

Changed reviewer from Nathann Cohen to none

f29946bc-ee7b-48cd-9abc-3445948c551d commented 7 years ago

comment:40

Partly done in other tickets, no common view in this ticket. I suggest this one to be closed.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 7 years ago

comment:41

Travis, I suggest this one to be closed.

tscrim commented 7 years ago

comment:42

Sounds good to me.

tscrim commented 7 years ago

Changed author from Vincent Delecroix, Jori Mäntysalo to none

tscrim commented 7 years ago

Reviewer: Travis Scrimshaw

sagemath / sage

Better description of docstrings in the developer guide #19041