Open ayshih opened 4 years ago
If it's being stored anywhere it belongs in a nice rst doc file.
I am in favour of removing it. Even in most cases where a file started off as the work of one or two people, it rarely stays like that for very long, and then the comments never get updated.
I'm also in favor of removing it. At best it is redundant as authorship is already recorded in the git history and at worst it is inaccurate. It also has to be maintained by hand.
Whichever way we decide, I don't think we should do this on a subpackage level. It should be consistent across the entire code base.
Before we make a final decision though, I think we should get the opinions of those whose names we would be removing. Searching both __author__ and __authors__, it looks like that's
If it's being stored anywhere it belongs in a nice rst doc file.
That doesn't seem practical in terms of format or maintenance. Authorship is naturally coupled with individual code files, so creating and maintaining a separate list of authorships is even more work.
Even in most cases where a file started off as the work of one or two people, it rarely stays like that for very long, and then the comments never get updated.
Certainly there are many code files where it would be impractical to list the authors as individuals, and the authorship should just be "SunPy collaboration" or something like that. However, there are also parts of the code – \<cough> coordinates \<cough> – where it really is just an extremely limited set of contributors, and may always be.
At best it is redundant as authorship is already recorded in the git history and at worst it is inaccurate.
The "redundancy" argument probably bothers me the most. Git is a development tool, not a documentation-for-users tool. Also, Git history keeps track of editors, not of authors, and it arguably can be muddled at that too. If I autopep8 a file, I shouldn't gain authorship. If I re-order function definitions in a file, I shouldn't gain authorship.
It also has to be maintained by hand.
Okay, maybe this bothers me more. We already ask "a lot" for code contributions – PEP 8, changelog entries, etc. – so it doesn't seem like a huge burden to also include updates to authorship as needed. (Admittedly, email addresses can become obsolete, but Git history doesn't fix that.)
That doesn't seem practical in terms of format or maintenance. Authorship is naturally coupled with individual code files, so creating and maintaining a separate list of authorships is even more work.
Nor is it practical to add a new author each time a commit is made to a file or a collection of files.
If you want to acknowledge specific people for their contribution to a file or a package, an acknowledgement section in a docstring at the top of the file or package __init__ is something I would be ok with.
This is more visible than __authors__ and will be in the docs as well, so more people can see it.
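For illustration, an acknowledgement section of this kind might look like the sketch below. The module contents and the contributor names are made-up placeholders, not real sunpy content, and the "Acknowledgements" heading is just one possible convention. The source is held in a string and read back with the standard-library ast module only so that the example is self-contained.

```python
import ast

# Hypothetical source of a module. The purpose of the module and the
# contributor names are placeholders, not real sunpy content.
EXAMPLE_MODULE_SOURCE = '''
"""
Utilities for an example subpackage.

Acknowledgements
----------------
This module builds on substantial early work by Jane Doe and John Smith.
"""

def helper():
    return 42
'''

# The docstring travels with the file and can be read without importing it,
# so any tool that renders module docstrings can pick up the acknowledgement.
module_doc = ast.get_docstring(ast.parse(EXAMPLE_MODULE_SOURCE))
print(module_doc)
```

Unlike a module-level dunder, the acknowledgement here is free text, so it can thank people without making a claim about who the "authors" are.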
The "redundancy" argument probably bothers me the most. Git is a development tool, not a documentation-for-users tool. Also, Git history keeps track of editors, not of authors, and it arguably can be muddled at that too.
How is __author__ documentation for users? It doesn't give them any useful information.
If I autopep8 a file, I shouldn't gain authorship. If I re-order function definitions in a file, I shouldn't gain authorship.
Why? If someone makes a change to a file why exclude them? Why are their contributions so useless to not deserve authorship?
Okay, maybe this bothers me more. We already ask "a lot" for code contributions – PEP 8, changelog entries, etc. – so it doesn't seem like a huge burden to also include updates to authorship as needed. (Admittedly, email addresses can become obsolete, but Git history doesn't fix that.)
We (try to) ask contributors to make meaningful changes. I am not sure that this would fall into the same category.
I don't necessarily have objections to authorship information being removed, but I strongly dislike the offered justifications: that Git history is a competent substitute (it isn't), and that we're too lazy as maintainers (I'm not, at least).
I'd like to know whether there is consensus that authorship information is something we want to record on this project. If so, we can debate how (and I don't think Git should be the way). It doesn't have to be __author__; docstrings are probably a better approach. But, any record of authorship would need to be maintained.
It could instead be the stance that authorship information should be intentionally excluded, in a more "we are one" approach. For example, the project could feel that the inevitable arguments about authorship updates – whether code changes are substantial enough to warrant gaining authorship or whether someone's contributions have been so completely replaced that he/she should lose authorship – are deleterious to the project.
Okay, I've ranted enough about this. Time to add this as a topic for the coordination meeting!
I don't necessarily have objections to authorship information being removed, but I strongly dislike the offered justifications: that Git history is a competent substitute (it isn't), and that we're too lazy as maintainers (I'm not, at least).
I don't disagree with you here. But I don't think the concept of authorship on a piece of code that sees maybe 20 people working on it is clear-cut enough to warrant inclusion within our code.
Personally I think if we want to say that someone has contributed to sunpy or a piece of code, we should acknowledge them but I don't think authorship is how we should go about that.
It could instead be the stance that authorship information should be intentionally excluded, in a more "we are one" approach. For example, the project could feel that the inevitable arguments about authorship updates – whether code changes are substantial enough to warrant gaining authorship or whether someone's contributions have been so completely replaced that he/she should lose authorship – are deleterious to the project.
This, I think, should be the goal. There should only be the project, and "we" are cogs of that project.
Fyi, I'm completely fine with either decision and leave it up to the current devs to decide :+1:
Late to the discussion, but here's what I think.
The danger of keeping author information for the users (i.e., not via git) is that users would be tempted to contact the author individually rather than through issues or the mailing list. That would not be good! The developer contacted would then have to make the effort to actually report the issue upstream.
The other point, acknowledgement, is a tricky one. People need to be acknowledged for what they do! But either we make it easier (and educate people about) how to do so, or hardly anyone will check that metadata, even if it is included in docstrings. I've been planning to test ImperialCollegeLondon/R2T2, which would extract all the citations of the pieces of software you use (if annotated), but that would (normally) acknowledge the algorithm and not the implementation. I imagine we could add something similar for acknowledging the implementation if we want to do so.
If this information is to be kept to show how much I did, then I believe the git history is the way to go. Any of the developers can generate such a list and show it on their website if they wish to do so. But I would be cautious about relying on that metric alone. There's a lot of work that's not reflected in commits (code review, community interaction, mailing list discussions, ...).
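As a rough sketch of how such a list could be generated, the snippet below wraps `git shortlog -s -n`, which prints one line per person with their commit count. The function names and the default arguments are assumptions made for this example, and commit counts measure edits rather than authorship, as discussed earlier in the thread.

```python
import subprocess


def parse_shortlog(text):
    """Parse `git shortlog -s -n` output into (commit_count, name) pairs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is a right-aligned count, whitespace, then the name.
        count, name = line.split(None, 1)
        entries.append((int(count), name.strip()))
    return entries


def commit_counts(path=".", repo="."):
    """Count commits per person touching `path` in the Git repository `repo`."""
    result = subprocess.run(
        # HEAD is given explicitly so git does not try to read a log from stdin.
        ["git", "shortlog", "-s", "-n", "HEAD", "--", path],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_shortlog(result.stdout)
```

Calling `commit_counts("sunpy/coordinates")` from inside a clone would then give a per-person tally for that subpackage, assuming that path exists in the checkout.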
Coming back round to this, I'm +1 for removing authorship for reasons already given above.
I still feel this way:
I don't necessarily have objections to authorship information being removed, but I strongly dislike the offered justifications: that Git history is a competent substitute (it isn't), and that we're too lazy as maintainers (I'm not, at least).
I'd like to know whether there is consensus that authorship information is something we want to record on this project. If so, we can debate how (and I don't think Git should be the way). It doesn't have to be __author__; docstrings are probably a better approach. But, any record of authorship would need to be maintained.
Do you think authorship information is something we want to record then?
Do you think authorship information is something we want to record then?
Yes, I think it has value. Of course, if the authorship is tracked per file, there are indisputably files that are too insane – I'm looking at you, mapbase.py – to be credited more finely than "SunPy developers".
To quote myself again:
It could instead be the stance that authorship information should be intentionally excluded, in a more "we are one" approach.
If that is agreed to be the project stance, it should be explicitly documented.
How should we come to a decision on this then? It looks like @nabobalis, @dstansby, @wtbarnes, @Cadair and possibly @dpshelio are in favour of removing it and @ayshih is in favour of keeping it. @ayshih, would you be happy with that being enough of a majority to decide and document that "authorship information should be intentionally excluded, in a more 'we are one' approach"?
Maybe not "happy", but I'll certainly accept a decision made by the group as long as it's not poorly justified.
this has come up again on the community call - I think the consensus of @wafels @nabobalis @wtbarnes and myself are that they should probably go
maybe a page on the docs of a "thanks to" or emeritus contributors section
My stance hasn't changed, so I'll simply reiterate that I want the justification for the decision to be rooted in aspiration (running towards a "we are one" philosophy) rather than in fear (running away from the burden of authorship deliberation and maintenance).
The authorship of this package is given as the sunpy community, the same goes for any publication.
This should be extended to the files that have the __author__.
How about we add something to the dev guide along the lines of
Given the wide array of contributions from many authors over a number of years, the "author" of the sunpy package should be regarded as "The SunPy Community" rather than any one individual. As such, the __author__ and __email__ module-level dunder names should not be included in any source file within the sunpy package.
If we wanted, we could even draft an SEP with similar language.
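If a rule like this were adopted, it could in principle be checked automatically. The following is only a sketch under assumptions: the function names and the set of forbidden names (which adds __authors__, mentioned earlier in the thread) are invented for this example and are not an existing sunpy tool. It uses the standard-library ast module to look for module-level assignments.

```python
import ast
from pathlib import Path

# Assumed set of names to flag; the proposed wording only lists
# __author__ and __email__.
FORBIDDEN_NAMES = {"__author__", "__authors__", "__email__"}


def find_authorship_dunders(source):
    """Return the sorted forbidden names assigned at module level in `source`."""
    found = set()
    # Only top-level statements are inspected, so names assigned inside
    # functions or classes are ignored.
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign):
            targets = [node.target]
        else:
            continue
        for target in targets:
            if isinstance(target, ast.Name) and target.id in FORBIDDEN_NAMES:
                found.add(target.id)
    return sorted(found)


def scan_package(root):
    """Map each offending .py file under `root` to the names it defines."""
    offenders = {}
    for path in sorted(Path(root).rglob("*.py")):
        names = find_authorship_dunders(path.read_text(encoding="utf-8"))
        if names:
            offenders[str(path)] = names
    return offenders
```

Running `scan_package("sunpy")` from the root of a checkout would list the files that still carry these names, which is also one way to find whose names would be removed.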
this has come up again on the community call - I think the consensus of @wafels @nabobalis @wtbarnes and myself are that they should probably go
I'll add my name to this list, running bravely towards "we are one"
Let's have a discussion about whether to remove authorship information from the code itself (e.g., __author__ and __email__). (See triggering comment thread.)
Some reasons to remove it:
Some reasons to keep it:
sunpy as a downloaded package doesn't have its Git history, so no authorship information would be available locally via Git.
If we do choose to keep this authorship information, it needs to be maintained. Perhaps this decision could be made on a per-subpackage basis, so that subpackage maintainers can choose to accept that responsibility?
Discuss.