manaakiwhenua / manaakiwhenua-manifesto

A manifesto and code publishing framework for the Manaaki Whenua - Landcare Research GitHub account
Creative Commons Attribution 4.0 International

Personal repositories -> institutional repositories #8

Open tretherington opened 3 years ago

tretherington commented 3 years ago

How should we deal with a situation where someone has developed/published a personal repository, perhaps before beginning work at an institution, but then wants to develop that repository further?

bhjolly commented 3 years ago

Probably continue to use the personal repo, especially if others have already cloned it. MWLR doesn't own work you did before starting at MWLR.

tretherington commented 3 years ago

👋 Ben! But the institution will own the new developments I guess?

If the institutional copyright date + name is added to the license, along with the original author's personal copyright date + name, then that marks the dual ownership situation, so I think I'm happy with that.
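As a rough illustration of what that dual notice could look like, here is a minimal sketch that builds one copyright line per holder. The `CopyrightHolder` class, the `copyright_lines` helper, and the names and years in the usage example are all hypothetical; the actual wording would depend on the license in use and on legal advice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CopyrightHolder:
    """One copyright holder and the years they contributed (hypothetical model)."""
    name: str
    first_year: int
    last_year: Optional[int] = None  # None means a single year


def copyright_lines(holders: List[CopyrightHolder]) -> List[str]:
    """Return one 'Copyright (c) ...' line per holder, in the order given."""
    lines = []
    for holder in holders:
        if holder.last_year is None or holder.last_year == holder.first_year:
            years = str(holder.first_year)
        else:
            years = f"{holder.first_year}-{holder.last_year}"
        lines.append(f"Copyright (c) {years} {holder.name}")
    return lines


# Hypothetical example: personal work first, institutional work afterwards.
notice = copyright_lines([
    CopyrightHolder("A. Developer", 2017, 2019),
    CopyrightHolder("Example Institution", 2020),
])
print("\n".join(notice))
```

Listing the original author first keeps the history of the project visible in the notice itself.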

But from a branding/advertising perspective it might be good for someone to see that the institution (via the employee's activity) is coding on a particular topic within the institutional GitHub account. If the personal account was used then this activity would be hidden from an institutional perspective, which might be important in some contexts.

I guess the original personal repo could remain as the main reference point (especially if it has already been published in a package service like CRAN or PyPI, or in a software paper, and so users are used to looking there for the authoritative version) and an institutional fork could then be a place to demonstrate development, with pulls back to the personal repo when required?

That might provide long-term consistency in where a project lives, while also allowing an institution to demonstrate/track activity in a certain topic.

As I think about this more, this is indeed what would happen if the project wasn't controlled by a new staff member, but it just seems a bit weird to be forking one of my personal repos under the MWLR account, to then develop, and then raise a pull request back to myself in the personal account! 😜

bhjolly commented 3 years ago

I guess so, as much as any organisation 'owns' their employees' contributions to open source software?

The idea of forking is interesting. It could create a weird situation where the authoritative source is not under active development, so contributors would have to decide which repo to develop for. It could also hide individual contributions? Not sure exactly how that would work. I tried to have a quick look at 'GitItBack' (https://www.netguru.com/gititback), which supposedly tracks organisational contributions through the GitHub API (presumably via developer affiliations), but the page wouldn't load for me so I couldn't test it.
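To make the tracking idea a bit more concrete, here is a minimal sketch of one way organisational activity could be estimated. This is not how GitItBack works (that could not be checked); it simply assumes that the author's email domain is a usable proxy for affiliation. The input is a list shaped like the JSON returned by GitHub's "list commits" REST endpoint, where each item carries the author email under `commit.author.email`. The function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_by_email_domain(commits: Iterable[Dict]) -> Counter:
    """Count commits per author email domain.

    Assumption: email domain approximates affiliation. Commits with no
    usable email are counted under 'unknown'.
    """
    counts: Counter = Counter()
    for item in commits:
        author = (item.get("commit") or {}).get("author") or {}
        email = author.get("email") or ""
        if "@" in email:
            domain = email.rpartition("@")[2].lower()
        else:
            domain = "unknown"
        counts[domain] += 1
    return counts


# Hypothetical sample, shaped like parsed API output.
sample = [
    {"commit": {"author": {"name": "A", "email": "a@institute.example"}}},
    {"commit": {"author": {"name": "A", "email": "a@Institute.example"}}},
    {"commit": {"author": {"name": "B", "email": "b@users.noreply.github.com"}}},
    {"commit": {"author": None}},
]
print(tally_by_email_domain(sample))
```

One obvious limitation: contributors who commit with a GitHub noreply address, or with a personal address, would not be attributed to the institution at all, which is close to the "hidden activity" concern raised above.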

brandonnodnarb commented 3 years ago

@tretherington, w.r.t.

> I guess the original personal repo could remain as the main reference point (especially if it has already been published in a package service like CRAN or PyPI, or in a software paper, and so users are used to looking there for the authoritative version) and an institutional fork could then be a place to demonstrate development, with pulls back to the personal repo when required?

I can see how leaving the original repo for posterity is necessary (or good practice). Perhaps I'm oversimplifying, but how about leaving a short note on the original repo telling users that it is stale (ok, archived) and that the project is now actively developed at [link to institutional repo]? That happens all the time as projects move, for example from SourceForge to git (and so on).

bhjolly commented 3 years ago

I think we need some clear guidance from MW legal over what claim MW wants to make over employee contributions to OSS. I think that an OSS project that exists prior to an employee joining MW should not fall under MW control. MW can choose to contribute to that project via employees, but they don't own the project. One alternative could be to require MW employees to maintain a MW GitHub account for contributions made while working here, though for developers who like to maintain a portfolio that could be a bit tricky.

I think a lot of this would depend on the project, the individual, and the repo license. I would definitely recommend forking over migration, unless it's a very small project about to receive a massive MW-funded upgrade. If we want to attract scientists/developers and maintain a good reputation in the NZ science community, then taking people's repos off them is probably a bad idea.

alpha-beta-soup commented 3 years ago

I think this needs to be situational. Hiring someone because they are a/the core contributor to an open source software project in order to pay them to keep working on that project is quite different to someone who has made contributions to some project that is useful to them (e.g. as a generic tool like numpy or GDAL), and then continues to make some contributions to that on MWLR time, even if those contributions are significant.

The options seem to be:

  1. Migrate (akin to an svn → git migration): orphaning the original and possibly even removing the history
  2. Fork (personal/core repo → MWLR repo): with the potential for PRs back to the core repo
  3. Require MWLR-time contributions under a purpose-made MWLR organisational-personal GitHub account
  4. Do nothing

1 is pretty drastic, but if the license needed changing it would likely be the best option.

2 is particularly good if the MWLR fork needs to diverge from the core in some strong way. It has the advantage of drawing people to https://github.com/manaakiwhenua and, when there are PRs back, of getting community peer review without disrupting others. It also enables contributions from the community to be incorporated into the MWLR fork, which is an "open core" practice that MWLR may be interested in.

3 is not a terrible option, and can also help with a clear legal/professional separation between what someone does in their own time, and on MWLR time, if that is a reputational concern to MWLR.

4 seems OK too. Assuming someone isn't selling directly competitive commercial services off their library on the side (which would probably be obvious), I don't expect this general situation to be problematic in practice. MWLR would still benefit from its time investment in the open source project through its use of it, as would others, just without the promotional interest. Perhaps an option 4a would be that people using personal GitHub accounts are encouraged or required to include a @manaakiwhenua mention in their GitHub bio/status thing. For example, https://github.com/mbostock :

[Image: screenshot from 2020-12-17 10-54-48]

apawlik commented 3 years ago

Hi all, thank you for starting the discussion and for all the insights. There is also a question (which I have been pondering for a while) of what to do with repositories that sit within personal accounts but were started when the person was already at MWLR, where the code in the repository is part of what they do (which is probably a fairly common case, since the organisational GitHub was set up only about a year ago). In all cases (the one above and the case you have been discussing), I think encouragement rather than requirement is the better approach. I would definitely like to consult with the lawyers on that and come up with a couple of options to set up best practice.

From the 'technical' point of view, there is also the possibility of transferring the repository, which preserves all history and contributions and doesn't break forks.
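For reference, a transfer can be done from the repository settings page, and GitHub's REST API also exposes a "transfer a repository" endpoint (`POST /repos/{owner}/{repo}/transfer` with a `new_owner` field). The sketch below only builds the request and does not send it. The function name, the owner/repo names, and the token are placeholders, and it assumes the token belongs to someone with admin rights on the repository and permission to create repositories in the target organisation.

```python
import json
import urllib.request


def build_transfer_request(owner: str, repo: str, new_owner: str,
                           token: str) -> urllib.request.Request:
    """Build (but do not send) a GitHub 'transfer a repository' request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/transfer"
    body = json.dumps({"new_owner": new_owner}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values: nothing is sent over the network here.
req = build_transfer_request("some-user", "some-project",
                             "manaakiwhenua", "PLACEHOLDER_TOKEN")
print(req.get_method(), req.full_url)
```

Actually sending it would be a call to `urllib.request.urlopen(req)`; it is left out on purpose, since a transfer changes ownership and should be a deliberate, agreed step.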

In my opinion, the main question when it comes to transfer or migration (less so when it comes to a fork) is ensuring that, wherever the code ends up, contributors can still showcase their portfolio.

tretherington commented 3 years ago

It looks like we're not the only people battling with this sort of thing, and this paper makes for interesting reading:

Alami A, Wasowski A. 2019. Affiliated participation in open source communities. PeerJ Preprints 7:e27827v1

It's a completely different domain, and not exactly what we are talking about, but the comments around IP concerns and undefined processes and policies seem particularly relevant.