Closed rosiealice closed 6 years ago
If we're trying to make a splash with the release, I'd wait until mid-September, when people will be more likely to be paying attention. Something along the lines of never launch a new product in August...
However the timing of a soft release to those with a declared interest probably would not matter. I totally agree with the open community approach and effort to foster the broad participation you are advocating. It might be valuable to reach out to this smaller group first and learn from them what they'd like to see...
The only potential issue with your suggested outreach that I can see is that if you open up the existing modeling calls and they become too large and diverse, the NGEE modeling team will lose a valuable forum for frank discussion and progress.
Alistair
On Jul 28, 2017, at 4:27 PM, rosiealice wrote:
Our plan for releasing FATES is that it will become a community model, adopted by a broad user base, and so I thought now would be a good time to give a little thought to how to help that along the way. The more people are interested in, use and develop FATES, the better it will be, the higher profile our papers will be, and the more widely it will be adopted, etc. etc.
For those of us invested in FATES already, the benefits of more participation, IMHO, far outweigh the potential costs. Those costs are typically 1) too many cooks spoiling the broth* and 2) people treading on each other's toes. Both are primarily avoidable by good communication, I have found, for which this github thing is ideally suited.
This issue is intended to discuss details of potential users, discuss access, roadblocks, etc. I think the first step is to compile a list of individuals who may be interested in using FATES, from prior conversations, hunches you might have, etc. Then we can think about how to actively engage with these folks.
Possible outreach activities might include:
- Introductory targeted email, indicating location of github repo, signup details, plus synopsis of ongoing activities.
- Opening up existing NGEET modelling calls externally
- Telecon tutorials on how to use FATES
- Identification of funding opportunities to enable FATES development & applications
- Ultimately, maybe, some sort of tutorial workshop?
My initial list of people who I suspect might be (even tenuously) interested is as follows:
- Quinn Thomas (land use and N cycling perspective)
- Abby Swann & group (mortality feedbacks in CESM, etc.)
- Jeremy Lichstein (myriad of classical ecology problems)
- Belinda Medlyn (mentioned interest to me at Exeter New Phyt meeting)
- Brendan Rogers (has long-standing interests in demography/fire in CESM)
- Bill Hoffman (fire-veg interactions)
- Dave Moore, Andy Fox, Mike Dietze (PalEON project)
- Tara Hudiberg, Bev Law (Western US forests)
- Polly Buotte (Beetle kill, more Western US forests)
- Josh Fisher (various tropical applications)
- Peter Lawrence (land use problems)
So, my questions are: has anyone else indicated interest in using the model? Have I forgotten anyone? How should we reach out to them? Is now a good time? How can we increase visibility to reach more people?
Tagging @ckoven, @rgknox, @lmkueppers, @jqchambers, @tompowell9, @xuchongang, @jenniferholm, @bchristo, @mdietze, @climate-dude, @serbinsh, @alistairrogers, @walkeranthonyp, @dlawrenncar, @kvcalvin, @aswann, @mpaiao for ideas!
Thanks in advance for any input :)
-Rosie
*see https://www.youtube.com/watch?v=JibxHpXqAfc
In response to @alistairrogers's concerns, and as someone on the cusp between inside and outside the team, I'd say that opening up the code base to be public doesn't imply opening up the telecon, meetings, etc. But what it does imply is that you have some mechanism for how to handle Github issues raised by the community (bug reports, feature requests, general discussions) and for deciding on whether to accept pull requests (PRs) from the outside community. Github has greatly facilitated all of this for our team, but PEcAn is very modular, making it easy for us to accept community PRs.
Community bug reporting will be great, but it's also quite likely that FATES could start seeing new 'features' being added by other teams, which could result in a model that explodes in complexity (additional processes, additional flags, additional parameters, etc.). I've also seen, from my time working with the ED2 community, how a 'fix' in one biome can cause another to go wonky. This is why my team has been spending a lot of time thinking about both automated data-based benchmarks (in addition to model execution benchmarks, which I know you already have) and statistical parsimony. In other words, a new feature needs not only to reduce residual error (any 'fix' that breaks more than it fixes is an obvious reject); that reduction in residual error must also more than offset the inevitable increase in parameter error. I don't think you need to have all of this in place to make the code public, but I think it's important to have discussed how new code gets in.
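As a rough illustration of the parsimony criterion described above, the check could be sketched as follows. This is only one possible reading: it assumes total predictive error can be treated as the sum of residual error and parameter error (both as variances), and the function name and the numbers are hypothetical, not taken from PEcAn or FATES.

```python
def accepts_feature(residual_before, residual_after, param_before, param_after):
    """Return True if the proposed feature lowers total predictive error.

    Assumption (not from the thread): total error is treated as the sum of
    residual error and parameter error, both expressed as variances.
    """
    total_before = residual_before + param_before
    total_after = residual_after + param_after
    return total_after < total_before


# Hypothetical numbers for illustration only.
# Residual error drops by 0.6 while parameter error rises by 0.2: accept.
print(accepts_feature(4.0, 3.4, 1.0, 1.2))
# Residual error drops by 0.1 while parameter error rises by 0.5: reject.
print(accepts_feature(4.0, 3.9, 1.0, 1.5))
```

In practice the two error terms would come from a benchmark against data and from a parameter uncertainty analysis; the point of the sketch is only that a drop in residual error is not sufficient on its own.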
Finally, if you do want to do a Sept release, I'd really push for before Sept 15th, since that's the date scheduled for the next PEcAn release and we're really eager to have a public version of FATES in there.
> I think it's important to have discussed how new code gets in.

> open up the code base to be public doesn't imply opening up the telecon, meetings, etc.
Second that. Having a CONTRIBUTING.md document in the repo may be useful in this regard, both as documentation for would-be contributors and to force you to think through some of these things ahead of time. See https://github.com/blog/1184-contributing-guidelines.
Also, make it clear in your README that contributions are welcome and perhaps point people to specific issue numbers you'd welcome community contributions to?
> How can we increase visibility to reach more people?
Thoughts: take advantage of GitHub Pages, document WHY this is a desirable model to use, provide quick links to publications, publicize on Twitter, provide good user-facing tools and diagnostics. Point to PEcAn integration as a great way to test and contribute ( @mdietze ?).
Thanks for the feedback @alistairrogers, @mdietze and @bpbond. Great points all around. I think adding a CONTRIBUTING.md sounds like a win-win. I will make an issue to add one.
@mdietze, I've been playing around with some ideas regarding continuous integration, which would at least implement build testing upon PRs, and possibly lightweight run/regression testing on AWS EC2 (which I know you guys use in PEcAn). It is hard to anticipate how much of a spike in PR volume we would see with "public" interaction, but I do believe continuous integration features (implemented correctly) could help.
So, I guess that the release code would be public (implying increases in bug reports, etc.), but, analogous to the CLM, getting access to the developer repo will still require asking nicely for access, and so PR's would be limited to the set of people on the developer repo. In the case where we have lots of contributions, we will most likely need to explicitly set priorities for integration, and also increase our SE resources. But that would be better than the alternative...
> getting access to the developer repo will still require asking nicely for access, and so PR's would be limited to the set of people on the developer repo.
With all respect, there seems to me to be a big contradiction between this and your goal of becoming a "community model". Why would you not have a single repo, with write access limited to a small group obviously, but open for cloning/PRing by anyone? In other words, consider following the PEcAn approach, not the CLM one.
I'm going to second @bpbond's suggestion here -- please don't hide the 'live' code in a private repository and restrict the rest of the world to getting a new tarball every few years. In addition to a public repository making it more of a true 'community' model, if the repo is private no one will be able to pull bugfixes into their branches. Also, when code improvements can't be pulled in frequently, the resulting contributions (when they do come) become huge and break everything, or result in many divergent forks (both these problems plague the ED2 code base).
In terms of the effort required, my experience is that PRs follow a pretty exponential distribution (e.g. https://github.com/PecanProject/pecan/graphs/contributors), with folks outside the core team rarely being anywhere except in the tails (and those few who are not are extremely valuable!). While I was the one who raised the issue of needing to have a policy about outside PRs, I've never found it burdensome. Furthermore, adding a basic 'lightweight' continuous integration (e.g. does it compile and complete a very simple test run) means that you'll know quickly if anyone's submitting code that definitely breaks things even before you run it through your more extensive formal test suite.
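A minimal sketch of the kind of 'lightweight' check described above (does it build, does a very short run complete) might look like this. The step names and commands are placeholders: a real setup would call the model's own build script and a short test case, neither of which is specified in this thread.

```python
import subprocess
import sys


def smoke_check(steps):
    """Run each (name, command) step in order; stop at the first failure.

    Returns (True, None) if every step exits with status 0,
    otherwise (False, name_of_failed_step).
    """
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False, name
    return True, None


# Placeholder commands (assumptions): these stand-ins only exit with a chosen
# status so the sketch runs anywhere Python is installed.
passing = [
    ("build", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("short_run", [sys.executable, "-c", "raise SystemExit(0)"]),
]
failing = [
    ("build", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("short_run", [sys.executable, "-c", "raise SystemExit(1)"]),
]

print(smoke_check(passing))
print(smoke_check(failing))
```

A check like this only says that a PR does not break the build or a trivial run; it is meant to sit in front of, not replace, the more extensive formal test suite.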
I think I feel open to the idea of making NGEET/fates publicly readable. My only reservation is that I want to protect people who are writing cutting-edge modules, so they can get their manuscripts going before we release that code to the wild. But I think that in reality, opening up to public access really doesn't generate more scooping.
The other thing to consider is that we have such an incredibly low bar of entry as it is; we really haven't turned people away yet. I wonder if we could implement a way to automatically authenticate new users via a web portal, where they click the "I agree" button and are given read-only privileges.
@rgknox, talk to Alice Bertini in CGD/CSEG for CESM; she's put together something that's somewhat along the lines of what you are talking about. In CESM we've had a mechanism for having people agree to policies and then get them an SVN password. She's got an update to that form that includes getting their github username. And she's putting together something that will update github with their account information and put them on the appropriate teams and such.
See https://csegweb.cgd.ucar.edu/svnuser/cgi-bin/update_access_form.cgi
Thanks @ekluzek, I think Alice and this mechanism would be a great resource if we start moving in that direction.
This will be my last comment, and then I'll leave your poor issue alone :)
> But I think that in reality, opening up to public access really doesn't generate more scooping.
Yeah, I personally think most people drastically overestimate the likelihood of this happening. I've developed whole papers online, publishing data in real time and tweeting about it, and no one scooped us. I've given talks on in-prep syntheses, providing the GitHub link in public forums, and no one has done so.
Not saying it's impossible, but rather vanishingly unlikely, especially given the nontrivial technical requirements of running FATES.
Thanks, Rosie, all, for discussing this. From the start we've had a commitment to making the FATES code publicly accessible. To me the practical challenge comes in deciding what goes into the trunk and who does that work and what gates would need to be passed. So long as the bulk of code contributions continue to come from/be handled by the core development team and priorities can be set by that core team (i.e., to meet their science objectives), then I don't see any major issues.
If, for a real example, Ryan is the primary person handling these broader contributions and they start to interfere with his funded priorities, it seems folks interested in more support for community activities would need to seek out those funds/personnel.
If we have the automated code of conduct/philosophy that folks sign up to when they access the repository, and have a statement about how developments could make their way to the trunk (without promises that they will), and, finally, track where code contributions are coming from/associated workload then it seems we can revisit as we get more experience.
Lara
On 8/3/17 12:31 PM, Ben Bond-Lamberty wrote:
This will be my last comment, and then I'll leave your poor issue alone :)
But I think that in reality, opening up to public access really doesn't generate more scooping.
Yeah, I personally think most people drastically overestimate the likelihood of this happening. I've developed whole papers online, publishing data in real time (http://iopscience.iop.org/article/10.1088/1748-9326/11/8/084004) and tweeting about it (https://twitter.com/BenBondLamberty/status/671999991976730624), and no one scooped us. I've given talks on in-prep syntheses, providing the GitHub link in public forums, and no one has done so.
Not saying it's impossible, but rather vanishingly unlikely, especially given the nontrivial technical requirements of running FATES.
It would be helpful to have some sort of introduction to running the code. Either a written document (on the wiki?), video conference, or something similar to get users started. It's possible you have this already but it wasn't obvious to me how to find it on this site (I didn't look carefully though).
Abby
We also have these access guidelines already: https://github.com/NGEET/fates/wiki/Use-and-distribution-policy-for-fates-developer-repository
but as far as I understand, no mechanism for people to explicitly sign up to them...
FWIW, I think these guidelines are pretty robust at protecting contributors, and have directly been employed in the past to do so. It's also much easier to persuade people to contribute if there is some method to reassure them that their work won't be scooped. This function is perhaps more useful than the IP protection itself.
Notes from call on 10 August.
Thoughts on repo access policy: @ckoven noted that most people running FATES will need access to the CLM or ALM repos, both of which have developer access agreements in the first place. If we could make the process slick and easy it wouldn't deter people contributing, and would protect contributors per the point above.
Additions to the list of possible collaborators:
- Gil Bohrer
- Ashley Matheny
- Pierre Gentine
- Peter Horvath (Norway)
Ideas for outreach:
- Having an in-person training session as an extra day after the winter LMWG meeting 2018
- Following up on @aswann's ideas above, including an updated wiki on running the code.
Danica tells me that hui.tang@geo.uio.no is also interested in FATES... As is Ashley Matheny at UT Austin.
I just noticed that on the public FATES release, where we might in principle direct people, there is no obvious sign that we also have a development repo, nor instructions on how to get access to it. I think it's important to message that there is a lot of work going on behind the scenes and that we want people to join in. Should we amend this? Have I missed it?
And there is a public NCAR/fates-release repo as well. We probably should make sure those public ones have a wiki that points to the private repo.
And note that since these "-release" ones are public, anything in them can be "scooped" by someone in terms of writing papers or taking intellectual property. We have to do it this way in order to have a reasonable way to interface with subversion. The subversion repository requires you to accept specific terms -- but the "fates-release" repos do NOT. So although there is protection for people signed up for the CESM repo -- there isn't that protection for people who happen upon "fates-release".
Also, if you want people to "join in", is there really a reason that "fates" is private? We could get rid of NGEET/fates-release if fates were public. We probably still need the NCAR one, at least until NCAR converts CLM to github. The advantage of having fates private and NGEET/fates-release public, is that you hide the complexity, the work, and the bugs in the private "fates" version (essentially hiding some of the dirty laundry in creating fates). But, if you don't look at that as a feature -- and instead look at it as a bug -- why are we doing it?
Since we've now had the FATES tutorial and release, have a contributing.md, have the tutorial documents posted online, and have worked out every issue in this thread other than making the repo public, and since that last point is now being discussed in #332, I'm closing this older thread.