Open shaunagm opened 7 years ago
@shaunagm thanks so much for this thoughtful and detailed analysis! I really enjoyed reading through this and think you raise important points. And this book overall seems really interesting...I'm a fan of anthropology and human-centric writing :)
One small note, about VCs not being quoted in the paper- Mark Suster from Upfront is quoted about the rise of open source (actually one of my favorite quotes!) but not about VCs investing directly into open source. After talking to a lot of folks, I'm a bit skeptical about the future of VC + open source and didn't think it was the most important perspective to include, given all the other material competing for attention.
This issue contains my notes for item 1 of the annotated bibliography. When they're a bit more coherent, I'll add them to the bibliography. Please feel very free to add your own comments.
A note: "Eghbal, N." is @nayafia - Nadia, please don't feel any pressure to respond/comment, but I thought I'd tag you to let you know I'd written this. :)
Citation: Eghbal, N. (2016). Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure. Retrieved from the Ford Foundation website: http://www.fordfoundation.org/library/reports-and-studies/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure
General thoughts
This is a great overview of the resource problems facing the open source community, and covers a lot of ground. Being an overview, it doesn't go into too much detail regarding any one problem. Its greatest strength is the number of individual projects and developers whose stories and frustrations are given voice by the report, many of which were new to me.
I'd recommend this to anyone who hasn't thought about open source sustainability in much depth but who is willing to take the time to read a 140-page report. For people with more knowledge, it's still worth reading, especially for the stories and quotes, but it's less valuable than it is for people just beginning to think about these topics.
Quotes & notes
Venture Capital
Eghbal has a background in venture capital, and so it's not surprising that the report contains insights into how VCs view open source. See:
Also:
As far as I can tell, there were no VCs quoted in this report. While I'm skeptical of VCs in general, I wish they'd been included.
Government Investment
While Eghbal details a variety of funding sources, including institutions like large corporations and software foundations, little time is spent on government investment in infrastructure. Given the huge role that government plays in physical infrastructure - something highlighted in the report - this speaks to a real failure on government's part. This is not to say that governments do not invest in open source at all - there are a number of projects like Tor and of course the internet and web themselves that have been publicly funded - but their impact is not as great as it could (arguably should) be. In some ways, governments are making things harder:
The burden on open source maintainers is of course one of the through lines of this report, but the report is inconsistent in drawing out what exactly these burdens are. Relatively early on, the author writes "Opening up a project to the public can mean less work for the company, which is essentially crowdsourcing improvements" (p. 47) but later on acknowledges that open source projects are often more work:
You could say that Eghbal is drawing a distinction between filing bugs and feature requests, and "contributing", but I'm not sure how useful a line that is to draw. (Nor am I sure this is Eghbal's actual intention.) Many bug reports and feature requests are incredibly valuable to maintainers, and many actual contributions cause more problems than they alleviate if they are poorly done, not a good match for the project direction, or simply too complex.
I think it's very important for us to explore what kinds of contributions (and contribution workflows, tools, and methodologies) are most beneficial to projects, but I don't think it will come down to "issues opened" vs "pull requests made".
Distributed management
Several of the stories in the report stress the value of distributed management:
Linux is a unique story, because between the creation of the Linux Foundation in 2007 and now there has been such a huge change in how the project is funded. Linux is arguably the best funded and most corporate of the open source projects out there. It seems likely that the switch to more distributed management was necessary to allow so many different corporations to make large investments of money and employee time.
There's also the fascinating story of Node.js:
It seems that Node.js has also pursued a distributed management approach:
Eghbal cites this article: "Healthy Open Source: A walkthrough of the Node.js Foundation’s base contribution policy", which I immediately added to my "to read" list.
That said, the Node.js community is mentioned as an unhealthy community in an unrelated section of the report:
Of course, the Node.js core could be beautifully maintained in a distributed fashion while the Node.js user community is splintering into chaos, but there's a tension here that I'd like to know more about.
High vs Low Quality Contributors
The report makes a useful distinction between what Eghbal calls "keystone contributors" and newcomers:
She quotes Hynek Schlawack, who puts the issue bluntly:
Eghbal endorses this view directly:
I'm sure there is a correlation between being new to open source and coding, and being more burdensome on open source maintainers, but I don't think newness or inexperience is the cause. The truth is that we have very little advice to give maintainers on how to manage technical, organizational and interpersonal complexity, or to give contributors about how to make contributions most productively. Newness is correlated with burdensomeness in large part because we're not teaching newcomers effectively, and we're not teaching newcomers effectively because we don't understand what's going on ourselves.
That said, I very much agree with Eghbal's proposal to identify "keystone contributors". We might not be able to articulate precisely why they're working so effectively, and how to help others reach that level of success, but we can and should support them in the meantime.
Final note on this topic: inexperienced newcomers are not the only burden on maintainers. Daniel Roy Greenfield calls out corporations -- even open source flagship corporations -- as being especially burdensome:
Towards the end of the report, Eghbal focuses in on the need for better metrics:
I absolutely agree on the importance of better quantifying and qualifying the open source community and its areas of particular need. There is some low-hanging fruit here, though. I've got a very small project that assesses whether a given repository has an active open source community, Should You Contribute?, and Eghbal cites a study that uses the GitHub API. There's actually a lot of data available. The problem, to me, is definition -- I don't think we know what questions we want to ask, and without well-formed questions even the best data is useless.
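To make the "lots of data, unclear questions" point a bit more concrete, here's a minimal sketch. It is not how Should You Contribute? or the cited study works. It assumes a dict shaped like the repository object returned by the GitHub REST API, using only its `pushed_at` and `open_issues_count` fields. The function name `activity_snapshot` and the 90-day threshold are made up for illustration.

```python
from datetime import datetime, timezone


def activity_snapshot(repo: dict, now: datetime) -> dict:
    """Summarize a few activity signals from a repository record.

    `repo` is assumed to look like a GitHub REST API repository object.
    The 90-day cutoff below is an arbitrary, illustrative choice.
    """
    # GitHub timestamps are ISO 8601 in UTC, e.g. "2016-07-01T12:00:00Z".
    pushed_at = datetime.strptime(
        repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    days_since_push = (now - pushed_at).days
    return {
        "days_since_push": days_since_push,
        # Note: GitHub counts open pull requests in this number too.
        "open_issues": repo["open_issues_count"],
        "recently_active": days_since_push <= 90,
    }


# Hypothetical record, not data from a real repository.
example = {"pushed_at": "2016-07-01T12:00:00Z", "open_issues_count": 42}
snapshot = activity_snapshot(
    example, now=datetime(2016, 8, 1, tzinfo=timezone.utc)
)
```

Getting numbers like these is the easy part. Whether 42 open issues means "healthy and busy" or "maintainers are drowning" is exactly the kind of question the data can't answer on its own.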
Misc Thoughts
I had no idea this was a thing and I feel a tiny bit worse about the world now.
Eghbal cites Sam Gerstenzang for this distinction (specifically this tweet but this article talks at more length). This is a really interesting perspective and I'm grateful to Gerstenzang for articulating it and Eghbal for introducing me to it.
I'd like to know more about this incident, which I'd not heard about before.
Eghbal links to this source but it doesn't really explain Red Hat's patent/licensing situation, and I am deeply curious.
Quote from this Wired article. Are open source projects that exist at the intersection of other projects particularly vulnerable to a lack of support? It might be a risk factor.
Agreed, and yet... if there are even greater bottlenecks, specifically complexity management issues, getting rid of these other bottlenecks just increases pressure on the tightest one.
I agree that size is likely to be a big risk factor for under-resourced projects. That said, I'm sure it's not the only risk factor. Can we characterize who and what is struggling most in our current system?