WDscholia / scholia

Wikidata-based scholarly profiles
https://scholia.toolforge.org

Submit an application to The Digital Infrastructure Incubator #1634

Closed Daniel-Mietchen closed 3 years ago

Daniel-Mietchen commented 3 years ago

as per https://blog.codeforscience.org/digital-infrastructure-incubator-is-live/

Daniel-Mietchen commented 3 years ago

I just submitted our application - here are their main questions and our responses:

Q1

Project Website (or Github or Twitter) *

A1

https://scholia.toolforge.org/

Q2

Statement of intention. Incubator participants will have an interest in developing and implementing transparent documentation, strategy, or other visioning around questions of sustainability, governance, and/or community health, or a related question. Please use this space to briefly describe a primary challenge that you perceive your organization faces now or in the near future and that you are interested in making a plan to address over the next 6 months. You might also articulate what you hope you/your project might get out of being a part of the Digital Infrastructure Incubator. (400-600 words)

A2

Background

Scholia is a free and open-source web service for profiling elements of the scholarly research landscape. Using the open hosting environment Wikimedia Toolforge and visualizations of linked open data from Wikidata, it enables users to generate, access, share and collaboratively refine open scholarly profiles of various entity types, including authors, organizations, publishers, journals, events, awards, topics, genes, proteins, metabolic pathways, locations, clinical trials and others.

The underlying data are curated by a global community of tens of thousands of volunteers who collaborate across linguistic, disciplinary, cultural, jurisdictional and other boundaries to develop and maintain a multilingual and cross-disciplinary corpus of structured data and to integrate it with the ecosystems around Wikipedia as well as with public databases, especially scholarly ones like literature repositories. These volunteers are complemented and supported by an ecosystem of open resources: digital infrastructures, interactive tools and automated workflows that facilitate various forms of engagement with this community and its steadily growing corpus of currently 13 billion statements about roughly 100 million entities.

Scholia is one of these resources and serves as a prototype for similar initiatives. It has elements of an infrastructure, of an interactive tool and of automated workflows. For instance, it serves specific profiles linked from Wikipedia articles and provides information that can in turn assist with the creation and enrichment of such articles. Users can use Scholia to modify the queries underlying its visualizations, to navigate from one profile to another, or to cite a publication, and they can trigger the creation of Wikidata entries based on identifiers like DOIs, or navigate to yet other tools that Scholia has pre-configured to assist with exploring or curating Wikidata content related to the profiled entity. Scholia is popular at the intersection of the research and Wikipedia ecosystems, and we see potential for its future adoption as an integral part of the Wikipedia user experience.

Challenges

Despite its popularity, the project overall is experiencing many of the routine difficulties of volunteer-run and crowdsourced projects in general. Our challenges include community management and governance, software development and content curation as well as documentation and sustainability of all that, especially in light of functionally overlapping yet non-open commercial competition.

Scholia's software originates from a small team of volunteers that has grown from originally one contributor to a current total of about two dozen, but only a small subset of them have made sustained contributions over multiple years, and documentation and infrastructure to welcome new contributors are not well developed yet. Moreover, since Scholia has no organizational backbone beyond its GitHub organization, it is challenging for the Scholia team to engage with the communities of its users and developers and even to accept support beyond code patches or reviews thereof. The core contributors have been fortunate to receive funding offers to expand the project but cannot be the ones to carry the project forward as staff developers, and transitioning from volunteer development to a mix of volunteers and paid contributors is not an obvious process.

Scholia's content, in turn, has many more contributors. Such access to crowdsourced contributions is a great opportunity, but this degree of community involvement creates global social expectations and ethical complexities at a scale that our small volunteer team cannot fully anticipate and may not have resources to address properly.

Expected outcomes

Through the Digital Infrastructure Incubator, we expect the Scholia team to get in touch with other projects at similar transitions in their life cycle, to learn from their experience, to share ours and to be able to explore a broader range of potential avenues to address our challenges.

Q3

Evaluation of capacity. Participants will receive a one-time stipend of US$5,000 and we estimate that participants will spend 5-15 hours per month for 6 months in synchronous and asynchronous work. Please describe applicants' capacity to engage this block of time; use the stipend; et al. Things to consider include personnel, positions, interest, time commitments. (100-200 words) *

A3

The four core members of the Scholia team are all employed as researchers at academic institutions, and they are volunteer contributors to Wikimedia projects. Scholia bridges our work and our volunteering, but it is the focus of neither. As a team, we have a routine of 1-2 hours per week of synchronous interactions and a variable amount of asynchronous work addressing any aspect of the project.

The lead applicant can commit to rearranging asynchronous Scholia work to accommodate Incubator activities of 5-15 hours per month for 6 months and has experience handling and sustaining such additional commitments, e.g. through a decade of various degrees of freelancing or through volunteer activities like 100DaysOfJupyter, completed last year with daily public documentation.

Potential uses of the stipend as well as taxation depend on details of the attached stipulations. We would welcome the opportunity to use it towards moving the project forward with regard to openness-preserving organizational structures and business models, paving the way towards more structured community engagement, and laying the foundations for efficiently addressing administrative and accounting matters and similar issues.

Q4

Other! Anything else that you'd like to share about your work, your project's history, communities you engage with, or other information. Things to consider include previous efforts to address challenges articulated above, ways in which urgency of that challenge is perceived, exposure to collaborative problem solving around political questions, and so on. *

A4

While we have different academic backgrounds, speak different native languages and live in different countries, we have all been contributing to both academia and Wikimedia projects for about a decade or more. What brought us together is a shared interest in the intersection of the two, of which Scholia and WikiCite, the broader initiative to collaboratively collect and curate bibliographic and citation data, are good examples.

We also all have experience in various organizational settings, both professionally and as volunteers, yet in light of the technical challenges the project is facing, our focus with respect to Scholia development has so far been predominantly technical in nature, as exemplified by a grant targeted at improving Scholia's technical robustness, which can be accessed via https://doi.org/10.3897/rio.5.e35820.

This robustness has several dimensions affecting sustainability: growth along multiple axes - feature set, underlying content, code base, usage and contributor communities - is modulated by specific scalability issues in the context of fluctuating - and currently shrinking - resources. By contrast, established market players charge considerably for partially overlapping offerings, which provides them with ample resources and us with an opportunity to give that market new momentum by providing an open alternative that establishes and democratizes baseline functionality. We are also aware that every choice we make - e.g. in terms of what to profile or not, what data to include or highlight or not, what workflows to facilitate or not, or how we go about organizational or community matters - has ethical implications, many of which are not straightforward to address. All of this contributes to the growing complexity of the project.

As a result, organizational matters have long left their backseat and have reached the forefront of our attention. We are thus eager to address them systematically, e.g. by exploring potential combinations of jurisdiction, organization type and business models to support Scholia operations in a sustainable manner, picking a suitable arrangement and then setting things up accordingly.

Doing this exploration collaboratively and in the open is our preferred mode of action. This also means that we are concerned not just with the sustainability of Scholia itself but also with that of the ecosystem around it and how we can contribute to it, and we are looking forward to engaging with the community that forms around the Incubator.

Daniel-Mietchen commented 3 years ago

Announcement tweet: https://twitter.com/EvoMRI/status/1432448681001865219

Daniel-Mietchen commented 3 years ago

We had a follow-up call yesterday with the organizers of the Incubator - decisions to be expected early October.

Daniel-Mietchen commented 3 years ago

We have been informed that Scholia has not been chosen for the Incubator cohort.