alan-turing-institute / hut23-open-source-sa

REG open source service area
Build a tool for analysing our contributions on GitHub #9

Open mhauru opened 1 year ago

mhauru commented 1 year ago

I would be interested in understanding REG's contributions to the OS ecosystem. We had a conversation about this with TPS people in July 23: Malvika, Arielle, Anne. Jim attended as well, as the REG TPS contact. They would likewise be interested in analysing Turing's contributions. I said we would work on a tool to pull some data from GitHub that could be useful for all parties.

The exact functionality isn't entirely clear to me yet, but the questions I would like answers to are things like:

Aoife has in the past been interested in the question of how we maintain our repos in particular. She started writing a tool to analyse this, which I think we could build on.

crangelsmith commented 1 year ago

This feels like an important and impactful project for the Turing, where REG could contribute meaningfully. However, people could feel that they are being monitored. It would be good to propose to TPS to take the project through an ethics approval process.

JimMadge commented 1 year ago

Some points about this:

mastoffel commented 1 year ago

The tool could have a feature for recognising dead repos and sending e-mails to the maintainers, eventually leading to archiving etc. of the repo. This could help to keep the Turing repo clean. Also links to #11.
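
As a rough illustration of the dead-repo check described above, here is a minimal sketch. It assumes the repo metadata has already been fetched and has the shape of GitHub REST API repository objects (the `name`, `pushed_at` and `archived` fields). The function name `find_stale_repos` and the one-year threshold are hypothetical choices for illustration, not something agreed in this thread, and the e-mail step is left out.

```python
from datetime import datetime, timedelta, timezone


def find_stale_repos(repos, now, max_idle_days=365):
    """Return names of non-archived repos with no push in the last max_idle_days.

    `repos` is assumed to be a list of dicts shaped like GitHub REST API
    repository objects. The threshold is an illustrative assumption.
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for repo in repos:
        # Already-archived repos need no further action.
        if repo.get("archived"):
            continue
        # Skip repos without a push timestamp (e.g. empty repos).
        if not repo.get("pushed_at"):
            continue
        # GitHub timestamps look like "2023-07-01T12:00:00Z" (UTC).
        pushed_at = datetime.strptime(
            repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if pushed_at < cutoff:
            stale.append(repo["name"])
    return stale
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test on saved data. Last push date is only one possible signal; open issues or recent comments might matter too.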

mhauru commented 1 year ago

Once we get to this we should

AoifeHughes commented 1 year ago

For ref: https://github.com/alan-turing-institute/Hut23/issues/1458

rwood-97 commented 1 year ago

There is desire for this as part of the new building sustainably scholarly communities project, particularly re. contributions into our repos from externals. The goal would be to use this data as a measure of success in terms of 'building a community'.
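
One possible way to turn "contributions from externals" into a number is sketched below. It assumes pull request data shaped like GitHub REST API pull request objects (each with a `user.login` field) and a set of logins treated as internal. The function name `external_share` and the choice of pull requests as the unit of contribution are illustrative assumptions only.

```python
def external_share(pull_requests, member_logins):
    """Return (external_count, total_count, external_fraction).

    `pull_requests` is assumed to be a list of dicts shaped like GitHub
    REST API pull request objects; `member_logins` is a set of logins
    that count as internal. Both are illustrative assumptions.
    """
    total = len(pull_requests)
    external = sum(
        1 for pr in pull_requests if pr["user"]["login"] not in member_logins
    )
    # Avoid dividing by zero for repos without any pull requests.
    fraction = external / total if total else 0.0
    return external, total, fraction
```

Who counts as "internal" is a judgement call (current staff, former staff, collaborators), so the member set is kept as an explicit input rather than derived inside the function.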

mhauru commented 1 year ago

I just had a chat with @yongrenjie about whether he thinks this should be built on top of whatwhat or as an extension of whatwhat. I would summarise his comments as (Jon please correct/add):

AoifeHughes commented 12 months ago

I’ve been meaning to pick up a bit of dev on the tool which Markus has already mentioned, for another (EDI) reason. If someone did want to collab and get it modified to do what the OS SA wants then I’d be happy to do that 😊

mhauru commented 12 months ago

@AoifeHughes, the conclusion from the OS SA meeting this week was that this is a high priority for us (see emoji voting above), but right now we all have our hands full with things with deadlines. I do hope, maybe even expect, one or another of us to get to this Soon (TM).

mhauru commented 5 months ago

Work is ongoing on a tool to fetch data here: https://github.com/alan-turing-institute/github-analyser and on analyses using that tool here: https://github.com/alan-turing-institute/github-analysis

llewelld commented 3 weeks ago

At RSECon24 there was a poster on the topic of "Mining RSE repository timelines on GitHub: How long will it live, and who will notice?" which reminded me of this task and generated what look to me like some really interesting results, e.g. "if you want stars, publish papers" and "if you want contributors, respond to their issues/PRs". These might sound obvious but I found it fascinating that this was shown in the data.

Anyway, mentioning it in case there's scope for sharing ideas. Kara — one of the authors — from EPCC was at RSECon and very open to talking about the approach.