executablebooks / github-activity

Simple markdown changelogs for GitHub repositories
https://github-activity.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Consider using github-to-sqlite to grab our activity dataset #76

Open choldgraf opened 1 year ago

choldgraf commented 1 year ago

Context

This tool is basically a two-step process.

However, the functionality in step 1 is kind of hacky and messy, and hard to reason about.

I recently came across a tool recommended by @simonw, which essentially replicates all of this functionality but with a better-structured and better-maintained implementation:

This is a Python library that grabs all of the issues, pull requests, and comments (among other things) from a repository and stores them in a local SQLite database so that you can do what you want with them. The tables are structured to work with datasette as well (though we may not have a use for that in this package, just FYI).

Two questions that I have and am not sure of the answers to:

Proposal

What do folks think about re-using github-to-sqlite for our "grab all of the activity in a repository" step, and focusing this repository on the munging / filtering by date / calculating statistics / generating markdown aspects?
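
To make the split concrete, here is a rough sketch of what the "filter by date and generate markdown" half could look like once the activity already lives in a SQLite file. The `issues` table and its `number`, `title`, and `closed_at` columns are assumptions for illustration only, as is the `closed_issues_markdown` helper; the schema that github-to-sqlite actually writes may differ and would need to be checked. An in-memory database stands in for the real file.

```python
import sqlite3


def closed_issues_markdown(conn, since, until):
    """Return a markdown bullet list of issues closed in [since, until).

    Assumes a hypothetical `issues` table with `number`, `title`, and an
    ISO 8601 `closed_at` column (NULL while the issue is still open).
    """
    rows = conn.execute(
        """
        SELECT number, title
        FROM issues
        WHERE closed_at >= ? AND closed_at < ?
        ORDER BY closed_at
        """,
        (since, until),
    ).fetchall()
    return "\n".join(f"- {title} (#{number})" for number, title in rows)


# Stand-in for the database file that step 1 would produce.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues (number INTEGER PRIMARY KEY, title TEXT, closed_at TEXT)"
)
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?)",
    [
        (1, "Fix date parsing", "2021-01-15T10:00:00Z"),
        (2, "Add CLI flag", "2021-02-20T09:30:00Z"),
        (3, "Still open", None),
    ],
)

# ISO 8601 timestamps sort lexicographically, so plain string comparison works.
changelog = closed_issues_markdown(conn, "2021-01-01", "2021-02-01")
print(changelog)
```

With this shape, the date filtering becomes a single query and this repository only has to own the rendering logic.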

I think this might be a nice way to reduce some unnecessary complexity here and to re-use code from others in the ecosystem. I also like the idea of becoming familiar with datasette structures, as it opens the possibility that we could expose this kind of data in the future for others in the community to munge and use.

At this point I'm just exploring the idea and curious what others think!
