INRIA / spoon

Spoon is a metaprogramming library to analyze and transform Java source code. :spoon: is made with :heart:, :beers: and :sparkles:. It parses source files to build a well-designed AST with powerful analysis and transformation API.
http://spoon.gforge.inria.fr/

Back up Spoon GitHub data (issues and pull requests) #4714

Open monperrus opened 2 years ago

monperrus commented 2 years ago

Recently, GitHub has blocked or deleted entire projects for political reasons.

Just in case, we should back up all our GitHub data, which is important development and process knowledge.

We would cron this and push a tarball somewhere.

slarse commented 2 years ago

> Recently, GitHub has blocked or deleted entire projects for political reasons.

Source on this? I can't find anything.

algomaster99 commented 2 years ago

@slarse See https://www.bleepingcomputer.com/news/security/github-suspends-accounts-of-russian-devs-at-sanctioned-companies/.

> For example, the GitHub accounts of Sberbank Technology, Sberbank AI Lab, and the Alfa Bank Laboratory had their code repositories initially disabled and are now removed from the platform.
>
> Personal accounts suspended on GitHub have their content wiped while all repositories become immediately out of reach, and the same applies to issues and pull requests.

slarse commented 2 years ago

I see. Based on the description in the opening post, my search terms were going in a completely different direction.

I don't think the suspension of those accounts can be counted as politically motivated; it is mandated by sanctions that GitHub/Microsoft are legally bound to abide by. The suspension of private accounts is perhaps more of a gray area that I'm not about to debate, but based on the facts in that article I think it's safe to say all of the suspensions were driven by the sanctions. While the sanctions are of course political in a sense, a company abiding by them is not; it's just following rules set by a governing body. Whether or not GitHub went beyond the call of duty is a different matter.

That was a bit of a tangent, but I think framing an issue with appropriate information is important.

Anyway, I don't disagree with the actionable part of this issue. The article suggests that there are situations in which GitHub accounts can go poof, so having all information we care about in a place that we control makes sense. Actually, the latter makes sense regardless of whether accounts have previously gone poof or not; if you don't own the server the data is stored on, you don't really have the data.

The GitHub data I can think of to back up is issues and pull requests. Issues, at least, are easy to export to CSV using the GitHub CLI, though I'm a bit uncertain whether that includes comments. If we need something we can't do with the GH CLI, we can use something like PyGithub to access the API directly.
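As a rough illustration of the "access the API directly" route combined with the tarball idea from the opening post, here is a minimal sketch using only the Python standard library rather than PyGithub. It assumes the GitHub REST API list endpoints for issues and issue comments, where the issues endpoint also returns pull requests (marked with a `pull_request` key). The helper names (`http_get_json`, `fetch_all`, `write_backup`) and the output file names are hypothetical, and rate limiting and error handling are left out.

```python
import io
import json
import tarfile
import urllib.request

API = "https://api.github.com"


def http_get_json(url, token=None):
    """Fetch one URL from the GitHub REST API and decode the JSON body."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Optional personal access token; anonymous calls are heavily rate limited.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all(path, get_json=http_get_json, per_page=100):
    """Collect every page of a list endpoint until an empty page comes back."""
    items, page = [], 1
    while True:
        sep = "&" if "?" in path else "?"
        batch = get_json(f"{API}{path}{sep}per_page={per_page}&page={page}")
        if not batch:
            return items
        items.extend(batch)
        page += 1


def write_backup(tar_path, owner, repo, get_json=http_get_json):
    """Dump issues, pull requests and their comments into a gzipped tarball."""
    base = f"/repos/{owner}/{repo}"
    dumps = {
        # The issues endpoint also lists pull requests; state=all includes closed ones.
        "issues.json": fetch_all(f"{base}/issues?state=all", get_json),
        # Conversation comments for every issue and pull request in the repository.
        "issue_comments.json": fetch_all(f"{base}/issues/comments", get_json),
    }
    with tarfile.open(tar_path, "w:gz") as tar:
        for name, data in dumps.items():
            payload = json.dumps(data, indent=2).encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    # Return item counts so a cron job can log what was saved.
    return {name: len(data) for name, data in dumps.items()}
```

A cron job could call something like `write_backup("spoon-github.tar.gz", "INRIA", "spoon")` and then push the resulting file to storage we control. Inline review comments on pull request diffs live under a separate endpoint and are not covered by this sketch.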