Open vsoch opened 4 years ago
I kind of forgot about this one. It is actually a great plan. I will try to provide a list of projects I find interesting for us, and from there we can maybe test locally and open some PRs. I think with urlchecker-python this is simple to do, and in the PRs we can maybe mention urlchecker-action.
This was actually one of the small hacks I thought of to improve my contribution record on GitHub, but I thought it could bother some people if every now and then some random developer shows up with a bunch of broken URLs.
I am not sure I understand your idea of an automated CI job to test over longer periods of time. Isn't that the job of the GitHub action?
I will try to provide a list as soon as I can, which shouldn't be complicated because every project with links or documentation URLs is of interest to us. However, I think the list should depend more on which projects are open to the feedback and might adopt our tool.
One thing I would love to explore more in the coming days is the badge. There is one in the README here https://github.com/urlstechie/urlstechie.github.io but it is static (I think). If we manage to make it dynamic and dependent on the last build (something like the travis-ci badges), that might propel things for the project, because badges are trending these days and they wrap the results beautifully.
I am not sure I understand your idea of an automated CI job to test over longer periods of time. Isn't that the job of the GitHub action?
@SuperKogito let's say that we have a list of repos - we would have some repository, let's call it "urlchecker-analysis" that uses the GitHub action:
So you can imagine we would have a results structure something like this:
# urlchecker-analysis
results/
    repo-checked-1    # this might be the research meeting list repo, for example
        results-<date-1>.csv
        results-<date-2>.csv
        ....
    repo-checked-2
    ...
    repo-checked-n
And then you can imagine having an analysis script that can be run over any specific repository checked, and say things like "The percentage of urls broken on average is... the change from week to week is..." and more importantly, if we get enough repos, we might even be able to say things in a larger sense like "We found repos associated with this domain, or repos that were updated only this many times, had significantly more broken links." And of course that requires having metadata about the repos, which is something else we can get from the GitHub API, etc. But that's a later step, we can focus on first:
And then we can play around with developing the analysis bit when there is a tiny bit of data. I suspect that most repos won't have huge changes day to day, which is why I'm thinking a monthly rate might be a good start.
And then once we have this analysis, we can write it up, make pretty plots, and give good reason to do the checks in the first place!
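As a rough sketch of what such an analysis script could look like for one checked repository: the snippet below assumes each results-<date>.csv has a header row with a RESULT column whose values are "passed" or "failed", and that <date> is written so that sorting the filenames puts the runs in chronological order. Both are assumptions for illustration, not something settled in this thread, and the function names are hypothetical.

```python
import csv
from pathlib import Path


def broken_percentage(csv_path):
    """Return the percentage of rows in one results file marked as failed."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        return 0.0
    # Assumption: the RESULT column holds "passed" or "failed".
    failed = sum(1 for row in rows if row["RESULT"].strip().lower() == "failed")
    return 100.0 * failed / len(rows)


def summarize_repo(repo_dir):
    """Summarize all results-<date>.csv files inside one repo-checked-* folder."""
    # Assumption: filenames sort chronologically (for example ISO dates).
    files = sorted(Path(repo_dir).glob("results-*.csv"))
    percentages = [broken_percentage(path) for path in files]
    # Change in broken percentage between each run and the next one.
    changes = [later - earlier for earlier, later in zip(percentages, percentages[1:])]
    average = sum(percentages) / len(percentages) if percentages else 0.0
    return {"runs": len(files), "average_broken": average, "changes": changes}
```

Calling summarize_repo("results/repo-checked-1") would then give the average broken percentage and the run-to-run change for that repo, which is the kind of number the write-up and plots could start from.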
For the badges - definitely give it a go! Please again open feature branches for review first. I've made custom badges (I think with shields.io?). Here are a few purple ones I designed for the needs-love project :) https://github.com/rseng/needs-love
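One possible way to make the badge dynamic, sketched under assumptions: shields.io can render a badge from a small JSON document via its endpoint badge, so a check run could write that JSON after each build. The label, colors, message wording, and function names below are made up for illustration and are not part of urlchecker.

```python
import json


def badge_payload(total, broken):
    """Build a dict following the shields.io endpoint badge JSON schema."""
    # Hypothetical wording and colors; adjust to taste.
    if broken == 0:
        message, color = "passing", "brightgreen"
    else:
        message, color = f"{broken}/{total} broken", "red"
    return {
        "schemaVersion": 1,
        "label": "urlchecker",
        "message": message,
        "color": color,
    }


def write_badge(path, total, broken):
    """Write the badge JSON to a file that can be published with the repo."""
    with open(path, "w") as handle:
        json.dump(badge_payload(total, broken), handle)
```

If the resulting file is hosted somewhere public, the README could point an image at https://img.shields.io/endpoint?url=<url-of-the-json>, so the badge follows the last run instead of being static.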
urlchecker-analysis, I love it. The whole concept, that's a genius idea <3 I will see which repository urls we can use :)
awesome! If you want to put together a first shot at a list, I can put together the skeleton of the repo (I've already thought about it a bit).
go on with the repo and I will add a list to it? or maybe better to put it here? I will try to make it, at the latest by tomorrow.
Just put it here since we have the nice issue :)
Actually even better - I can make the repo and transfer the issue! <3
Done!
So after searching a bit and checking some projects, I came up with the list below. The projects were not chosen by any specific criteria. I just tried to diversify the repositories (Python, JS, HTML), but the list is still missing others (C, C++, etc.). I also tried to include projects that are currently maintained and contain many links.
This is a list of various active projects of interest with many links.
This will help us test urlchecker with .md files
Let me know what you think of it and which ones we should add ;)
These are great! I don't see why we shouldn't add all of them? It's a very nice range of types of repos.
I need to finish up working on an API, but after that I should be able to put some time into this! If not today, definitely this week.
From issue urlstechie/urlchecker-python#13:
This sounds like fun! I'm totally willing to take on the bulk of work stated above, I haven't done a little fun project like this in a while. Let me know your thoughts!