Open Ly0n opened 10 months ago
I've just realised that we could automate this very easily. To do this, we would have to automatically download the images from READMEs and the documentation. This would allow us to view all the image material within a few hours and search for relevant or visually appealing images for various projects. This could be done with the simple data mining script in the AwesomeCure repo.
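To make the "download the images from READMEs" step concrete, here is a minimal sketch of pulling image URLs out of README text. It is not the AwesomeCure script; the function name `extract_image_urls` and both regular expressions are assumptions for illustration, and they only cover the common Markdown `![alt](url)` form and HTML `<img src="...">` tags.

```python
import re

# Markdown image syntax: ![alt](url "optional title")
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(\s*<?([^\s)>]+)')
# HTML image tags: <img ... src="url" ...>
HTML_IMAGE = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def extract_image_urls(readme_text):
    """Return unique image URLs: Markdown matches first, then HTML matches."""
    found = MD_IMAGE.findall(readme_text) + HTML_IMAGE.findall(readme_text)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(found))
```

Reference-style Markdown images (`![alt][ref]`) and reStructuredText READMEs are not handled here and would need extra patterns.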
I whipped up a quick page to visualize all the images in the readmes: https://ost.ecosyste.ms/projects/images (⚠️ total download size of that page is over 500 MB, don't attempt to open it on mobile). I've added loading="lazy" to the img tags, which should help reduce the initial download size.
Commit over here if you want to see the implementation: https://github.com/ecosyste-ms/ost/commit/0d1f83ef717ddb2edbb0592ecf86b720af192178
There's also an API endpoint listing all the urls by project (over 6000 total): https://ost.ecosyste.ms/api/v1/projects/images
Breakdown of most popular image hosts: https://gist.github.com/andrew/885bc2a8a10cddefb9efac8260223b82
github.com 1836
img.shields.io 1432
avatars.githubusercontent.com 362
zenodo.org 308
codecov.io 244
raw.githubusercontent.com 243
www.r-pkg.org 144
readthedocs.org 144
user-images.githubusercontent.com 129
cranlogs.r-pkg.org 128
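A host breakdown like the one above can be reproduced from any list of image URLs. This is a rough sketch, not the code behind the gist; `count_hosts` is a hypothetical helper name, and relative URLs (which have no host) are simply skipped.

```python
from collections import Counter
from urllib.parse import urlparse


def count_hosts(urls):
    """Count how many URLs point at each host, ignoring URLs without one."""
    hosts = (urlparse(url).netloc.lower() for url in urls)
    return Counter(host for host in hosts if host)
```

Calling `count_hosts(urls).most_common(10)` would give the ten most frequent hosts in descending order.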
Breakdown of most used extensions: https://gist.github.com/andrew/0280c7c6e7e92bb4866dc0e46ee7d492
.svg 2910
(no extension) 1728
.png 1343
.jpg 175
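The extension counts can be approximated the same way. The sketch below is an assumption about how such a breakdown might be computed, not the code behind the gist; `count_extensions` is a hypothetical name. URLs whose path has no extension (many badge URLs do not) are counted under the empty string.

```python
import posixpath
from collections import Counter
from urllib.parse import urlparse


def count_extensions(urls):
    """Count file extensions of URL paths; query strings are ignored."""
    counts = Counter()
    for url in urls:
        # urlparse separates the path from any ?query or #fragment.
        path = urlparse(url).path
        ext = posixpath.splitext(path)[1].lower()
        counts[ext] += 1
    return counts
```

An extension in the URL is only a hint: a badge URL ending in `.svg` and one with no extension may both return SVG, so the served content type is the more reliable signal.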
Further work could be to download all URLs to check size, dimensions and file type. We could also analyse the paths of the various badge URLs to extract metadata.
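For the file type and dimension check, the first bytes of each downloaded file are often enough. Below is a minimal sketch under that assumption; `sniff_image` is a hypothetical helper, it reads dimensions only for PNG and GIF (whose headers store them at fixed offsets), and it leaves JPEG and SVG dimensions as `None` because those need real parsing.

```python
import struct


def sniff_image(data):
    """Guess (kind, width, height) from the leading bytes of a file."""
    # PNG: 8-byte signature, then the IHDR chunk with big-endian width/height.
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        if len(data) >= 24 and data[12:16] == b"IHDR":
            width, height = struct.unpack(">II", data[16:24])
            return ("png", width, height)
        return ("png", None, None)
    # GIF: 6-byte header, then little-endian 16-bit width and height.
    if data[:6] in (b"GIF87a", b"GIF89a"):
        if len(data) >= 10:
            width, height = struct.unpack("<HH", data[6:10])
            return ("gif", width, height)
        return ("gif", None, None)
    # JPEG: starts with the SOI marker followed by another marker byte.
    if data.startswith(b"\xff\xd8\xff"):
        return ("jpeg", None, None)
    # SVG is XML text, so look for an <svg tag near the start.
    head = data[:512].lstrip().lower()
    if head.startswith(b"<svg") or (head.startswith(b"<?xml") and b"<svg" in head):
        return ("svg", None, None)
    return (None, None, None)
```

The file size is just `len(data)` once the body has been downloaded, so it does not need a helper of its own.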
I'll take some time over Christmas to think about what we can do with this. The data can be used to show the role of software, data and technology in environmental sustainability in a very accessible and visual way. This could be very helpful in reaching people outside the open source industry and emphasising the central importance of open source in the area of climate change and sustainability. In my experience, most people have no idea why (open source) software is so important in this area. Here are the ideas I've had so far:
Many of the projects listed here present environmental sustainability data in an impressive way. These visualisations are perfect for reports and presentations. Therefore, here is an issue where we can capture such images centrally: