Open jayvdb opened 6 years ago
The extracted data should be cached in the generated website, like the GCI data, with correct timestamps, and only regenerated if the data is missing.
The fact that no data was found also needs to be cached, to avoid running the scraper in every build on non-GSoC orgs.
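A minimal sketch of the caching behaviour described above, assuming a JSON file per org is an acceptable cache format. The names `load_or_scrape`, `cache_path`, `scrape` and the `fetched_at` field are hypothetical and not taken from the repository. The scraper only runs when the cache file is missing, and a "no data" result (`None`) is written to the cache just like a real result.

```python
import json
import time
from pathlib import Path


def load_or_scrape(cache_path, scrape):
    """Return cached GSoC data, running `scrape` only if no cache file exists.

    `scrape` is a hypothetical callable that returns JSON-serialisable data,
    or None when the org has no GSoC data.
    """
    cache_path = Path(cache_path)

    # A cache hit covers both "data found" and "no data found".
    if cache_path.exists():
        with cache_path.open() as f:
            return json.load(f)["data"]

    data = scrape()

    # Record when the data was fetched (assumed field name).
    record = {"fetched_at": int(time.time()), "data": data}
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w") as f:
        json.dump(record, f)
    return data
```

Wrapping the result in a record keeps "cached, but empty" distinguishable from "not cached yet", which is what prevents the scraper from running again on every build.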
See also https://gitlab.com/coala/GSoC-2017/
@jayvdb Please assign this to me.
In 2017, coala was GSoC org 5817061024464896.
As this repository is generic, the tool must find the project identifier using only the `org_name`, which is exposed in the `community` app. Scrape the list of org projects from that page, e.g. 5154725527814144, and pull in relevant data, such as student and mentor display names.
The scraper must be part of the Django system so that any GSoC org can load their 2017 data using the scraper.
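The lookup described above could be sketched roughly as follows. Every name here is hypothetical: `collect_projects`, the three injected callables, and the `student` / `mentors` keys are assumptions for illustration. The actual page fetching and parsing is deliberately left to the callables, since the structure of the GSoC pages is not described in this issue.

```python
def collect_projects(org_name, find_org_id, list_project_ids, fetch_project):
    """Resolve an org by name and gather student/mentor data per project.

    All three callables are hypothetical hooks:
      find_org_id(org_name)      -> org id, or None if the org is not found
      list_project_ids(org_id)   -> iterable of project ids
      fetch_project(project_id)  -> dict with "student" and "mentors" keys
    """
    org_id = find_org_id(org_name)
    if org_id is None:
        # Not a GSoC org for that year: report "no data".
        return None

    projects = []
    for project_id in list_project_ids(org_id):
        details = fetch_project(project_id)
        projects.append({
            "id": project_id,
            "student": details.get("student"),
            "mentors": details.get("mentors", []),
        })
    return projects
```

Keeping the network access behind injected callables means the same function can be driven from Django code for any org, and exercised in tests with stubs instead of live requests.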