User Story
As a consumer of tooling data, I want to see updates to metadata merged into the master list when they occur in the source data.
Detailed Requirement
The current implementation of the build process is only an initial build: it grabs all data from the in-scope repositories, merges and normalises it based on some simple analytics, and then stores the result in the docs directory. While this is fine as a repeatable process, it doesn't provide for long-term state management of the Tooling repository, i.e.:
If a tool disappears, there is no archive of it.
The data quality cannot be reviewed and improved over time.
The volume of calls to the GitHub API is such that the rate limits are quickly exceeded.
We therefore need to implement a merge process that mines the source data as now and applies any updates, but then only selectively hits the GitHub API (or other repository APIs when implemented) to refresh the statistics on the tools.
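To make the intent concrete, below is a minimal sketch of what the merge step could look like. It assumes the master list is a JSON file in the docs directory and that entries are keyed by repository URL; the field names and file paths are illustrative, not the actual schema.

```typescript
import { readFileSync, writeFileSync } from "node:fs";

interface ToolEntry {
  repository: string;                         // canonical repository URL, used as the merge key
  name?: string;
  category?: string;
  stats?: { stars?: number; lastCommit?: string };
  archived?: boolean;                         // set when a tool disappears from the source data
  lastUpdated?: string;
}

function mergeToolLists(master: ToolEntry[], mined: ToolEntry[]): ToolEntry[] {
  const minedByRepo = new Map(mined.map(t => [t.repository, t] as const));
  const merged: ToolEntry[] = [];
  const now = new Date().toISOString();

  for (const existing of master) {
    const update = minedByRepo.get(existing.repository);
    if (update) {
      // Merge newly mined metadata over the existing entry; stored statistics
      // are kept until the selective API refresh replaces them.
      merged.push({ ...existing, ...update, lastUpdated: now });
      minedByRepo.delete(existing.repository);
    } else {
      // The tool no longer appears in the source data: archive it rather than drop it.
      merged.push({ ...existing, archived: true });
    }
  }

  // Anything left in the mined set is a newly discovered tool.
  for (const added of minedByRepo.values()) {
    merged.push({ ...added, lastUpdated: now });
  }
  return merged;
}

// Illustrative usage against files in the docs directory.
const master: ToolEntry[] = JSON.parse(readFileSync("docs/tools.json", "utf8"));
const mined: ToolEntry[] = JSON.parse(readFileSync("build/mined-tools.json", "utf8"));
writeFileSync("docs/tools.json", JSON.stringify(mergeToolLists(master, mined), null, 2));
```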
A suggested design approach to investigate is using the cache control directives available on the GitHub API to see whether we can hit the API only when new metadata is available.
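For illustration, a conditional request using ETags could look something like the sketch below: the ETag stored from the previous build is sent in an If-None-Match header, and a 304 Not Modified response is treated as "no new metadata". Whether such responses are exempt from the rate limit should be confirmed against the current GitHub REST API documentation as part of the investigation; the helper name and token handling are assumptions for the sake of the example.

```typescript
// Hypothetical helper: sends the ETag stored from the previous build in an
// If-None-Match header; a 304 Not Modified response means the cached
// statistics are still current and the body is not re-fetched.
async function fetchRepoIfChanged(
  repo: string,                                // e.g. "owner/name"
  cachedEtag: string | undefined,
  token: string
): Promise<{ etag: string | null; data: unknown }> {
  const headers: Record<string, string> = {
    Accept: "application/vnd.github+json",
    Authorization: `Bearer ${token}`,
  };
  if (cachedEtag) {
    headers["If-None-Match"] = cachedEtag;     // makes the request conditional
  }

  const response = await fetch(`https://api.github.com/repos/${repo}`, { headers });

  if (response.status === 304) {
    // No new metadata since the last build: keep the existing statistics.
    return { etag: cachedEtag ?? null, data: null };
  }
  if (!response.ok) {
    throw new Error(`GitHub API returned ${response.status} for ${repo}`);
  }
  // Persist the new ETag alongside the master list for the next build.
  return { etag: response.headers.get("etag"), data: await response.json() };
}
```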